My website's pages are not being indexed correctly
-
Hi,
One of our websites, which is actually a price comparison engine, facing indexing problem at Google.
When we check “site:mywebsite.com “, there are lots of pages indexed which are not from mywebsite.com but from merchants websites. The index result page also shows merchant’s page title. In some cases the title is from merchant’s site but when the given link is accessed it points to mywebsite.com/index. Also the cache displays the merchant’s product page as the last indexed version rather than showing ours.
The mywebsite.com has quite few Merchants that send us their product feed. Those products are listed on comparison page with prices. The merchant’s links on comparison page are all no-follow links but some of the (not all) merchant’s product pages are indexed against mywebsite.com as mentioned above instead of product comparison page of mywebsite.com
How can we fix the issue?
Thanks!
-
Yeah i was thinking the same....
The interesting thing is we've removed the redirect page a week ago and replaced it with javascript redirect code. is that a good practice?
-
Ah. Regarding #3: If you have a disallow in the robots.txt the search engines won't pick up the noindex. Ensure the noindex code is in place on the applicable pages, remove the disallow, and the pages should be removed after they're crawled. getting that relationship straightened out might help with some of the other things as well. Cheers!
-
Thanks Ryan for the response. We'll surely prevent crawling of search result pages. Please check below points too. Thanks!!!
- The cache page shows merchant product page in full version as well as in text-only version.
- The title shown on the result page is also of the merchant's product page title.
- One thing on the comparison price page is merchants are redirected to their respective websites, the links are nofollow, but redirect page is indexed even after having it on robots.txt and noindex on redirect page.
- The redirect page is indexed like mywebsite.com/redirect-50187889-0
- Comparison listing is not similar to internal search result page but result pages are crawl-able.
-
no iFrames being used.
-
Thumbs up to Don's rec. Also when you look at the text only cache what kind of page are you seeing, if any? Sometimes the site: search is a little inconsistent so you can try forcing the delivery of certain pages with the inurl: modifier. One last caveat that comes to mind is that if the comparison listing is similar to an internal search results page, Google may not ever list it, "Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines." from: https://support.google.com/webmasters/answer/35769 Cheers!
-
How are you merchant prices / info being displayed on your site? From your site or using IFrames?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No: 'noindex' detected in 'robots' meta tag
Pages on my site show No: 'noindex' detected in 'robots' meta tag. However, when I inspect the pages html, it does not show noindex. In fact, it shows index, follow. Majority of pages show the error and are not indexed by Google...Not sure why this is happening. The page below in search console shows the error above...
Technical SEO | | Sean_White_Consult0 -
404's being re-indexed
Hi All, We are experiencing issues with pages that have been 404'd being indexed. Originally, these were /wp-content/ index pages, that were included in Google's index. Once I realized this, I added in a directive into our htaccess to 404 all of these pages - as there were hundreds. I tried to let Google crawl and remove these pages naturally but after a few months I used the URL removal tool to remove them manually. However, Google seems to be continually re/indexing these pages, even after they have been manually requested for removal in search console. Do you have suggestions? They all respond to 404's. Thanks
Technical SEO | | Tom3_151 -
Specific pages won't index
I have a few pages on my site that Google won't index, and I can't understand why. I've looked into possible issues with Robots, noindex, redirects, canonicals, and Search Console rules. I've got nothing. Example: I want this page to index https://tour.franchisebusinessreview.com/services/franchisee-satisfaction-surveys/ When I Google the full URL, I get results including the non-subdomain homepage, and various pages on the subdomain, including a child page of the page I want, but not the page itself. Any ideas? Thanks for the help!
Technical SEO | | ericstites0 -
Google Indexing Pages with Made Up URL
Hi all, Google is indexing a URL on my site that doesn't exist, and never existed in the past. The URL is completely made up. Anyone know why this is happening and more importantly how to get rid of it. Thanks 🙂
Technical SEO | | brian-madden0 -
Correct linking to the /index of a site and subfolders: what's the best practice? link to: domain.com/ or domain.com/index.html ?
Dear all, starting with my .htaccess file: RewriteEngine On
Technical SEO | | inlinear
RewriteCond %{HTTP_HOST} ^www.inlinear.com$ [NC]
RewriteRule ^(.*)$ http://inlinear.com/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^./index.html
RewriteRule ^(.)index.html$ http://inlinear.com/ [R=301,L] 1. I redirect all URL-requests with www. to the non www-version...
2. all requests with "index.html" will be redirected to "domain.com/" My questions are: A) When linking from a page to my frontpage (home) the best practice is?: "http://domain.com/" the best and NOT: "http://domain.com/index.php" B) When linking to the index of a subfolder "http://domain.com/products/index.php" I should link also to: "http://domain.com/products/" and not put also the index.php..., right? C) When I define the canonical ULR, should I also define it just: "http://domain.com/products/" or in this case I should link to the definite file: "http://domain.com/products**/index.php**" Is A) B) the best practice? and C) ? Thanks for all replies! 🙂
Holger0 -
How do I fix issue regarding near duplicate pages on website associated to city OR local pages?
I am working on one e-commerce website where we have added 300+ pages to target different local cities in USA. We have added quite different paragraphs on 100+ pages to remove internal duplicate issue and save our website from Panda penalty. You can visit following page to know more about it. And, We have added unique paragraphs on few pages. But, I have big concerns with other elements which are available on page like Banner Gallery, Front Banner, Tool and few other attributes which are commonly available on each pages exclude 4 to 5 sentence paragraph. I have compiled one XML sitemap with all local pages and submitted to Google webmaster tools since 1st June 2013. But, I can see only 1 indexed page by Google on Google webmaster tools. http://www.bannerbuzz.com/local http://www.bannerbuzz.com/local/US/Alabama/Vinyl-Banners http://www.bannerbuzz.com/local/MO/Kansas-City/Vinyl-Banners and so on... Can anyone suggest me best solution for it?
Technical SEO | | CommercePundit0 -
Ecommerce website with too many links on page
Hi, I'm working on onsite seo for an ecommerce website and my recent report has shown that I have a high number of pages where there are 'too many links on page'. Does anyone have tips on how to avoid this when we're using mega menus, plenty of navigation for the user and links to products on each page? Thanks
Technical SEO | | Will_Craig1 -
Adding 'NoIndex Meta' to Prestashop Module & Search pages.
Hi Looking for a fix for the PrestaShop platform Look for the definitive answer on how to best stop the indexing of PrestaShop modules such as "send to a friend", "Best Sellers" and site search pages. We want to be able to add a meta noindex ()to pages ending in: /search?tag=ball&p=15 or /modules/sendtoafriend/sendtoafriend-form.php We already have in the robot text: Disallow: /search.php
Technical SEO | | reallyitsme
Disallow: /modules/ (Google seems to ignore these) But as a further tool we would like to incude the noindex to all these pages too to stop duplicated pages. I assume this needs to be in either the head.tpl or the .php file of each PrestaShop module.? Or is there a general site wide code fix to put in the metadata to apply' Noindex Meta' to certain files. Current meta code here: Please reply with where to add code and what the code should be. Thanks in advance.0