My website's pages are not being indexed correctly
-
Hi,
One of our websites, which is actually a price comparison engine, facing indexing problem at Google.
When we check “site:mywebsite.com “, there are lots of pages indexed which are not from mywebsite.com but from merchants websites. The index result page also shows merchant’s page title. In some cases the title is from merchant’s site but when the given link is accessed it points to mywebsite.com/index. Also the cache displays the merchant’s product page as the last indexed version rather than showing ours.
The mywebsite.com has quite few Merchants that send us their product feed. Those products are listed on comparison page with prices. The merchant’s links on comparison page are all no-follow links but some of the (not all) merchant’s product pages are indexed against mywebsite.com as mentioned above instead of product comparison page of mywebsite.com
How can we fix the issue?
Thanks!
-
Yeah i was thinking the same....
The interesting thing is we've removed the redirect page a week ago and replaced it with javascript redirect code. is that a good practice?
-
Ah. Regarding #3: If you have a disallow in the robots.txt the search engines won't pick up the noindex. Ensure the noindex code is in place on the applicable pages, remove the disallow, and the pages should be removed after they're crawled. getting that relationship straightened out might help with some of the other things as well. Cheers!
-
Thanks Ryan for the response. We'll surely prevent crawling of search result pages. Please check below points too. Thanks!!!
- The cache page shows merchant product page in full version as well as in text-only version.
- The title shown on the result page is also of the merchant's product page title.
- One thing on the comparison price page is merchants are redirected to their respective websites, the links are nofollow, but redirect page is indexed even after having it on robots.txt and noindex on redirect page.
- The redirect page is indexed like mywebsite.com/redirect-50187889-0
- Comparison listing is not similar to internal search result page but result pages are crawl-able.
-
no iFrames being used.
-
Thumbs up to Don's rec. Also when you look at the text only cache what kind of page are you seeing, if any? Sometimes the site: search is a little inconsistent so you can try forcing the delivery of certain pages with the inurl: modifier. One last caveat that comes to mind is that if the comparison listing is similar to an internal search results page, Google may not ever list it, "Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines." from: https://support.google.com/webmasters/answer/35769 Cheers!
-
How are you merchant prices / info being displayed on your site? From your site or using IFrames?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No: 'noindex' detected in 'robots' meta tag
I'm getting an error in Search Console that pages on my site show No: 'noindex' detected in 'robots' meta tag. However, when I inspect the pages html, it does not show noindex. In fact, it shows index, follow. Majority of pages show the error and are not indexed by Google...Not sure why this is happening. Unfortunately I can't post images on here but I've linked some url's below. The page below in search console shows the error above... https://mixeddigitaleduconsulting.com/ As does this one. https://mixeddigitaleduconsulting.com/independent-school-marketing-communications/ However, this page does not have the error and is indexed by Google. The meta robots tag looks identical. https://mixeddigitaleduconsulting.com/blog/leadership-team/jill-goodman/ Any and all help is appreciated.
Technical SEO | | Sean_White_Consult0 -
Page missing from Google index
Hi all, One of our most important pages seems to be missing from the Google index. A number of our collections pages (e.g., http://perfectlinens.com/collections/size-king) are thin, so we've included a canonical reference in all of them to the main collection page (http://perfectlinens.com/collections/all). However, I don't see the main collection page in any Google search result. When I search using "info:http://perfectlinens.com/collections/all", the page displayed is our homepage. Why is this happening? The main collection page has a rel=canonical reference to itself (auto-generated by Shopify so I can't control that). Thanks! WUKeBVB
Technical SEO | | leo920 -
3,511 Pages Indexed and 3,331 Pages Blocked by Robots
Morning, So I checked our site's index status on WMT, and I'm being told that Google is indexing 3,511 pages and the robots are blocking 3,331. This seems slightly odd as we're only disallowing 24 pages on the robots.txt file. In light of this, I have the following queries: Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed? As there are only 24 URLs being disallowed on robots.text, why are 3,331 pages being blocked? Will these be variations of the URLs we've submitted? Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help? I think I know the answer to this, but is there any way to ascertain which pages are being blocked? Thanks in advance! Lewis
Technical SEO | | PeaSoupDigital0 -
Medium sizes forum with 1000's of thin content gallery pages. Disallow or noindex?
I have a forum at http://www.onedirection.net/forums/ which contains a gallery with 1000's of very thin-content pages. We've currently got these photo pages disallowed from the main googlebot via robots.txt, but we do all the Google images crawler access. Now I've been reading that we shouldn't really use disallow, and instead should add a noindex tag on the page itself. It's a little awkward to edit the source of the gallery pages (and keeping any amends the next time the forum software gets updated). Whats the best way of handling this? Chris.
Technical SEO | | PixelKicks0 -
How do I fix issue regarding near duplicate pages on website associated to city OR local pages?
I am working on one e-commerce website where we have added 300+ pages to target different local cities in USA. We have added quite different paragraphs on 100+ pages to remove internal duplicate issue and save our website from Panda penalty. You can visit following page to know more about it. And, We have added unique paragraphs on few pages. But, I have big concerns with other elements which are available on page like Banner Gallery, Front Banner, Tool and few other attributes which are commonly available on each pages exclude 4 to 5 sentence paragraph. I have compiled one XML sitemap with all local pages and submitted to Google webmaster tools since 1st June 2013. But, I can see only 1 indexed page by Google on Google webmaster tools. http://www.bannerbuzz.com/local http://www.bannerbuzz.com/local/US/Alabama/Vinyl-Banners http://www.bannerbuzz.com/local/MO/Kansas-City/Vinyl-Banners and so on... Can anyone suggest me best solution for it?
Technical SEO | | CommercePundit0 -
Has Google stopped rendering author snippets on SERP pages if the author's G+ page is not actively updated?
Working with a site that has multiple authors and author microformat enabled. The image is rendering for some authors on SERP page and not for others. Difference seems to be having an updated G+ page and not having a constantly updating G+ page. any thoughts?
Technical SEO | | irvingw0 -
When rankings dip what's the best diagnostic procedure?
Bonjourno from 10 degrees C lighly raining Wetherby UK 🙂 Every so often SEO feels like a game of snakes & ladders. One minute your rankings go up and then then within the click of a mouse they drop back down. Like a Greek play you begin to feel our mortal lives as SEO pundits is controlled by the Google Gods. A case in point is illustrated here in this graph:
Technical SEO | | Nightwing
http://i216.photobucket.com/albums/cc53/zymurgy_bucket/lincoln-drop_zpseeb04690.jpg Now if i want to explain why the rapid dip has occured for target term "Lincoln Solicitors" here's is what i'd do: 1. Go to webmaster tools and check for crawl errors 2. See if a Google algo change has changed the rules of engagment 3. Check another site administrator hasnt tinkered with the original layout But i wonder what process do other SEO practitioners follow to explain to a disgruntled client - "Why have my rankings that i pay you to look after nose dived?" Any insights welcome:-)0 -
What's the SEO impact of url suffixes?
Is there an advantage/disadvantage to adding an .html suffix to urls in a CMS like WordPress. Plugins exist to do it, but it seems better for the user to leave it off. What do search engines prefer?
Technical SEO | | Cornucopia0