Big discrepancies between pages in Google's index and pages in sitemap
-
Hi,
I'm noticing a huge difference in the number of pages in Googles index (using 'site:' search) versus the number of pages indexed by Google in Webmaster tools. (ie 20,600 in 'site:' search vs 5,100 submitted via the dynamic sitemap.)
Anyone know possible causes for this and how i can fix?
It's an ecommerce site but i can't see any issues with duplicate content - they employ a very good canonical tag strategy. Could it be that Google has decided to ignore the canonical tag?
Any help appreciated,
Karen
-
Take a look at the pages that are indexed. Chances are that since it is a cart or CMS-based site, you just need to use robots.txt to block out some areas you don't want indexed. You also need to look at your indexed pages, to see if any of them are duplicates, meaning you have 2 or more url's that display the same content.
"It's an ecommerce site but i can't see any issues with duplicate content - they employ a very good canonical tag strategy. Could it be that Google has decided to ignore the canonical tag? "
Could be that your cms or cart is not forwarding all the pages to the canonical version. Again, check to see if you can access multiple versions of the same page. Ecom and CMS sites always have these types of errors if you dont keep a close eye on the URL's since they are database driven, vs static HTML. Look for www or non-www versions of pages, url's with and without index.php, etc.
Once you target what the offending url's are, use redirects to forward them to the proper and search engine friendly version.
-
Hi,
Thanks so much for that, it's really interesting.
I've resolved the issue now, it was a case of some (a lot!) of missing canonical tags. Phew!
Thanks for your help!
-
Sorry wrong interpretation of your question. Have you excluded the site search pages using robots.txt.? If not this might be the reason why you've that many pages indexed.
Anyway this discussion might give you more answers:
-
Just 1 at the moment.
-
Are you using just one sitemap or multiple?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why does Google display the home page rather than a page which is better optimised to answer the query?
I have a page which (I believe) is well optimised for a specific keyword (URL, title tag, meta description, H1, etc). yet Google chooses to display the home page instead of the page more suited to the search query. Why is Google doing this and what can I do to stop it?
Intermediate & Advanced SEO | | muzzmoz0 -
Sitemap indexing
Hi everyone, Here's a duplicate content challenge I'm facing: Let's assume that we sell brown, blue, white and black 'Nike Shoes model 2017'. Because of technical reasons, we really need four urls to properly show these variations on our website. We find substantial search volume on 'Nike Shoes model 2017', but none on any of the color variants. Would it be theoretically possible to show page A, B, C and D on the website and: Give each page a canonical to page X, which is the 'default' page that we want to rank in Google (a product page that has a color selector) but is not directly linked from the site Mention page X in the sitemap.xml. (And not A, B, C or D). So the 'clean' urls get indexed and the color variations do not? In other words: Is it possible to rank a page that is only discovered via sitemap and canonicals?
Intermediate & Advanced SEO | | Adriaan.Multiply1 -
Magento: Should we disable old URL's or delete the page altogether
Our developer tells us that we have a lot of 404 pages that are being included in our sitemap and the reason for this is because we have put 301 redirects on the old pages to new pages. We're using Magento and our current process is to simply disable, which then makes it a a 404. We then redirect this page using a 301 redirect to a new relevant page. The reason for redirecting these pages is because the old pages are still being indexed in Google. I understand 404 pages will eventually drop out of Google's index, but was wondering if we were somehow preventing them dropping out of the index by redirecting the URL's, causing the 404 pages to be added to the sitemap. My questions are: 1. Could we simply delete the entire unwanted page, so that it returns a 404 and drops out of Google's index altogether? 2. Because the 404 pages are in the sitemap, does this mean they will continue to be indexed by Google?
Intermediate & Advanced SEO | | andyheath0 -
Page position dropped on Google
Hey Guys, My web designer has recommended this forum to use, the reason being: my google position has been dropped from page 1 to page 10 in the last week. The site is weloveschoolsigns.co.uk, but our main business site is textstyles.co.uk the school signs are a product of text styles. I have been told off my SEO company, that because I have changed the school logo to the text styles logo, Google have penalised me for it, and dropped us from page 1 for numerous keywords, to page 10 or more. They have also said that duplicate content within the school site http://www.weloveschoolsigns.co.uk/school-signs-made-easy/ has also a contributed to the drop in positions. (this content is not on the textstyles site) Lastly they said, that having the same telephone number is a definate no no. They said that I have been penalised, because google see the above as trying to monopolise on the market. I don’t know if all this is true, as the SEO is way above my head, but they have quoted me £1250 to repair all the errors, when the site only cost £750. They have also mentioned that because of the above changes, the main text styles site will also be punished. Any thoughts on this matter would be much appreciated as I don't know whether to pay them to crack on, or accept the new positions. Either way I'm very confused. Thanks Thomas
Intermediate & Advanced SEO | | TextStylesUK0 -
What to do when you buy a Website without it's content which has a few thousand pages indexed?
I am currently considering buying a Website because I would like to use the domain name to build my project on. Currently that domain is in use and that site has a few thousand pages indexed and around 30 Root domains linking to it (mostly to the home page). The topic of the site is not related to what I am planing to use it for. If there is no other way, I can live with losing the link juice that the site is getting at the moment, however, I want to prevent Google from thinking that I am trying to use the power for another, non related topic and therefore run the risk of getting penalized. Are there any Google guidelines or best practices for such a case?
Intermediate & Advanced SEO | | MikeAir0 -
To index or de-index internal search results pages?
Hi there. My client uses a CMS/E-Commerce platform that is automatically set up to index every single internal search results page on search engines. This was supposedly built as an "SEO Friendly" feature in the sense that it creates hundreds of new indexed pages to send to search engines that reflect various terminology used by existing visitors of the site. In many cases, these pages have proven to outperform our optimized static pages, but there are multiple issues with them: The CMS does not allow us to add any static content to these pages, including titles, headers, metas, or copy on the page The query typed in by the site visitor always becomes part of the Title tag / Meta description on Google. If the customer's internal search query contains any less than ideal terminology that we wouldn't want other users to see, their phrasing is out there for the whole world to see, causing lots and lots of ugly terminology floating around on Google that we can't affect. I am scared to do a blanket de-indexation of all /search/ results pages because we would lose the majority of our rankings and traffic in the short term, while trying to improve the ranks of our optimized static pages. The ideal is to really move up our static pages in Google's index, and when their performance is strong enough, to de-index all of the internal search results pages - but for some reason Google keeps choosing the internal search results page as the "better" page to rank for our targeted keywords. Can anyone advise? Has anyone been in a similar situation? Thanks!
Intermediate & Advanced SEO | | FPD_NYC0 -
Certain Product Pages Not Indexing
Hey All, We discovered an issue where new product pages on our site were not getting indexed because a "noindex" tag was inadvertently being added to section when those pages were created. We removed the noindex tag in late April and some of the pages that had not been previously indexed are now showing up, but others are still not getting indexed and I'd appreciate some help on why this could be. Here is an example of a page that was not in the index but is now showing after removal of noindex: http://www.cloud9living.com/san-diego/gaslamp-quarter-food-tour And here is an example of a page that is still not showing in the index: http://www.cloud9living.com/atlanta/race-a-ferrari UPDATE: The above page is now showing after I manually submitted it in WMT. I had previously submitted another page like a month ago and it was still not indexing so I thought the manual submission was a dead end. However, it just so happens that the above URL just had its Page Title and H1 updated to something more specific and less duplicative so I am currently running a test to see if that's the problem with these pages not indexing. Will update this soon. Any suggestions? Thanks!
Intermediate & Advanced SEO | | GManSEO0 -
Why my own page is not indexed for that keyword?
hi, I recently recreated the page www.zenucchi.it /ITA/poltrona-frau-brescia.html on the third level domain poltronafraubrescia.zenucchi.it by putting it on the home page. The first page is still indexed for the keyword poltrona frau brescia . But the new page is no indexed for that keyword and i don't know why ( even if the page is indexed in google ) .. I state that the new domain has the same autorithy and that i put a 301 redirect to pass his authority to the new one that has many more incoming links that did not have previous .. i hope you'll help me thanks a lot
Intermediate & Advanced SEO | | guidoboem0