Why extreme drop in number of pages indexed via GWMT sitemaps?
-
Any tips on why our GWMT Sitemaps indexed pages dropped to 27% of total submitted entries (2290 pages submitted, 622 indexed)? Already checked the obvious Test Sitemap, valid URLs etc. We had typically been at 95% of submitted getting indexed.
-
Thanks, that covers it!
-
Yes, this is the norm. You will generally have a variety of update frequencies in your XML sitemap. If you look at your sitemap you will usually see a changefreq value (daily, weekly and so on) alongside a priority value from 0.0 to 1.0. The changefreq value says how often the page is expected to change; priority is a separate hint about how important the page is relative to the rest of your site. Both are hints rather than commands, but Googlebot will generally adhere to your guidelines and crawl those pages when you tell it they are updated. If all of your pages are set to the same frequency, which they shouldn't be, Google will generally only crawl a certain amount of data on your site on a given crawl. So, a slow increase in indexed pages is the norm.
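To see which update-frequency and priority values a sitemap actually declares, something like the following could help. It is only a minimal sketch using the Python standard library; the sample sitemap and its example.com URLs are made up for illustration and are not from the site discussed here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace defined by the sitemaps.org 0.9 protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample sitemap; the URLs are placeholders.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/about</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.example.com/archive</loc>
  </url>
</urlset>
"""


def summarize_sitemap(xml_text):
    """Count the <changefreq> and <priority> values across all <url> entries."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    changefreq = Counter()
    priority = Counter()
    for url in root.findall("sm:url", NS):
        # Entries that omit a tag are counted under "(unset)".
        changefreq[url.findtext("sm:changefreq", default="(unset)", namespaces=NS)] += 1
        priority[url.findtext("sm:priority", default="(unset)", namespaces=NS)] += 1
    return changefreq, priority


if __name__ == "__main__":
    freq_counts, priority_counts = summarize_sitemap(SAMPLE_SITEMAP)
    print("changefreq:", dict(freq_counts))
    print("priority:", dict(priority_counts))
```

If every entry lands in the same bucket, the sitemap is not telling crawlers anything about which pages change more often than others.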
-
Yes, looking back at the change logs was helpful. Canonical tags were it! We found a bug: the canonical page tags were being truncated at 8 characters. The number of pages indexed has started to increase rather than decrease, so it appears the issue is resolved. But I would have thought the entire sitemap would get indexed once the issue was resolved, rather than seeing small increases each day. Is a slow increase back to normal what you would expect, rather than getting back to nearly 100% indexed overnight?
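A bug like this can be caught by comparing each page's canonical tag against the URL you expect it to contain. The sketch below uses only the Python standard library. The HTML snippets and example.com URLs are hypothetical, and the truncated href is just one guess at what an 8-character truncation might look like.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_matches(html, expected_url):
    """True only when the declared canonical equals the expected URL exactly."""
    return find_canonical(html) == expected_url


if __name__ == "__main__":
    expected = "http://www.example.com/page.html"  # placeholder URL
    good_page = '<head><link rel="canonical" href="http://www.example.com/page.html"></head>'
    # Hypothetical illustration of an href cut off after 8 characters.
    broken_page = '<head><link rel="canonical" href="http://w"></head>'
    for name, page in (("good", good_page), ("broken", broken_page)):
        print(name, find_canonical(page), canonical_matches(page, expected))
```

Running a check like this over the URLs in the sitemap after each release would flag a truncated or missing canonical before the indexed count starts to drop.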
-
Do you have the date of the change? Try to pin down when the drop happened, because we might be able to figure it out that way too.
WMT > sitemaps > webpages tab
Once you find the date you may be able to go through your notes and see if you've done anything around that date or if Google had any sort of update (PageRank just updated).
I have had sites that had pages unindexed and then a few crawls later they got reindexed. I just looked at 20 sites in our WMT and all of our domains look good as far as percentage of submitted vs indexed.
The only other things I can think of are to check for duplicate content, canonical tags, noindex tags and pages with little or no value (thin content), and (I've done this before) to keep your current sitemap structure but add an additional sitemap with all of your pages and posts in it. Don't break it down, just add it all to one sitemap. I've had that work before for a similar issue, but that was back in 2010. Multiple sitemaps for that site never seemed to work out. Having it all in one did the trick. The site was only about 4,000 pages at the time, but I thought I would mention it. I haven't been able to reproduce the error and no other site has had that problem, but that did do the trick.
Definitely keep an eye on it over the next few crawls. Please let us know what the results are and what you've tried so we can help troubleshoot.
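If you want to try the single combined sitemap, the merge can be scripted. This is a rough sketch with the Python standard library, assuming each existing sitemap is a plain urlset file (not a sitemap index) already loaded as a string; the function name is made up for illustration. Keep in mind that the sitemap protocol caps a single file at 50,000 URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def merge_sitemaps(xml_documents):
    """Return one <urlset> XML string holding each unique <loc> from the inputs."""
    # Serialize with the sitemap namespace as the default (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    merged = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    seen = set()
    for doc in xml_documents:
        root = ET.fromstring(doc)
        for url in root.findall(f"{{{SITEMAP_NS}}}url"):
            loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
            # Skip entries without a <loc> and URLs already added.
            if loc and loc not in seen:
                seen.add(loc)
                merged.append(url)
    return ET.tostring(merged, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical inputs; the URLs are placeholders.
    pages = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.example.com/</loc></url>"
        "<url><loc>http://www.example.com/about</loc></url>"
        "</urlset>"
    )
    posts = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.example.com/about</loc></url>"
        "<url><loc>http://www.example.com/blog/first-post</loc></url>"
        "</urlset>"
    )
    print(merge_sitemaps([pages, posts]))
```

Submitting the merged file alongside the existing sitemaps, as suggested above, lets you compare the submitted vs indexed numbers for both.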
-
We use multiple site maps.
Thanks, I had not thought about page load speed. But it turned up okay. Had already considered your other suggestions. Will keep digging. Appreciate your feedback.
-
Not sure why the drop but are you using just one sitemap or do you have multiple ones?
Check the sizes of your pages and the rate at which Google is crawling your site. If Google has an issue with the time it takes to crawl your sitemap, it will start to reduce the number of indexed pages it serves up. You can check your crawl stats by navigating to WMT, Crawl > Crawl Stats. Check to see if you've noticed any delays in the numbers.
Also, make sure that your robots.txt isn't blocking anything.
Have you checked your site with a site: search?
This is pretty basic stuff, but let us know what you've looked into so we can help you more. Thanks.
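For the robots.txt check, a quick way to test sitemap URLs against your rules is Python's built-in parser. This is a minimal sketch: the robots.txt content and example.com URLs are hypothetical, and the standard-library parser may not match Googlebot's own rule matching in every case (wildcard patterns, for instance), so treat it as a first pass only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical robots.txt and sitemap URLs for illustration.
    sample_robots = "User-agent: *\nDisallow: /private/\n"
    sitemap_urls = [
        "http://www.example.com/products.html",
        "http://www.example.com/private/page.html",
    ]
    for url in blocked_urls(sample_robots, sitemap_urls):
        print("Blocked by robots.txt:", url)
```

Any URL that shows up as blocked here but is listed in the sitemap is a candidate for the submitted vs indexed gap.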