Webmaster Tools Indexed pages vs. Sitemap?
-
Looking at Google Webmaster Tools, I'm noticing a pattern: on most sites I look at, the number of indexed pages in the sitemaps report is less than 100% of what was submitted (e.g. 122 indexed out of 134 submitted), while the number of indexed pages in the index status report is usually much higher. For example, one site shows over 1,000 pages indexed in the index status report, but the sitemap report says only about 122 indexed.
My question: Is the sitemap report always a subset of the URLs submitted in the sitemap? That is, will the number of pages indexed there always be lower than or equal to the number of URLs in the sitemap?
Also, if there is a big disparity between the URLs submitted in the sitemap and the total indexed URLs (like 10x), is that concerning to anyone else?
-
Unfortunately not. The closest you'll get is selecting a long period of time in Analytics and exporting all the pages that received organic search traffic. If you then cross-check that export against the full list of URLs on your site, the leftovers give you a short list of pages that may not be indexed. I would still check each of those in Google to confirm they aren't indexed. As I said, it's not a perfect method.
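The cross-check described above can be sketched in a few lines. This is a hypothetical illustration, not a Moz or Google tool: it assumes you have your sitemap XML as a string and the Analytics export already reduced to a set of URLs.

```python
# Hypothetical sketch: diff the URLs in your sitemap against the set of
# pages that received organic traffic (from an Analytics export).
# Anything left over is a candidate for a manual index check in Google.
import xml.etree.ElementTree as ET


def sitemap_urls(sitemap_xml: str) -> set:
    """Extract <loc> values from a standard sitemap XML document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", ns)}


def unseen_urls(sitemap_xml: str, analytics_urls: set) -> set:
    """Sitemap URLs that never showed up in the organic-traffic export."""
    return sitemap_urls(sitemap_xml) - analytics_urls
```

Pages in `unseen_urls` aren't necessarily unindexed (they may simply rank for nothing), which is why the manual check in Google is still needed.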
-
Is there a reliable way to determine which pages have not been indexed?
-
Great answer by Tom already, but I want to add that images and other types of content, which are usually not included in sitemaps by default, could also be among the indexed 'pages'.
-
There's no golden rule that sitemap URLs > indexed pages or vice versa.
If you have more URLs in your sitemap than you have indexed pages, you want to look at the pages not indexed to see why that is the case. It could be that those pages have duplicate and/or thin content, and so Google is ignoring them. A canonical tag might be instructing Google to ignore them. Or the pages might be off the site navigation and more than 4 links/jumps away from the homepage or any other page on the site, making them hard to find.
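The canonical-tag cause mentioned above is easy to check programmatically. Here's a rough, assumed sketch (not an official tool) that, given a URL and its fetched HTML, reports whether the page declares a canonical URL pointing somewhere else:

```python
# Hypothetical sketch: detect a rel="canonical" tag that points away
# from the page's own URL, which would explain why Google skips it.
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link" and self.canonical is None:
            d = dict(attrs)
            if (d.get("rel") or "").lower() == "canonical":
                self.canonical = d.get("href")


def canonical_mismatch(url: str, html: str) -> bool:
    """True if the page declares a canonical URL different from its own."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical is not None and parser.canonical != url
```

Run this over the sitemap URLs that aren't indexed; any page where it returns True is canonicalising itself out of the index.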
Conversely, if you have lots more pages indexed than in your sitemap, it could be a navigation or URL duplication problem. Check whether any of the indexed pages are duplicate versions caused by things like dynamic URLs generated by on-site search or the site navigation. If the pages in your sitemap are the only physical pages you have created, and you know every single one has been submitted, then any other indexed URLs are unaccounted for, and that may well be cause for concern, so check that nothing is being indexed multiple times.
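One way to spot the duplicate-URL problem described above is to normalise each indexed URL and group the raw variants that collapse onto the same page. This is a hypothetical sketch; the `NOISE_PARAMS` set and the normalisation rules (lowercase host, trailing slash, stripped query parameters) are assumptions you would tune for your own site:

```python
# Hypothetical sketch: group indexed URLs by a normalised form to spot
# the same page indexed under several addresses (dynamic search URLs,
# tracking parameters, trailing slashes, host-case differences).
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters assumed to create duplicate views of the same page.
NOISE_PARAMS = {"q", "search", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url: str) -> str:
    """Collapse cosmetic URL differences into one canonical-ish form."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in NOISE_PARAMS]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(query), ""))


def duplicate_groups(urls):
    """Map each normalised URL to the raw variants that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {k: v for k, v in groups.items() if len(v) > 1}
```

Feeding it a crawl or export of your indexed URLs surfaces exactly the "indexed multiple times" cases worth cleaning up with canonicals or redirects.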
Just a couple of scenarios, but I hope it helps.