Webmaster Tools Indexed pages vs. Sitemap?
-
Looking at Google Webmaster Tools, I'm noticing a few things. On most sites I look at, the number of indexed pages in the sitemaps report is less than 100% of what was submitted (e.g. 122 indexed out of 134 submitted), while the number of indexed pages in the index status report is usually higher. For example, one site shows over 1,000 pages indexed in the index status report, but the sitemap report says something like 122 indexed.
My question: is the sitemap report always a subset of the URLs submitted in the sitemap? Will the number of pages indexed there always be less than or equal to the number of URLs in the sitemap?
Also, if there is a big disparity between the URLs submitted in the sitemap and the indexed URLs (like 10x), is that concerning to anyone else?
-
Unfortunately not. The closest you'll get is selecting a long period of time in Analytics and exporting all the pages that received organic search traffic. If you then cross-check those against the full list of URLs on your site, the pages with no organic traffic give you a small list of candidates. I would still check them in Google to make sure they aren't indexed. As I said, it's not the best way.
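As a rough sketch of that cross-check, assuming you have exported the organic landing pages from Analytics as a CSV and have a plain-text list of your site's URLs (both file names below are hypothetical), a small Node.js script can produce the shortlist:

```javascript
// cross-check.js: list site URLs that received no organic landing-page traffic.
// Assumed (hypothetical) inputs:
//   urls.txt                   - one absolute URL per line (e.g. from your sitemap)
//   organic-landing-pages.csv  - Analytics export with the landing-page path in column 1
const fs = require('fs');

const siteUrls = fs.readFileSync('urls.txt', 'utf8')
  .split('\n').map(l => l.trim()).filter(Boolean);

const organicPaths = new Set(
  fs.readFileSync('organic-landing-pages.csv', 'utf8')
    .split('\n').slice(1)                    // skip the CSV header row
    .map(line => line.split(',')[0].trim())
    .filter(Boolean)
);

// Analytics reports landing pages as paths, so compare by pathname.
const candidates = siteUrls.filter(url => !organicPaths.has(new URL(url).pathname));

// Spot-check each of these in Google to confirm whether it is actually indexed.
console.log(candidates.join('\n'));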
-
Is there a reliable way to determine which pages have not been indexed?
-
Great answer by Tom already, but I want to add that images and other types of content, which are usually not included in sitemaps by default, could also be among the indexed 'pages'.
-
There's no golden rule that your sitemap > indexed pages or vice versa.
If you have more URLs in your sitemap than you have indexed pages, you want to look at the pages not indexed to see why that is the case. It could be that those pages have duplicate and/or thin content, so Google is ignoring them. A canonical tag might be instructing Google to ignore them (see the snippet below). Or the pages might sit outside the site navigation, more than 4 links/jumps away from the homepage or any other page on the site, making them hard to find.
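For reference, a canonical tag is a single line in the page's head; when it points at a different URL, Google will usually consolidate indexing onto that target rather than the page carrying the tag (the URL below is a placeholder):

```html
<!-- In the <head> of a duplicate page, e.g. a dynamic URL variant -->
<!-- Suggests that Google index the base URL instead of this variant -->
<link rel="canonical" href="https://www.example.com/product" />
```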
Conversely, if you have lots more pages indexed than in your sitemap, it could be a navigation or URL duplication problem. Check whether any of the pages are duplicate versions caused by things like dynamic URLs generated through on-site search or the site navigation. If the pages in your sitemap are the only physical pages you have created, and you know every single one has been submitted, then any other indexed URLs are unaccounted for; that may well be cause for concern, so check that nothing is being indexed multiple times.
Just a couple of scenarios, but I hope it helps.
Related Questions
-
Blocking Standard pages with Robots.txt (t&c's, shipping policy, pricing & privacy policies etc)
Hi, I've just had a best-practice site migration completed for my old e-commerce store into a Shopify environment, and I see in GSC that it's reporting my standard pages as blocked by robots.txt: t&c's, shipping policy, pricing policy, privacy policy, etc. Surely I don't want these blocked? Does anyone know if that's likely down to my migrators or to a default setting in Shopify? So in summary: shall I unblock these, and what caused it, Shopify default settings or more likely my migration team? All Best, Dan
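For context, Shopify stores serve a default robots.txt that disallows crawling of several standard paths, including the /policies/ URLs where pages like the privacy policy live. The excerpt below is illustrative, not a verbatim copy of Shopify's file, so check your own store's /robots.txt to confirm:

```
# Illustrative excerpt of a Shopify-style default robots.txt (not verbatim)
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /policies/
```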
Reporting & Analytics | Dan-Lawrence
-
Which page speed tools do you trust?
We are at a loss. We have been following Google PageSpeed Insights, which gives us horrible marks. Nothing we do seems to really make a difference. GTmetrix and Pingdom give us much better scores. We don't know what to do...
Reporting & Analytics | HashtagHustler
-
PPC ads: landing page vs website page ... other metrics to consider?
I run 3 PPC ads in Chinese on Baidu, of which one landing page consistently ranks around 5th amongst 80+ other English website pages. So, when our site was recently developed for Chinese language, I redirected the ads to the relevant Chinese website pages for a month to see which would attract more visitors (Chinese landing pages vs the Chinese website). I haven't fully analysed the results yet, but what other metrics should I consider beyond just the volume of visitors?
Landing page:
https://www.mogas.com/en-us/ppc-ads/mogas-球阀
https://www.mogas.com/en-us/ball-valves (in English)
Website page:
https://www.mogas.com/zh-cn/产品
https://www.mogas.com/en-us/products (in English)
Reporting & Analytics | SteveMauldin
-
Webmaster Tools V. OSE Backlinks to Disavow
Hey Everybody, So I am getting to the point where I need to disavow some backlinks. I am noticing a discrepancy between the OSE backlinks and the GWMT backlinks. My question is do I put both sets of links in the Disavow tool? Just the links from GWMT? Just the links from OSE? Etc... Etc... Thanks Everybody!
Reporting & Analytics | HashtagHustler
-
Sudden Increase In Number of Pages Indexed By Google Webmaster When No New Pages Added
Greetings MOZ Community: On June 14th Google Webmaster Tools indicated an increase in the number of indexed pages, going from 676 to 851, yet no new pages had been added to the domain in the previous month. The number of pages blocked by robots increased in that period from 332 (June 1st) to 551 (June 22nd), yet the number of indexed pages still rose to 851.
The following changes occurred between June 5th and June 15th:
-A new redesigned version of the site was launched on June 4th, with some links to social media and the blog removed on some pages, but with no new URLs added. The design platform was and is WordPress.
-Google GTM code was added to the site.
-An exception to ModSecurity was made by our hosting company on our server (for iframes) to allow GTM to function.
In the last ten days my web traffic has declined about 15%; however, the quality of traffic has declined enormously and the number of new inquiries we get is off by around 65%. Pages per visit have declined from about 2.55 to about 2. Obviously this is not a good situation.
My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline. My developer is examining the issue. They think there may be some tie-in with the installation of GTM. They are noticing an additional issue: the site's Contact Us form will not work if the GTM script is enabled. They find it curious that both issues occurred around the same time.
Our domain is www.nyc-officespace-leader. Does anyone have any idea why these extra pages are appearing and how they can be removed? Anyone have experience with GTM causing issues like this? Thanks everyone!!!
Alan
Reporting & Analytics | Kingalan1
-
Google Webmaster Tools - When will the links go away!?
About 9 months back we thought having an extremely reputable company build our client some local citations would be a good idea. You definitely know this citation company, but I'll leave names out. Regardless, it was our mistake to cut corners. Google Webmaster Tools quickly picked up these new citations and added them to the links section. One of these citations spawned a complete mess of about 60K+ links on their network of sites, through ridiculous subdomains for every state in the country and many other domain variations.
We immediately went into removal mode and had the site's webmaster take down the bad links; the outreach process took about a month. The bad links (60K+) have not been on the spam site for well over 6 months, but GWT still shows them in the "links to your site" section. Majestic, Bing, and OSE only displayed the bad links for a brief time. Why is Webmaster Tools still showing these links after 6+ months? We typically see GWT update about every 2 weeks, a month tops. Any ideas? Could a changed robots.txt on the bad site prevent Google from updating the links displayed in GWT?
We have submitted a disavow, but Google replied with "no manual penalty". We even blasted the bad site with Fiverr links, in hopes that Google would re-crawl them. No luck with anything we do. We have patiently waited for way too long. The rankings for this site got crushed on Google after these citations. How do we fix this? Should we worry about this? Any advice would really help. Thanks so much in advance.
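For anyone unfamiliar with it, the disavow file Google accepts is plain text, one entry per line, where a domain: entry disavows all links from that domain; the domains below are placeholders:

```
# Disavow file example - placeholder domains; lines starting with # are comments
domain:spammy-citation-network.example
http://another-bad-site.example/specific-page.html
```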
Reporting & Analytics | zadro
-
Posting on blog comments with anchor text on high ranked pages effective?
So I've identified some blogs which have a fairly high ranking and lots of traffic. They also allow anchor text in the name field. Does it make sense for me to comment on these blogs, or does Google treat these with less authority than true page links? Any advice is greatly appreciated! TIA
Reporting & Analytics | symbolphoto
-
Tracking pages in two separate analytics accounts
Hi All, I'm trying to track some pages on one website in two separate Google Analytics accounts. Has anybody done this before who could help with the tracking code? Thanks in advance, Elias
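One common pattern with Universal Analytics (analytics.js) is to create a second, named tracker and send each hit to both properties. A minimal sketch follows; the UA property IDs and the tracker name are placeholders:

```javascript
// Standard analytics.js loader snippet, as published by Google.
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-XXXXX-1', 'auto');                  // default tracker, account 1
ga('create', 'UA-YYYYY-2', 'auto', 'secondTracker'); // named tracker, account 2

ga('send', 'pageview');                // pageview hit for account 1
ga('secondTracker.send', 'pageview');  // same pageview for account 2
```

Note that only the pages carrying this snippet report to the second account, which matches the goal of tracking just "some pages" in two accounts.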
Reporting & Analytics | A_Q