Substantial difference between Number of Indexed Pages and Sitemap Pages
-
Hey there,
I am doing a website audit at the moment.
I've noticed substantial differences between the number of pages indexed (Search Console), the number of pages in the sitemap, and the number I get when I crawl the site with Screaming Frog (see below). Would those discrepancies concern you? The website and its rankings seem fine otherwise.
Total indexed: 2,360 (Search Console)
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screaming Frog Spider: 1,352 URLs
Cheers,
Jochen
-
Those discrepancies would not concern me, but the numbers you list each measure something different:
Total indexed: 2,360 (Search Console) - this is likely a reasonably accurate count of the pages you have indexed in Google. You could use a tool like URL Profiler to check the index status of specific URLs.
About 2,920 results (Google search "site:example.com") - a site: search is less accurate and will likely return a different number each time you run it, even moments apart.
Sitemap: 1,229 URLs - these are URLs you added to a sitemap because they are priority pages you want to make sure Google has indexed and, hopefully, ranked. You control this number.
Screaming Frog Spider: 1,352 URLs - Screaming Frog starts on your homepage and crawls the site, attempting to discover as many URLs as possible. If you are not linking to a page, Screaming Frog won't be able to crawl it. Google, on the other hand, may have old pages, old URL structures, or pages that were linked from an external website in its index, and it won't forget them.
A really important question is: how many pages do you have that you want to be indexed? Is Google's index bloated with pages that you want to keep out? Figure these things out, and then adjust your sitemaps, noindex tags, and robots.txt as needed.
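If it helps, here is a minimal Python sketch of one way to see where the sitemap and the crawl disagree. It assumes you have the sitemap as an XML string and the crawl exported as a plain list of URLs; the function names `sitemap_urls` and `compare_url_sets` are made up for illustration and are not part of any tool mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_url_sets(sitemap, crawled):
    """Split two URL sets into sitemap-only, crawl-only, and shared URLs."""
    return {
        "only_in_sitemap": sorted(sitemap - crawled),  # listed but not linked internally
        "only_in_crawl": sorted(crawled - sitemap),    # linked but not listed
        "in_both": sorted(sitemap & crawled),
    }
```

URLs that appear only in the crawl are candidates to add to the sitemap or to noindex; URLs that appear only in the sitemap may be orphaned pages with no internal links.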
-
Thanks for your reply Dmitrii,
We have excluded all query parameters in Search Console, so this shouldn't be an issue. What is also strange is that when I try to scrape the SERPs via a site:example.com search, Google only shows a fraction (about 700) of the 2,920 results.
Cheers,
Jochen
-
Hi there.
I think that as long as rankings are good (especially historically), there is no reason to worry, because Google indexes pages that wouldn't be in the sitemap - for example, pages generated with query parameters (domain.com?x=value). Sometimes these pages do not really exist by themselves (like filters in online stores); they only exist "on the fly".
Hope this makes sense and helps.
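To check whether parameterized URLs account for the gap, a rough sketch like the following could be run over any list of URLs you can export (a crawl, server logs, and so on). The helper names `strip_query` and `parameter_variants` are hypothetical, and this only counts URLs in your own list; it says nothing about what Google has actually indexed.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def parameter_variants(urls):
    """Count, per base URL, how many of the given URLs carry a query string."""
    counts = Counter()
    for url in urls:
        if urlsplit(url).query:
            counts[strip_query(url)] += 1
    return counts
```

A base URL with a high count is a page that exists in many "on the fly" variants, such as a filtered category page.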