Sitemap - % of URL's in Google Index?
-
What is the average % of links from a sitemap that are included in the Google index? Obviously want to aim for 100% of the sitemap urls to be indexed, is this realistic?
-
If all the pages in your sitemap are worthy of the Google index, then you should expect around a 100% indexation rate. On the flip side, if you reference low quality pages in your sitemap file, you will not got them indexed and may even be hurting the trust of your sitemap file. As a point in case, Bing just recently announced that if they see an error rate greater than 1% in the sitemap, then they will just ignore your sitemap file.
-
Clients, so I have no idea how they do it. It's a complex automated process for sure.
-
Wow. Do you have a third party program to build your site map files or our you using something built in house?
-
Ryan's point is important to note. 100% is achievable under the correct circumstances. I've got a client with 34 million pages on their main site (and contained within a combined 909 sitemap xml files), and they have 34 million pages indexed.
-
The percent of pages indexed varies greatly with each site. If you desire 100% of your site indexed then 100% of your site's pages should be reviewed to ensure their content is worthy of being indexed. The content should be unique, well written and properly presented. Your sitemap process also needs to be carefully reviewed. Many site owners simply set up an automated process without taking the time to ensure it is properly configured. Often pages which are blocked by robots.txt are included in the site map, and those pages will not be indexed.
Many people say "I want 100% of my site indexed" just how many people say "I want to be #1 rank in Google". Both results are achievable, but both require time and effort, and perhaps money.
-
Hi. We have a stiemap with over 250,000 URLs and we are at 87%. This is a high for us. We have never been able to get 100%. We have been trying to clean up the sitemap a bit but with so many URLs it is hard to go through it line by line. We are making more of an effort to fix the errors Google tells us about in Webmaster Tools but these only account for a fraction of the URLs apparently not indexed.
We also do site searches on Google to see how many URLs total we have in Google as our sitemap only includes "the most important" pages. Doing a search for "site:www.sierratradingpost.com" comes up with over 400,000 URLs.
For us, I don't think 100% is realistic. We have never been able to achieve it. It will be interesting to see what other SEOmozers have to report!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why did Google cache & index a different domain than my own?
We own www.homemenorca.com, a real estate website based in Spain. Pages from this domain are not being indexed: https://www.google.com/search?q=site%3Awww.homemenorca.com&oq=site%3Awww.homemenorca.com&aqs=chrome..69i57j69i58j69i59l2.3504j0j7&sourceid=chrome&ie=UTF-8Please notice that the URLs are Home Menorca, but the titles are not Home Menorca, they are Fincas Mantolan, a completely different domain and company: http://www.fincasmantolan.com/. Furthermore, when we look at Google's cache of Home Menorca, we see a different website: http://webcache.googleusercontent.com/search?q=cache%3Awww.homemenorca.com%2Fen&oq=cache%3Awww.homemenorca.com%2Fen&aqs=chrome..69i57j69i58j69i59.1311j0j4&sourceid=chrome&ie=UTF-8We reviewed Google Search Console, Google Fetch, the canonical tags, the XML sitemap, and many more items. Google Search Console accepted our XML sitemap, but is only indexing 5-10% of the pages. Google is fetching and rendering the pages properly. However, we are not seeing the correct content being indexed in Google. We have seen issues with page loading times, loading content longer than 4 seconds, but are unsure why Google would be indexing a different domain.If you have suggestions or thoughts, we would very much appreciate it.Additional Language Issue:When a user searches "Home Menorca" from America or the UK with "English" selected in their browser as their default language, they are given a Spanish result. It seems to have accurate hreflang annotations within the head section on the HTML pages, but it is not working properly. Furthermore, Fincas Mantolan's search result is listed immediately below Home Menorca's Spanish result. We believe that if we fix the issue above, we will also fix the language issue. Please let us know any thoughts or recommendations that can help us. Thank you very much!
Intermediate & Advanced SEO | | CassG12340 -
I'm noticing that URL that were once indexed by Google are suddenly getting dropped without any error messages in Webmasters Tools, has anyone seen issues like this before?
I'm noticing that URLs that were once indexed by Google are suddenly getting dropped without any error messages in Webmasters Tools, has anyone seen issues like this before? Here's an example:
Intermediate & Advanced SEO | | nystromandy
http://www.thefader.com/2017/01/11/the-carter-documentary-lil-wayne-black-lives-matter0 -
Facets Being Indexed - What's the Impact?
Hi Our facets are from what I can see crawled by search engines, I think they use javascript - see here http://www.key.co.uk/en/key/lockers I want to get this fixed for SEO with an ajax solution - I'm not sure how big this job is for developers, but they will want to know the positive impact this could have & whether it's worth doing. Does anyone have any opinions on this? I haven't encountered this before so any help is welcome 🙂
Intermediate & Advanced SEO | | BeckyKey0 -
Google Webmaster Tools -> Sitemap suddent "indexed" drop
Hello MOZ, We had an massive SEO drop in June due to unknown reasons and we have been trying to recover since then. I've just noticed this yesterday and I'm worried. See: http://imgur.com/xv2QgCQ Could anyone help by explaining what would cause this sudden drop and what does this drop translates to exactly? What is strange is that our index status is still strong at 310 pages, no drop there: http://imgur.com/a1sRAKo And when I do search on google site:globecar.com everything seems normal see: http://imgur.com/O7vPkqu Thanks,
Intermediate & Advanced SEO | | GlobeCar0 -
Is it a good or bad idea (in Google's eyes) to add a forum to my website?
I have an active website with many users adding dozens of comments on the many pages of the site daily. I'm am wondering if it would be good for the overall ranking strength of the site if I were to add a forum to it (in a subdirectory, like forum.mysite.com). On one hand, I can see the forum posts as thin content, which Google wouldn't care for. On the other hand, I see the additional user engagement on the site, which I think Google would like. I know the benefits it can have to the users, but for this question, all I want to know is if this would be seen by Google as a plus or a minus for my site, assuming the forum succeeded in becoming popular. I don't want to do anything that will diminish the value of my site in Google's eyes. Thank you.
Intermediate & Advanced SEO | | bizzer0 -
Will Canonical tag on parameter URLs remove those URL's from Index, and preserve link juice?
My website has 43,000 pages indexed by Google. Almost all of these pages are URLs that have parameters in them, creating duplicate content. I have external links pointing to those URLs that have parameters in them. If I add the canonical tag to these parameter URLs, will that remove those pages from the Google index, or do I need to do something more to remove those pages from the index? Ex: www.website.com/boats/show/tuna-fishing/?TID=shkfsvdi_dc%ficol (has link pointing here)
Intermediate & Advanced SEO | | partnerf
www.website.com/boats/show/tuna-fishing/ (canonical URL) Thanks for your help. Rob0 -
Urls missing from product_cat sitemap
I'm using Yoast SEO plugin to generate XML sitemaps on my e-commerce site (woocommerce). I recently changed the category structure and now only 25 of about 75 product categories are included. Is there a way to manually include urls or what is the best way to have them all indexed in the sitemap?
Intermediate & Advanced SEO | | kisen0 -
Indexing/Sitemap - I must be wrong
Hi All, I would guess that a great number of us new to SEO (or not) share some simple beliefs in relation to Google indexing and Sitemaps, and as such get confused by what Web master tools shows us. It would be great if somone with experience/knowledge could clear this up for once and all 🙂 Common beliefs: Google will crawl your site from the top down, following each link and recursively repeating the process until it bottoms out/becomes cyclic. A Sitemap can be provided that outlines the definitive structure of the site, and is especially useful for links that may not be easily discovered via crawling. In Google’s webmaster tools in the sitemap section the number of pages indexed shows the number of pages in your sitemap that Google considers to be worthwhile indexing. If you place a rel="canonical" tag on every page pointing to the definitive version you will avoid duplicate content and aid Google in its indexing endeavour. These preconceptions seem fair, but must be flawed. Our site has 1,417 pages as listed in our Sitemap. Google’s tools tell us there are no issues with this sitemap but a mere 44 are indexed! We submit 2,716 images (because we create all our own images for products) and a disappointing zero are indexed. Under Health->Index status in WM tools, we apparently have 4,169 pages indexed. I tend to assume these are old pages that now yield a 404 if they are visited. It could be that Google’s Indexed quotient of 44 could mean “Pages indexed by virtue of your sitemap, i.e. we didn’t find them by crawling – so thanks for that”, but despite trawling through Google’s help, I don’t really get that feeling. This is basic stuff, but I suspect a great number of us struggle to understand the disparity between our expectations and what WM Tools yields, and we go on to either ignore an important problem, or waste time on non-issues. Can anyone shine a light on this for once and all? If you are interested, our map looks like this : http://www.1010direct.com/Sitemap.xml Many thanks Paul
Intermediate & Advanced SEO | | fretts0