Google indexing less url's then containded in my sitemap.xml
-
My sitemap.xml contains 3821 urls but Google (webmaster tools) indexes only 1544 urls. What may be the cause? There is no technical problem. Why does Google index less URLs then contained in my sitemap.xml?
-
Thank you for helping
-
Unless you have a SEO actively reviewing your site, it is quite normal for Google to index less pages then are offered in your sitemap.
How exactly was your sitemap created? Did you go by hand through your site's 3281 pages and add them to a sitemap? Or more likely, did you use a tool to create the sitemap? If you used a tool, how much knowledge do you have regarding how this tool works or its settings?
Just a few examples of URLs which may be included in your sitemap that Google would likely not index:
-
Your home page and other pages may have multiple URLs which lead to the same page. For example: www.mysite.com and www.mysite.com/index.html may be two URLs for the same page. Google will likely only index one of them.
-
You may have links to various URLs which contain parameters which Google will reduce to a single URL. For example: www.mysite.com/product_id=308&sort=asc&color=black, and another URL www.mysite.com/product_id=308&sort=desc&color=black. Both URLs lead to the same content sorted differently.
-
You may have duplicate content on your site. For example, you can sell chairs and list the same chair under multiple paths such as /furniture/wood/chair123 and /furniture/dining-room/chair123. Google will recognize these two pages are the same content presented under multiple URLs.
-
You may have submitted pages to your sitemap which are blocked via robots.txt or the "noindex" tag or are canonicalized to another page.
In order to better understand the root issue you need to examine a list of all URLs in your sitemap and compare that to a list of all indexed URLs. Determine which URLs Google has not indexed and research the reason for each one independently.
-
-
Are they index worthy?
Having them on your sitemap does not mean google wants them in its index
-
He just said it. Is this a new domain? Im in the same boat as you for some of my domains.
-
Yes, I understand this. But In this situation Google first indexes all the URL's within my sitemap.xml uploaded in Google Webmaster tools. Now Google indexes less URL's, only 50%. What can be the cause if there are no technical problems?
-
Hi!
Google will only spend 'so much time' on any new domain. The more traffic and links and page authority you get, the more time Google will dedicate to crawling your website. You should also make sure that the site is not slow, as this will reduce the crawling speed even more! See Google page speed for tips on speeding up the load time of your site
Good Luck,
Sven Witteveen
Expand Online
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing Personal content from Google Index
Hi everyone, A user is complaining that her name is appearing in google search through our job ads site, so I removed such ads through Search Console, but the problem is not the ads anymore but our internal search results. The ads are no longer live but our searches has been indexed by google back then, We have been manually taking over 500 pages that included such name but more and more keep coming through pagination, we haven't found a pattern yet so pretty much any search result might have contained such name. We might get some legal issues here, did you guys got into anything similar before? We have just set some rules so that this doesn't happen again, but still can't find a way to deal with this one. Thanks in advance. PD: Not sure if this is the right category to fit it.
Technical SEO | | JoaoCJ0 -
Google Indexing Pages with Made Up URL
Hi all, Google is indexing a URL on my site that doesn't exist, and never existed in the past. The URL is completely made up. Anyone know why this is happening and more importantly how to get rid of it. Thanks 🙂
Technical SEO | | brian-madden0 -
Xml sitemaps giving 404 errors
We have recently made updates to our xml sitemap and have split them into child sitemaps. Once these were submitted to search console, we received notification that the all of the child sitemaps except 1 produced 404 errors. However, when we view the xml sitemaps in a browser, there are no errors. I have also attempted crawling the child sitemaps with Screaming Frog and received 404 responses there as well. My developer cannot figure out what is causing the errors and I'm hoping someone here can assist. Here is one of the child sitemaps: http://www.sermonspice.com/sitemap-countdowns_paged_1.xml
Technical SEO | | ang0 -
Why seomoz.org still in Google index?
I searched in Google, the number of URLs indexed left in the seomoz.org domain since it changed to moz.comI am surprised that after all this time more than 15,000 URLs indexed:https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=site%3Aseomoz.org%20inurl%3Aseomoz.org If I clicked on any of the results it will be redirect (301) to the new domain, so it is working, but Google still keep these URLs in the index.
Technical SEO | | Yosef
What could be the reason?Will not cause duplicated content issue on moz.com?0 -
Unused url 'A' contains frameset - can it damage the other site B?
Client has an old unused site 'A' which I've discovered during my backlink research. It contains this source code below which frames the client's 'proper' site B inside the old unused url A in the browser address. Quick question - will google penalise the website B which is the one I'm optimising? Should the client be using a redirect instead? <frameset <span class="webkit-html-attribute-name">border='0' frameborder='0' framespacing='0'></frameset <span> <frame src="http: www.clientwebsite.co.ukb" frameborder="0" noresize="noresize" scrolling="yes"></frame src="http:> Please go to http://www.clientwebsite.co.ukB <noframes></noframes> Thanks, Lu.
Technical SEO | | Webrevolve0 -
What's the best way to handle Overly Dynamic Url's?
So my question is What the best way to handle Overly Dynamic Url's. I am working on a real estate agency website. They are selling/buying properties and the url is as followed. ttp://www.------.com/index.php?action=calculator&popup=yes&price=195000
Technical SEO | | Angelos_Savvaidis0 -
Sitemap.xml showing up in Google Search
Hello when I do a Google search my sitemap.xml shows up for lots of queries. Does anyone have any advise on this? Should I remove url in Google Webmaster? Thanks,
Technical SEO | | Socialdude0 -
We changed the URL structure 10 weeks ago and Google hasn't indexed it yet...
We recently modified the whole URL structure on our website, which resulted in huge amount of 404 pages changing them to nice human readable urls. We did this in the middle of March - about 10 weeks ago... We used to have around 5000 404 pages in the beginning, but this number is decreasing slowly. (We have around 3000 now). On some parts of the website we have also set up a 301 redirect from the old URLs to the new ones, to avoid showing a 404 page thus making the “indexing transmission”, but it doesn’t seem to have made any difference. We've lost a significant amount of traffic, because of the URL changes, as Google removed the old URLs, but hasn’t indexed our new URLs yet. Is there anything else we can do to get our website indexed with the new URL structure quicker? It might also be useful to know that we are a page rank 4 and have over 30,000 unique users a month so I am sure Google often comes to the site quite often and pages we have made since then that only have the new url structure are indexed within hours sometimes they appear in search the next day!
Technical SEO | | jack860