Google indexing less url's then containded in my sitemap.xml
-
My sitemap.xml contains 3821 urls but Google (webmaster tools) indexes only 1544 urls. What may be the cause? There is no technical problem. Why does Google index less URLs then contained in my sitemap.xml?
-
Thank you for helping
-
Unless you have a SEO actively reviewing your site, it is quite normal for Google to index less pages then are offered in your sitemap.
How exactly was your sitemap created? Did you go by hand through your site's 3281 pages and add them to a sitemap? Or more likely, did you use a tool to create the sitemap? If you used a tool, how much knowledge do you have regarding how this tool works or its settings?
Just a few examples of URLs which may be included in your sitemap that Google would likely not index:
-
Your home page and other pages may have multiple URLs which lead to the same page. For example: www.mysite.com and www.mysite.com/index.html may be two URLs for the same page. Google will likely only index one of them.
-
You may have links to various URLs which contain parameters which Google will reduce to a single URL. For example: www.mysite.com/product_id=308&sort=asc&color=black, and another URL www.mysite.com/product_id=308&sort=desc&color=black. Both URLs lead to the same content sorted differently.
-
You may have duplicate content on your site. For example, you can sell chairs and list the same chair under multiple paths such as /furniture/wood/chair123 and /furniture/dining-room/chair123. Google will recognize these two pages are the same content presented under multiple URLs.
-
You may have submitted pages to your sitemap which are blocked via robots.txt or the "noindex" tag or are canonicalized to another page.
In order to better understand the root issue you need to examine a list of all URLs in your sitemap and compare that to a list of all indexed URLs. Determine which URLs Google has not indexed and research the reason for each one independently.
-
-
Are they index worthy?
Having them on your sitemap does not mean google wants them in its index
-
He just said it. Is this a new domain? Im in the same boat as you for some of my domains.
-
Yes, I understand this. But In this situation Google first indexes all the URL's within my sitemap.xml uploaded in Google Webmaster tools. Now Google indexes less URL's, only 50%. What can be the cause if there are no technical problems?
-
Hi!
Google will only spend 'so much time' on any new domain. The more traffic and links and page authority you get, the more time Google will dedicate to crawling your website. You should also make sure that the site is not slow, as this will reduce the crawling speed even more! See Google page speed for tips on speeding up the load time of your site
Good Luck,
Sven Witteveen
Expand Online
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Search Console 'Change of Address' Just 301s on source domain?
Hi all. New here, so please be gentle. 🙂 I've developed a new site, where my client also wanted to rebrand from .co.nz to .nz On the source (co.nz) domain, I've setup a load of 301 redirects to the relevant new page on the new domain (the URL structure is changing as well).
Technical SEO | | WebGuyNZ
E.G. On the old domain: https://www.mysite.co.nz/myonlinestore/t-shirt.html
In the HTACCESS on the old/source domain, I've setup 301's (using RewriteRule).
So that when **https://www.mysite.co.nz/**myonlinestore/t-shirt.html is accessed, it does a 301 to;
https://mysite.nz/shop/clothes/t-shirt All these 301's are working fine. I've checked in dev tools and a 301 is being returned. My question is, is having the 301's just on the source domain only enough, in regards to starting a 'Change of Address' in Google's Search Console? Their wording indicates it's enough but I'm concerned, maybe I also need redirects on the target domain as well? I.E. Does the Search Console Change of Address process work this way?
It looks at the source domain URL (that's already in Google's index), sees the 301 then updates the index (and hopefully pass the link juice) to the new URL. Also, I've setup both source and target Search Console properties as Domain Properties. Does that mean I no longer need to specify that the source and target properties are HTTP or HTTPS? I couldn't see that option when I created the properties. Thanks!0 -
Google Appending Blog URL inbetween my homepage and product page is it issue with base url?
Hi All, Google Appending Blog URL inbetween my homepage and product page. Is it issue or base url or relative url? Can you pls guide me? Looking to both tiny url you will get my point what i am saying. Please help Thanks!
Technical SEO | | amu1230 -
Google crawling but not indexing for no apparent reason
Client's site went secure about two months ago and chose root domain as rel canonical (so site redirects to https://rootdomain.com (no "www"). Client is seeing the site recognized and indexed by Google about every 3-5 days and then not indexed until they request a "Fetch". They've been going through this annoying process for about 3 weeks now. Not sure if it's a server issue or a domain issue. They've done work to enhance .htaccess (i.e., the redirects) and robots.txt. If you've encountered this issue and have a recommendation or have a tech site or person resource to recommend, please let me know. Google search engine results are respectable. One option would be to do nothing but then would SERPs start to fall without requesting a new Fetch? Thanks in advance, Alan
Technical SEO | | alankoen1230 -
If a URL canonically points to another link, is that URL indexed?
Hi, I have two URL both talking about keyword phrase 'counting aggregated cells' The first URL has canonical link pointing to the second URL, but if one searches for 'counting aggregated cells' both URLs are shown in the results. The first URL is the pdf, and i need only second URL (the landing page) to be shown in the search results. The canonical links should tell Google which URL to index, i don't understand why both URLs are present in search results? Is 'noindex' for the first URL only solution? I am using Yoast SEO for my website. Thank you for the answers.
Technical SEO | | Chemometec0 -
Why google indexed pages are decreasing?
Hi, my website had around 400 pages indexed but from February, i noticed a huge decrease in indexed numbers and it is continually decreasing. can anyone help me to find out the reason. where i can get solution for that? will it effect my web page ranking ?
Technical SEO | | SierraPCB0 -
Xml Sitemap
Hi mozzers, I am about to submit a sitemap for one of my clients via webmaster tools. The issue is that I have way too many urls that I don't want them to be indexed by Google such as testing pages, auto generated pages... Is there way to remove certain URL from the XML sitemap or is this impossible? If impossible, is the only way to control these urls is to "No index" all these pages that i don't want the search engine to see? Thanks Mozzers,
Technical SEO | | Ideas-Money-Art0 -
Sitemaps for Google
In Google Webmaster Central, if a URL is reported in your site map as 404 (Not found), I'm assuming Google will automatically clean it up and that the next time we generate a sitemap, it won't include the 404 URL. Is this true? Do we need to comb through our sitemap files and remove the 404 pages Google finds, our will it "automagically" be cleaned up by Google's next crawl of our site?
Technical SEO | | Prospector-Plastics0 -
Google shows the wrong domain for client's homepage
Whenever the homepage of my client's homepage appears in Google results, the search engine is not showing our URL as our domain, but instead a partner domain that is linking to us. (The correct title and meta description of our homepage is showing.) I believe this is caused by the partner website (with a much higher pank rank) linking to our homepage from their footer to a URL with it's own domain that 302 redirects to our homepage. Example: Link: http://www.partnerwebsite.com/?ad2203 302 redirects to: http://www.clientwebsite.com/?moreadtracking The simple fix would be for the client to ask for removal of the 302 hijacking link - but they are uncomfortable with this request since they had requested it prior, and their relationship is not the best. Is there any other way to fix this?
Technical SEO | | Conor_OShea_ETUS0