Not All Submitted URLs in Sitemap Get Indexed
-
Hey Guys,
I just noticed that about 20% of the URLs I submitted in my sitemap don't get indexed, at least according to Webmaster Tools: there is a difference of about 20% between submitted and indexed URLs. As far as I can see, Webmaster Tools doesn't tell me which specific URLs from the sitemap are not indexed, right?
So I checked every single page in the sitemap manually by putting site:"URL" into Google, and every single page shows up. In reality every page seems to be indexed, so why does Webmaster Tools show something different?
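For anyone who wants to repeat that manual check on a larger sitemap, here is a minimal sketch that pulls the URLs out of a sitemap and builds one site: query per URL. The function names and the example.com sitemap are made up for illustration, and the queries still have to be run in Google by hand.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the text of every <loc> element in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def site_queries(urls):
    """Build one site: query string per URL (hypothetical helper)."""
    return [f"site:{url}" for url in urls]


# Hypothetical sitemap content, used only to demonstrate the helpers.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
</urlset>"""

for query in site_queries(sitemap_urls(EXAMPLE)):
    print(query)
```

This only saves the copy-and-paste work; it does not say anything about why the Webmaster Tools count differs.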
Thanks for your help on this
Cheers
-
Thanks Dan, but I have registered the right URL (http).
However, today 100% of the submitted URLs are shown as indexed again (I changed nothing). Really crazy.
Cheers,
Heiko
-
This can happen if you don't have the correct version of your URL registered in Webmaster Tools, so that's something to check.
-
Hi There
One thing to check - do you have the exact version of your domain registered in Webmaster Tools? So www or non-www, and http or https? This has to be exact: Webmaster Tools considers them all different sites, and you can get limited data if the wrong one is registered.
That would be the biggest cause of a discrepancy. If this is not the case, Webmaster Tools data can often lag behind the index or differ from it. I would go with what you see in actual Google searches as the "final answer", though.
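To make the "exact version" point concrete, here is a small sketch that lists the four host and protocol combinations for a given URL, so you can compare them with what is actually registered. The helper name and the example.com URL are assumptions for illustration only.

```python
from urllib.parse import urlsplit


def property_variants(url):
    """List the http/https and www/non-www root URLs for a site URL.

    Hypothetical helper: it only builds strings to compare by eye and
    does not talk to Webmaster Tools.
    """
    host = urlsplit(url).hostname or ""
    # Strip a leading "www." so both host forms can be rebuilt.
    bare = host[4:] if host.startswith("www.") else host
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


for variant in property_variants("http://www.example.com/some/page"):
    print(variant)
```

Only one of these should match the URLs listed in your sitemap; that is the one to look at for sitemap data.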
-
I get the same thing. Nobody on here seems to know the answer (I asked a similar question in the last week or so) - if the pages are there when you do a manual search then I wouldn't sweat it. I have taken the view that it's not worth worrying about!
Good luck Amelia
-
I haven't changed the sitemap in the last 4 months. At the beginning the numbers matched exactly, so submitted and indexed URLs were the same. But this week I noticed that about 20% are now shown as not indexed any more. That confused me, but the manual check showed that everything is OK.
However, I would just like to know why there is this difference in Webmaster Tools...
Cheers
-
This is clear, but it has nothing to do with my original question. I just wanted to know why Webmaster Tools doesn't display the right number of indexed pages from the sitemap. It would simply be the easiest way to notice when some pages get de-indexed for whatever reason.
-
Hi there
This is pretty common. The numbers Google shows in Webmaster Tools sometimes differ from what actually appears in the index. When did you submit your sitemap?
Here are some reasons that Google may not index all of your pages.
Check your robots.txt to be sure, but give yourself a bit of time for the indexing number in WMT to update. The good news is that you are seeing your pages in search - so that's a positive.
I would also check whether you have any duplicate or thin content on the website or dynamic URLs in your sitemap, how deep your pages go (this is especially important due to crawl budgets), and your website's canonical tag situation.
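A rough sketch of how a few of these checks could be scripted for one sitemap URL: whether robots.txt blocks it, whether it carries a query string (a simple stand-in for "dynamic"), and how many path segments it has. The function name, the "Googlebot" user agent default, and the sample robots.txt are assumptions; Python's built-in parser does not replicate every Google matching rule, and path depth is only a proxy for how many clicks deep a page sits.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def audit_url(url, robots_txt, user_agent="Googlebot"):
    """Run three quick checks on one URL (hypothetical helper)."""
    parser = RobotFileParser()
    # Parse robots.txt from a string so no network request is needed.
    parser.parse(robots_txt.splitlines())
    parts = urlsplit(url)
    return {
        "blocked_by_robots": not parser.can_fetch(user_agent, url),
        "dynamic": bool(parts.query),  # True when a query string is present
        "depth": len([seg for seg in parts.path.split("/") if seg]),
    }


# Made-up robots.txt content for the demonstration.
ROBOTS = "User-agent: *\nDisallow: /private/\n"

print(audit_url("http://www.example.com/blog/post", ROBOTS))
print(audit_url("http://www.example.com/private/a?id=1", ROBOTS))
```

Duplicate or thin content and canonical tags need a look at the page HTML itself, so they are left out here.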
These are some things I would look into. Hope this helps! Good luck!
-
A sitemap does not ensure you are in the index; it just informs the search engine about your site.
In fact, Bing suggests you only put hidden pages and important pages in the sitemap.
IMO sitemaps are overrated unless you have something special to tell the search engines about or a very large site; they will find your pages by crawling your site normally.