Google Webmaster Tools Sitemap errors for phantom urls?
-
Two weeks ago we changed our urls so the correct addresses are all lowercase. Everything else 301 redirects to those. We have submitted and made sure that Google has downloaded our updated sitemap several times since.
Even so, Webmaster Tools is reporting 33,000+ errors in our sitemap for URLs that are no longer in it and haven't been for weeks. It claims to have found the errors within the last couple of days, but the sitemap was updated a couple of weeks ago and Google has downloaded it at least three times since.
Here is our sitemap: http://www.aquinasandmore.com/urllist.xml
Here are a couple of urls that Webmaster Tools says are in the sitemap:
http://www.aquinasandmore.com/catholic-gifts/Caroline-Gerhardinger-Large-Sterling-Silver-Medal/sku/78664
Redirect error, unavailable
Oct 7, 2011
http://www.aquinasandmore.com/catholic-gifts/Catherine-of-Bologna-Small-Gold-Filled-Medal/sku/78706
Redirect error, unavailable
Oct 7, 2011
-
"How long does the actual data usually take to catch up with what WMT says is current?"
I have not experienced any delay before. There should only be one sitemap record for your site at any time. That record could be composed of multiple files, but it is one collection of records.
When Google identifies crawl errors, those errors should be generated from the sitemap on file at the time of the error. There is a view sitemap option in Google WMT you can use to see the sitemap they have on file. Checking that would be the next step. If you can confirm the bad URL does not appear in the sitemap, I would then wait to see if the issue reappears after today, October 11th.
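If you want to run that check in bulk rather than by eye, here is a minimal Python sketch. It assumes you have the sitemap XML as text and a list of the URLs WMT reported; the function names and the example.com sample data are made up for illustration, and elements are matched by local name since I am not assuming which sitemap namespace your file declares.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = set()
    for elem in root.iter():
        # Compare the local tag name only, ignoring any XML namespace prefix.
        if elem.tag.rsplit("}", 1)[-1] == "loc" and elem.text:
            urls.add(elem.text.strip())
    return urls


def reported_urls_in_sitemap(xml_text, reported):
    """Return the reported URLs that really are listed in the sitemap."""
    listed = sitemap_urls(xml_text)
    return [url for url in reported if url in listed]


# Hypothetical, shortened sitemap used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/catholic-gifts/sample-medal/sku/1</loc></url>
</urlset>"""

# Hypothetical list standing in for the URLs shown on the WMT errors tab.
REPORTED = [
    "http://www.example.com/catholic-gifts/sample-medal/sku/1",
    "http://www.example.com/catholic-gifts/Sample-Medal/sku/1",
]

if __name__ == "__main__":
    for url in reported_urls_in_sitemap(SAMPLE_SITEMAP, REPORTED):
        print("still listed in sitemap:", url)
```

The comparison is an exact, case-sensitive string match on purpose, since the question here is whether the mixed-case address itself is still in the file.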
I know this is frustrating, but the system is very straightforward. I cannot explain why a URL not included in your sitemap would appear on your sitemap crawl errors tab. The only two possibilities I can come up with are that you made an error when sharing some information, or that there is an unusual glitch on Google's end.
With all the above noted, working with sitemaps is not a good investment of your time. If your site navigation is properly designed, your sitemap offers no benefit whatsoever.
-
"then these links should not appear going forward." - They are showing up now even though Google says they have our latest sitemap and that the errors were found yesterday. How long does the actual data usually take to catch up with what WMT says is current?
The image URLs are built from the actual title on the fly and don't 301, so those aren't a problem. The other one you mentioned does need to be cleaned up in the sitemap. Thanks for catching that.
These errors are showing up when I go to the crawl errors section and click the sitemap tab. Yes, the sitemap I shared is the same one in WMT.
-
I was unable to locate the URLs you listed in your sitemap. If your Google WMT settings are correct and the sitemap you shared is the same one listed in your Google WMT account, then these links should not appear going forward.
You would need to examine your Google WMT account closely to determine the exact source of these errors.
Where exactly within your Google WMT account are you seeing these errors? How are you identifying that these URLs come from your sitemap?
"Two weeks ago we changed our urls so the correct addresses are all lowercase."
There are many URLs in your sitemap which are not lowercase. An example:
http://www.aquinasandmore.com/title/Brian-Kolodiejchuk/FuseAction/store.AuthorSearch/Author/2337/
You also list a lot of image URLs which are not lowercase either.
I would not necessarily advise cleaning up the entire site, but at least establish the best practice going forward.
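A quick way to list the entries that still have uppercase letters is a script along these lines. This is only a sketch: the function name and the sample sitemap are hypothetical, it checks the path portion of each URL only, and it picks up any element whose local name is "loc", which would include image entries if your sitemap has them.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def mixed_case_urls(xml_text):
    """Return sitemap URLs whose path contains at least one uppercase letter."""
    root = ET.fromstring(xml_text)
    flagged = []
    for elem in root.iter():
        # Match <loc> regardless of which namespace the sitemap declares.
        if elem.tag.rsplit("}", 1)[-1] != "loc" or not elem.text:
            continue
        url = elem.text.strip()
        path = urlsplit(url).path
        if path != path.lower():
            flagged.append(url)
    return flagged


# Hypothetical, shortened sitemap used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/catholic-gifts/sample-medal/sku/1</loc></url>
  <url><loc>http://www.example.com/title/Some-Author/Author/2/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in mixed_case_urls(SAMPLE_SITEMAP):
        print("not lowercase:", url)
```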