Why might Google be crawling via old sitemap, when the new one has been submitted and verified?

scoutzie

We have recently relaunched Scoutzie.com and re-submitted our new sitemap to Google. When I look on Webmaster tools, our new sitemap has been submitted just fine, but at the same time, Google is finding a lot of 404s when crawling the site. My understanding, it is still using crawling the old links, which do not exists. How can I tell Google to refresh it's index and to stop looking at all the old links?

mihaiaperghis

Yes it should. However, as Alan mentioned below, if you still have links pointing to the 404 pages, Google will always attempt to crawl them, and will keep you informed that you have errors.

If you do have external links to those 404 pages, you can 301 redirect them to an appropriate page using .htaccess. This way you'll keep the link value and also get rid of the Webmaster Tools error.

If you don't have any links to them, then yes, Google will eventually stop trying to crawl them.

scoutzie

It's very likely that we do. Given that I cannot track down a 1000+ links that now 404, will they eventually fall out by themselves, or do I have to tell Google that everything that's 404'ed should be dropped from crawl index? Thanks!

scoutzie

What if I simply pushed the new sitemap over the old one? In other words, scoutzie.com/sitemap is the same link, except now it contains the new map. That should be okay, right?

AlanMosley

you may still have links pointing to those 404 pages on your site or externally. If not then eventually they will fall out of the index

mihaiaperghis

Hey scoutzie,

This is actually covered pretty well in Joe Robison's blog post on fixing Webmaster Tools crawl errors: http://moz.com/blog/how-to-fix-crawl-errors-in-google-webmaster-tools

I'll quote the related info:

"One frustrating thing that Google does is it will continually crawl old sitemaps that you have since deleted to check that the sitemap and URLs are in fact dead. If you have an old sitemap that you have removed from Webmaster Tools, and you don’t want being crawled, make sure you let that sitemap 404 and that you are not redirecting the sitemap to your current sitemap."

Hope this helps, good luck!

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Why might Google be crawling via old sitemap, when the new one has been submitted and verified?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Site Crawl 4xx Errors?

Get into Google : New Sites

SEO on-demand crawl

Crawl test from tools

Unable to crawl pages

Not all pages are being crawled

Loss of Google AdWords API

Why does Crawl Diagnostics report this as duplicate content?