Sitemaps for Google
-
In Google Webmaster Central, if a URL is reported in your site map as 404 (Not found), I'm assuming Google will automatically clean it up and that the next time we generate a sitemap, it won't include the 404 URL.
Is this true?
Do we need to comb through our sitemap files and remove the 404 pages Google finds, our will it "automagically" be cleaned up by Google's next crawl of our site?
-
Nice - thanks Kane. Cool Chrome tool too, thanks for the suggestion.
I'm in GWT every morning to check things out since our site is fairly large - about 220,000 pages. The sitemap checker is a really cool new feature in GWT too!
-
You should be monitoring them periodically. A quick scan monthly or quarterly depending on how large your site is should be sufficient. I've read reports within the last year that say Bing tends to be picky about sites with 404s in their sitemaps and will begin to distrust them if they get too messy. I am assuming that Google doesn't like it much, either.
It's very easy to check a large number of links easily, luckily: download the Check My Links Chrome Add-on here:
https://chrome.google.com/webstore/detail/ojkcdipcgfaekbeaelaapakgnjflfglf
Click that while looking at your sitemap, and it will test all of the URLs and make any errors stick out in red. Then fix them.
On the other hand, GWT will report them within a few days of finding them. If you're already in your GWT on a frequent basis, make that one more task to check on.
You should be fixing all 404s with 301 directs. Read "How to Fix Crawl Errors in GWT" for any tricky ones that you can't figure out.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google crawl drop
the crawl request of my company site: https://www.dhgate.com/ has dropped nearly over 95%, from daily 6463599 requests to 476493 requests at 12:00am on 9th, Oct (GMT+8). This dramatic dropping trend not only showed in our GSC crawl stats report but also our company's own log report. We have no idea what’s going on. We want to know whether there is an update of google about crawlling, or is this the issue of our own site? If something is wrong with our site, in what aspects would you recommend us to check, analyze and accordingly optimize?
Technical SEO | | DHgate_20140 -
Image Sitemap
I currently use a program to create our sitemap (xml). It doesn't offer creating an mage sitemaps. Can someone suggest a program that would create an image sitemap? Thanks.
Technical SEO | | Kdruckenbrod0 -
Sitemap duplicate title
At the moment we have a html sitemap which is pulling the same h1's/ titles. How big a problem is the duplicate content issue which is medium priority in the moz pro softaware? Would you recommend changes as sitemap page 1 - page 2 etc. Thanks
Technical SEO | | VUK-SEO0 -
Multiple Sitemaps
Hello everyone! I am in the process of updating the sitemap of an ecommerce website and I was thinking to upload three different sitemaps for different part (general/categories and subcategories/productgroups and products) of the site in order to keep them easy to update in the future. Am I allowed to do so? would that be a good idea? Open to suggestion 🙂
Technical SEO | | PremioOscar0 -
Getting Recrawled by Google
I have been updating my site a lot and some of the updates are showing up in Google and some are not. Is there a best practice in getting your site fully recrawled by Google?
Technical SEO | | ShootTokyo0 -
Site being indexed by Google before it has launched
We are currently coming towards the end of migrating one of our retail sites over to magento. To our horror, we find out today that some pages are already being indexed by Google, and we have started receiving orders through new site. Do you have any suggestions for what may have caused this? Or similarly, what the best solution would be to de-index ourselves? We most recently excluded anything with a certain parameter from robots.txt - could this being implemented incorrectly have caused this issue? Thanks
Technical SEO | | Sayers0 -
Is this against google rules
Hi i am wanting to know if this is against google rules. I am building a website which will have lots of different sections and i wanted to know if you were allowed to have a new domain name pointing to a section of the site. so for example if i had a site with a domain name of manchester and then i wanted a section of the site to be called www.manchester.com/complimentary health I want to know if to help with traffic to the site and to have a better domain name, if it was allowed to have a new domain name pointing to that section of the site which could be called www.complimentaryhealth.com and have that pointing to the section. would love to hear your thoughts on this
Technical SEO | | ClaireH-1848860 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0