Sitemaps for Google
-
In Google Webmaster Central, if a URL is reported in your sitemap as 404 (Not Found), I'm assuming Google will automatically clean it up and that the next time we generate a sitemap, it won't include the 404 URL.
Is this true?
Do we need to comb through our sitemap files and remove the 404 pages Google finds, or will it "automagically" be cleaned up by Google's next crawl of our site?
-
Nice - thanks Kane. Cool Chrome tool too, thanks for the suggestion.
I'm in GWT every morning to check things out since our site is fairly large - about 220,000 pages. The sitemap checker is a really cool new feature in GWT too!
-
You should be monitoring them periodically. A quick scan monthly or quarterly depending on how large your site is should be sufficient. I've read reports within the last year that say Bing tends to be picky about sites with 404s in their sitemaps and will begin to distrust them if they get too messy. I am assuming that Google doesn't like it much, either.
Luckily, it's easy to check a large number of links at once: download the Check My Links Chrome add-on here:
https://chrome.google.com/webstore/detail/ojkcdipcgfaekbeaelaapakgnjflfglf
Click that while looking at your sitemap, and it will test all of the URLs and make any errors stick out in red. Then fix them.
On the other hand, GWT will report them within a few days of finding them. If you're already in your GWT on a frequent basis, make that one more task to check on.
You should be fixing all 404s with 301 redirects. Read "How to Fix Crawl Errors in GWT" for any tricky ones that you can't figure out.
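If you'd rather script the check than click through it in the browser, something like the following might work. It's only a rough sketch using the Python standard library, not what Check My Links or GWT actually do. The function names (extract_urls, http_status, find_not_found) are made up for this example, and it assumes your sitemap follows the standard sitemaps.org schema and that your server answers HEAD requests.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

# Namespace used by standard sitemap files (sitemaps.org protocol 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap XML document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the final HTTP status code.

    Assumption: the server responds to HEAD. Redirects are followed, so a
    URL that already 301s to a live page reports the status of its target.
    Network-level failures (DNS, timeouts) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is carried on the error.
        return err.code


def find_not_found(
    urls: Iterable[str],
    status_of: Callable[[str], int] = http_status,
) -> list[str]:
    """Return the URLs whose status is 404, in their original order."""
    return [url for url in urls if status_of(url) == 404]
```

To use it, read your sitemap file into a string, pass it to extract_urls, and hand the result to find_not_found; whatever comes back is your list of pages to redirect or drop from the sitemap. It checks one URL at a time, so on a site with a couple hundred thousand pages you'd want to run it per sitemap file or add some concurrency.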