Sitemaps for Google
-
In Google Webmaster Central, if a URL in your sitemap is reported as 404 (Not Found), I'm assuming Google will automatically clean it up and that the next time we generate a sitemap, it won't include the 404 URL.
Is this true?
Do we need to comb through our sitemap files and remove the 404 pages Google finds, or will they "automagically" be cleaned up by Google's next crawl of our site?
-
Nice - thanks Kane. Cool Chrome tool too, thanks for the suggestion.
I'm in GWT every morning to check things out since our site is fairly large - about 220,000 pages. The sitemap checker is a really cool new feature in GWT too!
-
You should be monitoring them periodically. A quick scan monthly or quarterly, depending on how large your site is, should be sufficient. I've read reports within the last year that say Bing tends to be picky about sites with 404s in their sitemaps and will begin to distrust them if they get too messy. I am assuming that Google doesn't like it much, either.
Luckily, it's very easy to check a large number of links: download the Check My Links Chrome add-on here:
https://chrome.google.com/webstore/detail/ojkcdipcgfaekbeaelaapakgnjflfglf
Click that while looking at your sitemap, and it will test all of the URLs and highlight any errors in red. Then fix them.
On the other hand, GWT will report them within a few days of finding them. If you're already in GWT on a frequent basis, make that one more task to check on.
You should be fixing all 404s with 301 redirects. Read "How to Fix Crawl Errors in GWT" for any tricky ones that you can't figure out.
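If you'd rather script that sitemap check than click through it in a browser, here is a minimal Python sketch under a few assumptions: the sitemap is a standard sitemaps.org XML file, and the function names (`extract_urls`, `http_status`, `find_broken`) are made up for illustration, not part of any tool mentioned above. It pulls every `<loc>` URL out of the sitemap and lists the ones that return 404. Note that `urllib` follows redirects, so a URL that already 301s to a live page is judged by its final status.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return every <loc> URL found in a sitemap XML document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


def http_status(url):
    """Return the HTTP status code for url (hypothetical helper, HEAD request)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; the status code is on the exception.
        return err.code


def find_broken(urls, get_status=http_status):
    """Return the URLs whose status is 404."""
    return [url for url in urls if get_status(url) == 404]
```

To run it against a local copy, something like `find_broken(extract_urls(open("sitemap.xml", "rb").read()))` gives you the list of 404 URLs to redirect or drop from the sitemap. On a large site you'd want to add throttling and handle network failures (`urllib.error.URLError`), which this sketch leaves out.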
Related Questions
-
Getting 'Indexed, not submitted in sitemap' for around a third of my site. But these pages ARE in the sitemap we submitted.
As in the title, we have a site with around 40k pages, but around a third of them are showing as "Indexed, not submitted in sitemap" in Google Search Console. We've double-checked the sitemaps we have submitted and the URLs are definitely in the sitemap. Any idea why this might be happening? Example URL with the error: https://www.teacherstoyourhome.co.uk/german-tutor/Egham Sitemap it is located on: https://www.teacherstoyourhome.co.uk/sitemap-subject-locations-surrey.xml
Technical SEO | | TTYH0 -
Google ignoring the Title Tag?
Has anybody else seen this? We have a webpage with a slightly different title tag and H1. If you search for, let's say, "Renovatie", you get to see the title tag "De kostprijs van je renovatie". However, when you search for "Wat kost een renovatie", we see the H1 in the SERP, which is "Wat kost een renovatie". So is it normal that when you search for a term that exactly matches the H1, Google ignores the title tag? N.
Technical SEO | | nans0 -
Google Webmaster tools Sitemap submitted vs indexed vs Index Status
I'm having an odd error I'm trying to diagnose. Our Index Status is growing and is now up to 1,115. However, when I look at Sitemaps, we have 763 submitted but only 134 indexed. The submitted and indexed counts were virtually the same, around 750, until 15 days ago, when the indexed count dipped dramatically. Additionally, when I look under HTML Improvements I only find 3 duplicate pages, and I ran Screaming Frog on the site and got similar results: low duplicates. Our actual content should be around 950 pages, counting all the category pages. What's going on here?
Technical SEO | | K-WINTER0 -
Why is my blog disappearing from Google index?
My Google Blogger blog is about 10 months old. In that time I have worked really hard at adding unique content, building relationships with other bloggers in the same niche, and doing some inbound marketing. Two weeks ago I updated the template to something cleaner, with a little more of a "WordPress" feel to it. This means I've messed about with the code a lot in these weeks, adding social buttons etc. The problem is that from some point late last week (Thurs/Fri) my pages started disappearing from Google's index. I have checked Webmaster Tools and have no manual actions. My link profile is pretty clean as it's a new site, and I have manually checked every piece of content published for plagiarism etc. So what is going on? Did I break my blog? Or is something else amiss? Impressions are down 96% comparing Nov 1-5th to the previous 5 days. The site is here: http://bit.ly/174beVm Thanks in advance for any help.
Technical SEO | | Silkstream0 -
When do you use 'Fetch as Google' in Google Webmaster Tools?
Hi, I was wondering when and how often you use 'Fetch as Google' in Google Webmaster Tools, and whether you submit individual pages or the main URL only. I've googled it but only got more confused. I'd appreciate any help. Thanks
Technical SEO | | Rubix1 -
Why is Google not picking up my meta description? Google populates the description itself. How can I control these search snippets?
Why is Google not picking up my meta description? Google populates the description itself. How can I control these search snippets?
Technical SEO | | greyniumseo0 -
Unexplained spikes in Google Analytics
My site has modest traffic (50 unique visitors per day). In the past week, I've seen two unexplained spikes in my Google Analytics. Yesterday, there were 140 unique visitors, and these unique visitors each visited one unique page. This appears to be a bot of some sort. If this is a bot, why does Google Analytics think these are unique visitors? Is there a way for small sites to deal with this? Best, Christopher
Technical SEO | | ChristopherGlaeser0 -
Sitemaps - Format Issue
Hi, I have a little issue with a client site whose programmer seems kind of unwilling to change things that he has been doing for a long time. He has had this dynamic site set up for a few years and active in Google Webmaster Tools and others, but is not happy with the traffic it is getting. When I looked at Webmaster Tools I saw that he has a sitemap registered, but it is /sitemap.php. When I said that we should be offering the search engines /sitemap.xml, his response was that sitemap.php checks the site every day and generates /sitemap.xml, but there is no /sitemap.xml registered in Webmaster Tools. My gut is telling me that he should just register /sitemap.xml in Webmaster Tools, but it is a hard sell 🙂 Does anyone have any definitive experience of people doing this before, and whether it is an issue? My feeling is that it doesn't need to be rocket science... Any input appreciated, Sha
Technical SEO | | ShaMenz0