Sitemaps for Google
-
In Google Webmaster Central, if a URL is reported in your site map as 404 (Not found), I'm assuming Google will automatically clean it up and that the next time we generate a sitemap, it won't include the 404 URL.
Is this true?
Do we need to comb through our sitemap files and remove the 404 pages Google finds, our will it "automagically" be cleaned up by Google's next crawl of our site?
-
Nice - thanks Kane. Cool Chrome tool too, thanks for the suggestion.
I'm in GWT every morning to check things out since our site is fairly large - about 220,000 pages. The sitemap checker is a really cool new feature in GWT too!
-
You should be monitoring them periodically. A quick scan monthly or quarterly depending on how large your site is should be sufficient. I've read reports within the last year that say Bing tends to be picky about sites with 404s in their sitemaps and will begin to distrust them if they get too messy. I am assuming that Google doesn't like it much, either.
It's very easy to check a large number of links easily, luckily: download the Check My Links Chrome Add-on here:
https://chrome.google.com/webstore/detail/ojkcdipcgfaekbeaelaapakgnjflfglf
Click that while looking at your sitemap, and it will test all of the URLs and make any errors stick out in red. Then fix them.
On the other hand, GWT will report them within a few days of finding them. If you're already in your GWT on a frequent basis, make that one more task to check on.
You should be fixing all 404s with 301 directs. Read "How to Fix Crawl Errors in GWT" for any tricky ones that you can't figure out.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Which Sitemap to keep - Http or https (or both)
Hi, Just finished upgrading my site to the ssl version (like so many other webmasters now that it may be a ranking factor). FIxed all links, CDN links are now secure, etc and 301 Redirected all pages from http to https. Changed property in Google Analytics from http to https and added https version in Webmaster Tools. So far, so good. Now the question is should I add the https version of the sitemap in the new HTTPS site in webmasters or retain the existing http one? Ideally switching over completely to https version by adding a new sitemap would make more sense as the http version of the sitemap would anyways now be re-directed to HTTPS. But the last thing i can is to get penalized for duplicate content. Could you please suggest as I am still a rookie in this department. If I should add the https sitemap version in the new site, should i delete the old http one or no harm retaining it.
Technical SEO | | ashishb010 -
Pages to be indexed in Google
Hi, We have 70K posts in our site but Google has scanned 500K pages and these extra pages are category pages or User profile pages. Each category has a page and each user has a page. When we have 90K users so Google has indexed 90K pages of users alone. My question is. Should we leave it as they are or should we block them from being indexed? As we get unwanted landings to the pages and huge bounce rate. If we need to remove what needs to be done? Robots block or Noindex/Nofollow Regards
Technical SEO | | mtthompsons0 -
Is a Rel="cacnonical" page bad for a google xml sitemap
Back in March 2011 this conversation happened. Rand: You don't want rel=canonicals. Duane: Only end state URL. That's the only thing I want in a sitemap.xml. We have a very tight threshold on how clean your sitemap needs to be. When people are learning about how to build sitemaps, it's really critical that they understand that this isn't something that you do once and forget about. This is an ongoing maintenance item, and it has a big impact on how Bing views your website. What we want is end state URLs and we want hyper-clean. We want only a couple of percentage points of error. Is this the same with Google?
Technical SEO | | DoRM0 -
Does Google index has expiration?
Hi, I have this in mind and I think you can help me. Suppose that I have a pagin something like this: www.mysite.com/politics where I have a list of the current month news. Great, everytime the bot check this url, index the links that are there. What happens next month, all that link are not visible anymore by the user unless he search in a search box or google. Does google keep those links? The current month google check that those links are there, but next month are not, but they are alive. So, my question is, Does google keep this links for ever if they are alive but nowhere in the site (the bot not find them anymore but they work)? Thanks
Technical SEO | | informatica8100 -
Ranking on google.com.au but not google.com
Hi there, we (www.refundfx.com.au) rank on google.com.au for some keywords that we target, but we do not rank at all on google.com, is that because we only use a .com.au domain and not a .com domain? We are an Australian company but our customers come from all over the world so we don't want to miss out on the google.com searches. Any help in this regard is appreciated. Thanks.
Technical SEO | | RefundFX0 -
Google Alerts News Images
I have a Google Alert setup which is pulling information from a blog. I am receiving images as part of the alert. The issue that I am having is that the images have nothing to do with the blog post. Is there a way to control what images are received in the alert. From what I have gathered, if it grabs an image it should be part of the blog post.
Technical SEO | | ricknakao0 -
Does Google Bot accept Cookies
I am working with a per page results refinement that stores a cookie on the users computer and then keeps that same per page as the user goes around the site. I was just wondering if that was true for Google bot or Bing bot as well. Will they keep the cookie or would they not be able to accept it. I just want to know as I dont want different urls created if they can keep the cookie. Thanks!
Technical SEO | | Gordian0 -
Google Places and Name Change
Hello - I have a client who is a realtor and changed agencies. I edited their Google Places entry and the new name of their agency and address are showing - but so is their old listing. The agency they left is now trying to sue them for showing up in a number one position with Google Places under their agency name. Is this an indexing issue with Google? Their name shows up under both agency names. The corrected one shows most often, but the old one is still popping up on occasion. Thanks,
Technical SEO | | seoessentials1