Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How important are sitemap errors?
-
If there aren't any crawling / indexing issues with your site, how important do thing sitemap errors are? Do you work to always fix all errors?
I know here: http://www.seomoz.org/blog/bings-duane-forrester-on-webmaster-tools-metrics-and-sitemap-quality-thresholds
Duane Forrester mentions that sites with many 302's 301's will be punished--does any one know Googe's take on this?
-
Very important. Particularly if you have a large site. We operate a large site with 100,000's of pages and as Dan said it can be difficult to maintain. We use something called Unlimited XML Sitemap Generator which builds XML sitemaps for us automatically. I'd highly recommend it although it takes a bit of fiddling with to get it up and running as it's software which sits on site. We couldn't manage without it as we'd be forever on sitemaps.
We found that getting sitemaps right on a large site made a huge difference to the crawl rate that we encountered in GWT and a huge indexation to follow.
In particular check for 302's. I made the mistake of leaving those for a while and am sure that we suffered from some loss of link equity along the way.
Hope it helps
Dawn
-
Your sitemap should only list pages that actually exist.
If you delete some pages, then you need to rebuild the sitemap.
Ditto if you delete them and redirect.
Google is always lagging, so if you delete 10 pages and then update the sitemap, even if google downloads the sitemap immediately, they will still be running crawls on the old map, and they may be crawling the now-missing pages, but haven't shown the failures in your WMT yet.
If you update your sitemap quickly, it is possible they will never crawl the missing pages and get a 404 or 301.
(but of course, there could be other sites pointing to the now-missing pages, and the 404s will show up elsewhere as missing)
I am always checking, adding, deleting and redirecting pages, and I update the current sitemap every hour and all the others are rebuilt at midnight every night. I usually do deletions just before midnight if I can, to minimize the time the sitemap is out of sync.
-
As far as I know Google is more lenient with sitemap errors, but I would still recommend looking into it. The first step would be to be sure your sitemap is up to date to begin with - and has all the URLs you want (and not any you don't want). The main thing is none of them should 404 and then beyond that, yes, they should return 200's.
Unless you're dealing with a gigantic site which might be hard to maintain, in theory there shouldn't be errors in sitemaps if you have the correct URLs in there.
Even better, if you're running WordPress the Yoast SEO plugin will generate an XML sitemap for you and it update automatically.
Hope that helps!
-Dan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does anyone know how to fix this structured data error on search console? Invalid value in field "itemtype"
I'm getting the same structured data error on search console form most of my websites, Invalid value in field "itemtype" I take off all the structured data but still having this problem, according to Search console is a syntax problem but I can't find what is causing this. Any guess, suggestion or solution for this?
Intermediate & Advanced SEO | | Alexanders0 -
How long after https migration that google shows in search console new sitemap being indexed?
We migrated 4 days ago to https and followed best practices..
Intermediate & Advanced SEO | | lcourse
In search console now still 80% of our sitemaps appear as "pending" and among those sitemaps that were processed only less than 1% of submitted pages appear as indexed? Is this normal ?
How long does it take for google to index pages from sitemap?
Before https migration nearly all our pages were indexed and I see in the crawler stats that google has crawled a number of pages each day after migration that corresponds to number of submitted pages in sitemap. Sitemap and crawler stats show no errors.0 -
[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console
I just opened my G Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. Mine is a Wordpress site (it's consistently updated too): Here's how they show up: Example 1: URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2 Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html Example 2 (this surprised me the most when I looked at the linked from data): URL: www.example.com/search/adult-site-keyword-2.html/page/3/ Linked From: www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site) http://a-spammy-adult-site.com/search/adult-site-keyword-2.html Example 3: URL: www.example.com/search/adult-site-keyword-3.html Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html How do I address this issue?
Intermediate & Advanced SEO | | rmehta10 -
URL structure change and xml sitemap
At the end of April we changed the url structure of most of our pages and 301 redirected the old pages to the new ones. The xml sitemaps were also updated at that point to reflect the new url structure. Since then Google has not indexed the new urls from our xml sitemaps and I am unsure of why. We are at 4 weeks since the change, so I would have thought they would have indexed the pages by now. Any ideas on what I should check to make sure pages are indexed?
Intermediate & Advanced SEO | | ang0 -
How to handle a blog subdomain on the main sitemap and robots file?
Hi, I have some confusion about how our blog subdomain is handled in our sitemap. We have our main website, example.com, and our blog, blog.example.com. Should we list the blog subdomain URL in our main sitemap? In other words, is listing a subdomain allowed in the root sitemap? What does the final structure look like in terms of the sitemap and robots file? Specifically: **example.com/sitemap.xml ** would I include a link to our blog subdomain (blog.example.com)? example.com/robots.xml would I include a link to BOTH our main sitemap and blog sitemap? blog.example.com/sitemap.xml would I include a link to our main website URL (even though it's not a subdomain)? blog.example.com/robots.xml does a subdomain need its own robots file? I'm a technical SEO and understand the mechanics of much of on-page SEO.... but for some reason I never found an answer to this specific question and I am wondering how the pros do it. I appreciate your help with this.
Intermediate & Advanced SEO | | seo.owl0 -
Images Returning 404 Error Codes. 301 Redirects?
We're working with a site that has gone through a lot of changes over the years - ownership, complete site redesigns, different platforms, etc. - and we are finding that there are both a lot of pages and individual images that are returning 404 error codes in the Moz crawls. We're doing 301 redirects for the pages, but what would the best course of action be for the images? The images obviously don't exist on the site anymore and are therefore returning the 404 error codes. Should we do a 301 redirect to another similar image that is on the site now or redirect the images to an actual page? Or is there another solution that I'm not considering (besides doing nothing)? We'll go through the site to make sure that there aren't any pages within the site that are still linking to those images, which is probably where the 404 errors are coming from. Based on feedback below it sounds like once we do that, leaving them alone is a good option.
Intermediate & Advanced SEO | | garrettkite0 -
Is it a problem to use a 301 redirect to a 404 error page, instead of serving directly a 404 page?
We are building URLs dynamically with apache rewrite.
Intermediate & Advanced SEO | | lcourse
When we detect that an URL is matching some valid patterns, we serve a script which then may detect that the combination of parameters in the URL does not exist. If this happens we produce a 301 redirect to another URL which serves a 404 error page, So my doubt is the following: Do I have to worry about not serving directly an 404, but redirecting (301) to a 404 page? Will this lead to the erroneous original URL staying longer in the google index than if I would serve directly a 404? Some context. It is a site with about 200.000 web pages and we have currently 90.000 404 errors reported in webmaster tools (even though only 600 detected last month).0 -
Tool to check XML sitemap
Hello, Can anyone help me finding a tool to have closer look of the XML sitemap? Tks in advance! PP
Intermediate & Advanced SEO | | PedroM0