How important are sitemap errors?
-
If there aren't any crawling / indexing issues with your site, how important do thing sitemap errors are? Do you work to always fix all errors?
I know here: http://www.seomoz.org/blog/bings-duane-forrester-on-webmaster-tools-metrics-and-sitemap-quality-thresholds
Duane Forrester mentions that sites with many 302's 301's will be punished--does any one know Googe's take on this?
-
Very important. Particularly if you have a large site. We operate a large site with 100,000's of pages and as Dan said it can be difficult to maintain. We use something called Unlimited XML Sitemap Generator which builds XML sitemaps for us automatically. I'd highly recommend it although it takes a bit of fiddling with to get it up and running as it's software which sits on site. We couldn't manage without it as we'd be forever on sitemaps.
We found that getting sitemaps right on a large site made a huge difference to the crawl rate that we encountered in GWT and a huge indexation to follow.
In particular check for 302's. I made the mistake of leaving those for a while and am sure that we suffered from some loss of link equity along the way.
Hope it helps
Dawn
-
Your sitemap should only list pages that actually exist.
If you delete some pages, then you need to rebuild the sitemap.
Ditto if you delete them and redirect.
Google is always lagging, so if you delete 10 pages and then update the sitemap, even if google downloads the sitemap immediately, they will still be running crawls on the old map, and they may be crawling the now-missing pages, but haven't shown the failures in your WMT yet.
If you update your sitemap quickly, it is possible they will never crawl the missing pages and get a 404 or 301.
(but of course, there could be other sites pointing to the now-missing pages, and the 404s will show up elsewhere as missing)
I am always checking, adding, deleting and redirecting pages, and I update the current sitemap every hour and all the others are rebuilt at midnight every night. I usually do deletions just before midnight if I can, to minimize the time the sitemap is out of sync.
-
As far as I know Google is more lenient with sitemap errors, but I would still recommend looking into it. The first step would be to be sure your sitemap is up to date to begin with - and has all the URLs you want (and not any you don't want). The main thing is none of them should 404 and then beyond that, yes, they should return 200's.
Unless you're dealing with a gigantic site which might be hard to maintain, in theory there shouldn't be errors in sitemaps if you have the correct URLs in there.
Even better, if you're running WordPress the Yoast SEO plugin will generate an XML sitemap for you and it update automatically.
Hope that helps!
-Dan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
LinkedIn 999 HTTP Errors
I am working on a website, https://linkedinforbusiness.net and a ton of 999 HTTP errors have only now surfaced. I would venture from reading the "Request denied" error in the log, LinkedIn means to block BLCs attempts to check those links. It might be because the site has a lot of LinkedIn links; maybe they find it suspicious that the server is sending a lot of requests for their links. How do you see this? Any fixes? What level of harm do you think it brings to the site? I have removed all similar links to LinkedIn from my site to avoid this (https://www.hillwebcreations.com). However, this isn't so easily done for LinkedIn For Business, as her work in all about helping businesses and individuals optimize their use of LinkedIn.
Intermediate & Advanced SEO | | jessential0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
Need some help/input about my Joomla sitemap created by XMap
Here is my current sitemap for my site http://www.yakangler.com/index.php?option=com_xmap&view=xml&tmpl=component&id=1 I have some questions about it's current settings. I have a component called JReviews that xmap produces a separate link for each category. ex: http://www.yakangler.com/fishing-kayak-review/265-2013-hobie-mirage-adventure-island 2014-09-03T20:46:25Z monthly 0.4 http://www.yakangler.com/fishing-kayak-review/266-2012-wilderness-systems-tarpon-140 2014-06-03T15:49:00Z monthly 0.4
Intermediate & Advanced SEO | | mr_w
http://www.yakangler.com/fishing-kayak-review/343-wilderness-systems-tarpon-120-ultralite 2013-11-25T06:39:05Z monthly 0.4 Where as my other articles are only linked by the content category. ex: http://www.yakangler.com/news monthly 0.4
http://www.yakangler.com/tournaments monthly 0.4
http://www.yakangler.com/kayak-events monthly 0.4
http://www.yakangler.com/spotlight monthly 0.4 Which option is better?0 -
Importance of having a tightly themed sight and domain for ranking and SEO
I started the site 10 years ago as www.islesurfboards.com selling mainly surfboards and ranking mainly for surfboards, paddle boards came along and now paddle boards make up for 95% of all the business and we are missing alot of ranking in the paddle board related keywords so what is the best course of action? my plans: keep www.islesurfboards.com and keep it surfboards focused, create a new domain www.islepaddleboards.com and move all paddle board related content products etc over to this domain with redirects to transfer the link juice. Doing this will still keep my surfboard site and all its long term domain credibility and i can offer a link over the the www.islepaddleboards.com site for people looking to buy paddle boards and vice versa on the new paddle board site for people looking to shop surfboards. Would this be the best course of action or does can anyone offer any better suggestions. I know google supposodly has taken off much ranking emphasis of the domain but as i pick apart the competition who rank welll in the paddle board space they all have "paddleboards" in the domains and a paddle board specific site to keep it tightly themed which pays off across the board in content, ppc campaigns, and overall ease of use as surfboards and paddle boards are two seperate products and paddle boards is very hot right now so i dont want to stay commited to www.islesurfboards.com domain if its going to create confusion or not help me rank well for paddle boards leading into the future. Any ideas? Thoughts on the best route to take?
Intermediate & Advanced SEO | | isle_surf0 -
Is Sitemap Issue Causing Duplicate Content & Unindexed Pages on Google?
On July 10th my site was migrated from Drupal to Google. The site contains approximately 400 pages. 301 permanent redirects were used. The site contains maybe 50 pages of new content. Many of the new pages have not been indexed and many pages show as duplicate content. Is it possible that there is a site map issue that is causing this problem? My developer believes the map is formatted correctly, but I am not convinced. The sitemap address is http://www.nyc-officespace-leader.com/page-sitemap.xml [^] I am completely non technical so if anyone could take a brief look I would appreciate it immensely. Thanks,
Intermediate & Advanced SEO | | Kingalan1
Alan | |0 -
How do I best deal with pages returning 404 errors as they contain links from other sites?
I have over 750 URL's returning 404 errors. The majority of these pages have back links from sites, however the credibility of these pages from what I can see is somewhat dubious, mainly forums and sites with low DA & PA. It has been suggested placing 301 redirects from these pages, a nice easy solution, however I am concerned that we could do more harm than good to our sites credibility and link building strategy going into 2013. I don't want to redirect these pages if its going to cause a panda/penguin problem. Could I request manual removal or something of this nature? Thoughts appreciated.
Intermediate & Advanced SEO | | Towelsrus0 -
Question regarding error url while checking in open-site explorer tool
Hello friends, My website home url & inner page url shows error while checking the open-site site explorer tool from SEOMoz, for a website** eg.www.abc.com website as**
Intermediate & Advanced SEO | | zco_seo
**"Oh Hey! It looks like that URL redirects to www.abc.com/error.aspx?aspxerrorpath=/default.aspx. Would you like to see data for that URL instead?"**May I know the reason, why this url showing this result while checking back link report from the tool?**May I know on what basis this tool is evaluating the website url as well?****May I know, Will this affect the Google SERPs for this website?**Thanks0 -
Should the sitemap include just menu pages or all pages site wide?
I have a Drupal site that utilizes Solr, with 10 menu pages and about 4,000 pages of content. Redoing a few things and we'll need to revamp the sitemap. Typically I'd jam all pages into a single sitemap and that's it, but post-Panda, should I do anything different?
Intermediate & Advanced SEO | | EricPacifico0