Submitting XML Sitemap for large website: how big?
-
Hi there,
I’m currently researching how I can generate an XML sitemap for a large website we run. We think that Google is having problems indexing the URLs based on some of the messages we have been receiving in Webmaster tools, which also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember somewhere reading a recommendation that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a site map should , if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps with the paid version of Screaming Frog that were almost 80,000 pages. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemap, month by month, something.) or just make one big finger to Google type sitemap. lol
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure that you're applying means to indicate duplicate pages as such to search engines. Easy access to great content is fantastic, but you can devaluate your own pages a lot when you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" tag in your website.
To answer your questions:
- It should cover all URLs that want indexed. Ideally, that would be every URL
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- I wouldn't update an XML sitemap for every new page you make once a month. Instead, let the search engine find their own way in that case. Should your entire site structure change, an XML sitemap can be a great way to help search engine understand your new site setup better.
I hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Temporarily redirecting a small website to a specific url of another website
Hi, I would like to redirect a small website that contains info about a specific project temporarily to a specific url about this project on my main website. Reason for this is that the small website doesn't contain accurate info anymore. We will adapt the content in the next few weeks and then remove the redirect again. Should I set up a 301 or a 302? Thanks
Intermediate & Advanced SEO | | Mat_C1 -
Which search engines should we submit our sitemap to?
Other than Google and Bing, which search engines should we submit our sitemap to?
Intermediate & Advanced SEO | | NicheSocial0 -
Website dropped out from Google index
Howdy, fellow mozzers. I got approached by my friend - their website is https://www.hauteheadquarters.com She is saying that they dropped from google index over night - and, as you can see if you google their name, website url or even site: , most of the pages are not indexed. Home page is nowhere to be found - that's for sure. I know that they were indexed before. Google webmaster tools don't have any manual actions (at least yet). No sudden changes in content or backlink profile. robots.txt has some weird rule - disallow everything for EtaoSpider. I don't know if google would listen to that - robots checker in GWT says it's all good. Any ideas why that happen? Any ideas what I should check? P.S. Just noticed in GWT there was a huge drop in indexed pages within first week of August. Still no idea why though. P.P.S. Just noticed that there is noindex x-robots-tag in headers... Anyone knows where this can be set?
Intermediate & Advanced SEO | | DmitriiK0 -
How Submit to different Countries
Hey There, One of My client Needs To rank his site in multiple Location , like He is in Australia But wants To Promote his Site in Canada, india, USA,So What are the things i can do For Rank in Others Country. Please any Expert Help. Thanx
Intermediate & Advanced SEO | | nupuriepl0 -
Bing seriously hitting our website
Hi I have a strange query, Bing in the last four weeks have been seriously crawling our site, and while at the moment this isn't affecting our servers, coming into the busy time of the year I am just wondering if anyone else is seeing this. They are basically crawling the entire site (including nofollow pages) even though according to Bing Webmaster tools, they know our sitemap. Is anybody else seeing any unusual activity with the Bing bot. Thanks Andy
Intermediate & Advanced SEO | | Andy-Halliday0 -
Proper sitemap update frequency
I have 12 sitemaps submitted to Google. After about a week, Google is about 50% of the way through crawling each one. In the past week I've created many more pages. Should I wait until Google is 100% complete with my original sitemaps or can I just go ahead and refresh them? When I refresh the original files will have different URLs.
Intermediate & Advanced SEO | | jcgoodrich0 -
Domain and Sitemap Question
Hi - I am hoping you can help me with this issue we are currently trying to solve. We are hosting our mobile site's content on a different domain than what the URL of the site is, though owned by same company. In Google Webmasters tool we have the mobile sitemap under "sitemaps.xyz.com", however the URL of the site is "m.xyz.com". We have submitted 60MM pages in the mobile sitemap, but only 1MM pages have been indexed. Do you think this set up causes confusion with the bots? Does this affect the crawlability of the site? Any thoughts would be greatly appreciated. Thank you!
Intermediate & Advanced SEO | | ladylana
Eva0 -
Sitemaps / Google Indexing / Submitted
We just submitted a new sitemap to google for our new rails app - http://www.thesquarefoot.com/sitemap.xml Which has over 1,400 pages, however Google is only seeing 114. About 1,200 are in the listings folder / 250 blog posts / and 15 landing pages. Any help would be appreciated! Aron sitemap.png
Intermediate & Advanced SEO | | TheSquareFoot0