Submitting an XML sitemap for a large website: how big should it be?
-
Hi there,
I’m currently researching how to generate an XML sitemap for a large website we run. We think that Google is having problems indexing our URLs, based on some of the messages we have been receiving in Webmaster Tools, which also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember reading a recommendation somewhere that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a sitemap should, if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps with the paid version of Screaming Frog that were almost 80,000 pages. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemap, month by month, something) or just make one big finger-to-Google type sitemap. lol
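For anyone who wants to script the "break it up" approach, here is a minimal sketch in Python. It assumes you already have a list of URLs (for example from a crawl export). The sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a site of around 80,000 pages needs at least two files plus a sitemap index that points at them. The function names, the `sitemap-N.xml` file naming, and the example.com URLs are all made up for illustration.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return the XML text of one <urlset> sitemap file."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return the XML text of a <sitemapindex> listing the sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemaps(urls, base_url):
    """Split `urls` into sitemap files; return ({filename: xml}, index_xml)."""
    files = {}
    for i, part in enumerate(chunk(urls), start=1):
        # Hypothetical naming scheme: sitemap-1.xml, sitemap-2.xml, ...
        files[f"sitemap-{i}.xml"] = build_urlset(part)
    index = build_index([f"{base_url}/{name}" for name in files])
    return files, index
```

You would then write each entry of `files` and the index to disk and submit only the index URL. Splitting by category or by month instead of by count works the same way: group the URLs first, then call `build_urlset` once per group.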
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure you're telling search engines which pages are duplicates of each other. Easy access to great content is fantastic, but you can devalue your own pages a lot if you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" link tag on your website.
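As a rough illustration of the rel="canonical" idea, here is a small Python sketch. It assumes a post can be reached under several tag URLs and that you keep a lookup from each of those URLs to the one version you want indexed. The URLs, the `CANONICAL_FOR` mapping, and the function names are hypothetical; a real CMS would normally derive the canonical URL from the post itself.

```python
from html import escape


def canonical_link(canonical_url):
    """Return the <link> element to place in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Hypothetical example: the same post reached through two tag listings.
CANONICAL_FOR = {
    "https://www.example.com/tag/seo/my-post":
        "https://www.example.com/posts/my-post",
    "https://www.example.com/tag/technical-seo/my-post":
        "https://www.example.com/posts/my-post",
}


def head_tag_for(requested_url):
    """Build the canonical tag for whichever URL the page was requested under."""
    # A URL with no mapping is treated as its own canonical.
    return canonical_link(CANONICAL_FOR.get(requested_url, requested_url))
```

Every duplicate version of the post then emits the same tag, so all of them point at the single URL you want indexed.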
To answer your questions:
- It should cover all URLs that you want indexed. Ideally, that would be every URL.
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- If you only add a new page about once a month, I wouldn't update the XML sitemap for each one. Instead, let the search engines find their own way in that case. Should your entire site structure change, an XML sitemap can be a great way to help search engines understand your new site setup better.
I hope this helps!