Submitting XML Sitemap for large website: how big?
-
Hi there,
I’m currently researching how I can generate an XML sitemap for a large website we run. We think that Google is having problems indexing the URLs based on some of the messages we have been receiving in Webmaster tools, which also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember somewhere reading a recommendation that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a site map should , if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps with the paid version of Screaming Frog that were almost 80,000 pages. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemap, month by month, something.) or just make one big finger to Google type sitemap. lol
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure that you're applying means to indicate duplicate pages as such to search engines. Easy access to great content is fantastic, but you can devaluate your own pages a lot when you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" tag in your website.
To answer your questions:
- It should cover all URLs that want indexed. Ideally, that would be every URL
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- I wouldn't update an XML sitemap for every new page you make once a month. Instead, let the search engine find their own way in that case. Should your entire site structure change, an XML sitemap can be a great way to help search engine understand your new site setup better.
I hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Big retailers and duplicate content
Hello there! I was wondering if you guys have experience with big retailers sites fetching data via API (PDP content etc.) from another domain which is also sharing the same data with other multiple sites. If each retailer has thousands on products, optimizing PDP content (even in batches) is quite of a cumbersome task and rel="canonical" pointing to original domain will dilute the value. How would you approach this type of scenario? Looking forward to read your suggestions/experiences Thanks a lot! Best Sara
Intermediate & Advanced SEO | | SaraCoppola1 -
Big drop in bounce rate suddenly
Our site had a huge drop in bounce rate in one day (went from 10% to 3%) and it has stayed that way. What could cause this? P4bYpNF.png
Intermediate & Advanced SEO | | navidash0 -
To include in Sitemap or not to include?
Hello all, A bit of a confusing one but please bear with me... On our website we have a Used Cars section where each morning a feed is loaded onto our site with any changes to the stock. Some cars may have been sold and removed, some new cars may be added, some prices may be changed, every day every morning this very large section of our website is updated. The question I have is, should I be including these urls in my sitemap? The Used Cars section is a huge portion of our website content and is our most important area, the Used Cars overview is our most frequently visited page. The reason I ask is because of course Google might crawl and see car X, but tomorrow car X could be gone and be replaced with car Y. Should I be even mentioning these pages to Google if by tomorrow some of those urls could be gone? It's always changing and it's something we don't have control of. Thanks!
Intermediate & Advanced SEO | | HB170 -
Construction website
Hi, I have a construction website that is aimed at tradesmen. There are 2 goals of the site: 1. To allow potential customers to sign up for a trade account. 2. To allow existing customers to access to products and login to their account to make an order. The site is full of categories and products which should be indexed so we rank for these trade products. The homepage redesign is where i am having an issue: Currently the site is set up like a standard retail site but without prices, which are viewable only when logged in. The homepage is designed such that there is several call to actions about promotions, services and to apply for a trade account, that apply to both existing and potential customers. At the moment there is a poor conversion to get potential customers to apply for a trade account. This is because there is too much distraction away from this goal and they are allowed to engage other areas of the site freely. The main purpose of the homepage should be to encourage potential customers to sign up. The secondary purpose to for existing customers to access the accounts and products. I believe potential customers should not be exposed to the categories and products as it is a distraction from the primary goal. Potential customers, i.e. Tradesmen, would already have a certain understanding of the types of products we provide, so I don't feel it is necessary to allow them to crawl the rest of the site unless they have an account. What are your thoughts on that? Here is my lack of understanding: On the homepage, if I restrict access to categories and products to existing account holders only, where a login is required to proceed, would that mean Google cannot access these pages to index them? Or is this only controlled by NoFollows & Robots.txt? Obviously not indexing is undesirable. I do understand potential customers will need some information about our range of products but the idea is to coerce them to sign up for an account so they can see this information. The more information that is provided to a potential customer, the higher the probability a person can make a decision against applying for an account. Restricting access creates a motivator to reveal information and we capture their data to converse with them personally. This increases the probability of us being able to retain their interest by providing a customised service based on their needs. All of this I feel makes perfect sense to me, the only query/obstacle I have is the indexing of the site. If Google cannot index pages that are restricted by account access, then I would like suggestions to solve/compromise/optimise the above. Just to address the desired behaviour of index pages. If in search a our product page appears, the person clicking the link would either be redirected or exposed to a login or sign up screen to view. Thank you so much for your help. Antonio
Intermediate & Advanced SEO | | AVSFencingSupplies0 -
New Website Launch - Traffic Way Down
We launched a new website in June. Traffic plummeted after the launch, we crept back up for a couple of months, but now we are flat, nowhere near our pre-launch traffic or previous year's traffic. For the past 6 months our analytics have been worrying us - Overall traffic and new visitor traffic is down over 10%, bounce rate is up almost 35% since site launched, keywords aren't ranking where they used to, and of course, web sales are down. Is this supposed to happen when a new site is launched, and how long does a new this transition last? We have done all the technical audits, adding relevant content, we're at a loss. Any suggestions where to look next to improve traffic to pre-launch numbers?
Intermediate & Advanced SEO | | WaySEO0 -
Optimising a Dynamic website ?
A client has bought the Nostalgia wp theme. I've installed Yoast but because the website is ajax based and the content for the pages are dynamically loaded the plugin won't work. Or at least not to my knowledge? The developer doesn't currently have a solution, which from previous expereience it will never be supported. So I need some possible solutions here. Create a mobile site? Cons more time, more money etc Create non dynamic pages linked in footer area. Cons page duplication etc. It's a small niche so having the basic elements is imperative to getting it ranking.
Intermediate & Advanced SEO | | StephenForde0 -
Sitemaps / Google Indexing / Submitted
We just submitted a new sitemap to google for our new rails app - http://www.thesquarefoot.com/sitemap.xml Which has over 1,400 pages, however Google is only seeing 114. About 1,200 are in the listings folder / 250 blog posts / and 15 landing pages. Any help would be appreciated! Aron sitemap.png
Intermediate & Advanced SEO | | TheSquareFoot0 -
Website Ranking Issue
Hey All My question is specfic to a particular website. The category of the website is Kitchen Appliances. The keyword is extremely competitive. The website I am currently optimizing has loads of products and many pages as well. I am constantly building links from industry specific websites for the website as well as composing articles and leading the users back to the website with keyword rich anchor text. I have been doing this for around 3 months and I do not see the website in the first 30 pages of the SERP (for the keyword kitchen appliances - the site is a page rank 2 BTW). No bugs reported as well in Webmaster tools. My next step is to add these articles to the website (www.example.com/KitchenAppliances ) with keyword rich metatags as well as content with internal links to my product pages. I also plan on sending traffic to these pages to build the pages link popularity. Do you think I can expect better results for the article pages than my original website product pages or do you think I should continue with the link building activity I was performing originally for the website. regards Ryan
Intermediate & Advanced SEO | | SEO5Team2