Submitting XML Sitemap for large website: how big?
-
Hi there,
I’m currently researching how to generate an XML sitemap for a large website we run. Based on some of the messages we have been receiving in Webmaster Tools, we think Google is having problems indexing the URLs. Webmaster Tools also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember reading a recommendation somewhere that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a sitemap should, if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps with the paid version of Screaming Frog that were almost 80,000 pages. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemaps, month by month, something) or just make one big finger-to-Google type sitemap. lol
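For anyone who wants to see what "break it up" can look like: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, and larger sites list several sitemap files in a sitemap index. Below is a minimal Python sketch of that split, not a definitive implementation. The function names, the `sitemap-N.xml` file naming, and the `https://www.example.com` base URL are assumptions made up for illustration.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk(urls, size):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return the XML text of one sitemap file listing `urls`."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemaps(urls, base="https://www.example.com", size=MAX_URLS_PER_FILE):
    """Split `urls` into sitemap files; return (index_xml, {filename: xml}).

    `base` and the file naming scheme are hypothetical placeholders.
    """
    files = {}
    for number, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{number}.xml"] = build_urlset(part)
    index_xml = build_index([f"{base}/{name}" for name in files])
    return index_xml, files
```

You would write each entry of `files` to disk, publish `index_xml` as the index file, and submit only the index. Chunking by category or by month instead of by count works the same way, as long as each file stays under the limit.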
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure that you're telling search engines which pages are duplicates. Easy access to great content is fantastic, but you can devalue your own pages a lot if you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" tag on your website.
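As a rough illustration of the rel="canonical" idea, here is a small Python sketch that builds the tag for a page. It assumes the duplicate variants of a post differ only by query string or fragment, which may not match how your site builds its tag URLs. The function names and the example.com URL are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Drop the query string and fragment so URL variants collapse to one address."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url):
    """Return the <link> element to place in the <head> of every variant of the page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# Hypothetical post reached through a tag filter:
print(canonical_link_tag("https://www.example.com/posts/sitemap-tips?tag=seo#comments"))
```

Every variant of the post then points at the same address, and that canonical address is the one to list in the sitemap.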
To answer your questions:
- It should cover all URLs that you want indexed. Ideally, that would be every URL.
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- If you only add a new page about once a month, I wouldn't update the XML sitemap for every new page. In that case, let the search engines find the new pages on their own. Should your entire site structure change, an XML sitemap can be a great way to help search engines understand your new site setup better.
I hope this helps!