Submitting XML Sitemap for large website: how big?
-
Hi there,
I’m currently researching how to generate an XML sitemap for a large website we run. We think Google is having problems indexing the URLs, based on some of the messages we have been receiving in Webmaster Tools, which also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember reading somewhere a recommendation that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a sitemap should, if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps with the paid version of Screaming Frog that were almost 80,000 pages. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemap, month by month, something.) or just make one big finger to Google type sitemap. lol
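For anyone wanting to script the "break it up" approach, here is a rough Python sketch, not a definitive implementation. It relies on the sitemaps.org protocol limit of 50,000 URLs per sitemap file and splits a URL list into several files plus a sitemap index that points at them. The function names and the example.com URLs are made up for illustration.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return the XML text of one sitemap file listing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL list; in practice this would come from a crawl or the CMS.
    all_urls = [f"https://www.example.com/post/{i}" for i in range(120_000)]
    files = [build_urlset(part) for part in chunk(all_urls)]
    index = build_index(
        f"https://www.example.com/sitemap-{n}.xml"
        for n in range(1, len(files) + 1)
    )
    print(len(files), "sitemap files plus one index")
```

Splitting by count is the simplest option. Grouping by category or by month, as suggested above, works the same way: build one URL list per group and pass each to `build_urlset`.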
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure that you're telling search engines which pages are duplicates. Easy access to great content is fantastic, but you can devalue your own pages a lot if you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" tag on your website.
To answer your questions:
- It should cover all URLs that you want indexed. Ideally, that would be every URL.
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- I wouldn't update an XML sitemap for every new page if you only add one about once a month. In that case, let the search engines find their own way. Should your entire site structure change, an XML sitemap can be a great way to help search engines understand your new site setup better.
I hope this helps!
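To tie the canonical advice to the sitemap itself: one way to keep duplicate routes (for example the same post reached through several tags) out of the sitemap is to list only each page's canonical URL. Below is a minimal Python sketch under stated assumptions: it assumes you already have pairs of (crawled URL, canonical URL) from a crawl, and the function name and example URLs are hypothetical.

```python
def canonical_urls(pages):
    """Return unique canonical URLs in first-seen order.

    `pages` is an iterable of (url, canonical_url) pairs. A missing
    canonical (None or empty string) means the page is treated as its
    own canonical.
    """
    seen = set()
    result = []
    for url, canonical in pages:
        target = canonical or url
        if target not in seen:
            seen.add(target)
            result.append(target)
    return result


if __name__ == "__main__":
    # Hypothetical crawl output: the same post reached through two tag routes.
    crawled = [
        ("https://www.example.com/tag/seo/post-1", "https://www.example.com/post-1"),
        ("https://www.example.com/tag/technical-seo/post-1", "https://www.example.com/post-1"),
        ("https://www.example.com/post-2", None),
    ]
    for url in canonical_urls(crawled):
        print(url)
```

The resulting list is what would go into the sitemap, so each post appears once regardless of how many tag pages link to it.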