XML sitemap advice for a website with over 100,000 articles
-
Hi,
I have read numerous articles that support submitting multiple XML sitemaps for websites that have thousands of articles... in our case we have over 100,000. So, I was thinking I should submit one sitemap for each news category.
My question is how many page levels should each sitemap instruct the spiders to go? Would it not be enough to just submit the top level URL for each category and then let the spiders follow the rest of the links organically?
So, if I have 12 categories, the total number of URLs will be 12?
If this is true, how do you suggest handling our home page, where the latest articles are displayed regardless of their category? The spiders will find links to a given article both on the home page and in the category it belongs to. We are using canonical tags.
Thanks,
Jarrett
-
It's really a process of experimenting over time to find the method that gets the most URLs indexed and, in turn, brings the most relevant traffic. Personally, I wouldn't have one sitemap for each category, but without tests there's no conclusive evidence either way.
-
Thanks for the tip... I will do that.
I'm still unsure if I really need to submit a sitemap with thousands of URLs. I was thinking I should create a sitemap index file that points to individual top-level category sitemaps and leave it at that. If I do this, though, I suppose I don't need individual sitemaps per category, as I will just insert the category URLs in the root sitemap. What do you think?
-
To add to Corey's response, I'll repeat what I just provided on another question here on Pro Q&A. Sitemap.xml files can handle a maximum of 50,000 URLs; however, I've seen them choke with as few as 10,000. It's important to run them through a tool like tools.pingdom.com to ensure they load within just a couple of seconds.
Then submit them through the Google and Bing webmaster tools and check whether they succeed in crawling all of them.
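As a rough illustration of staying under that per-file limit, here is a minimal Python sketch that splits a long list of article URLs into several sitemap documents. The example.com URLs and the names chunk_urls and build_urlset are made up for this example; the size argument can be lowered (say, to 10,000) if the larger files load slowly.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Return one sitemap document (as a string) listing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Each page gets a <url> entry with a single <loc> child.
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical article URLs, for illustration only.
article_urls = [f"https://www.example.com/news/article-{n}" for n in range(120_000)]
sitemaps = [build_urlset(chunk) for chunk in chunk_urls(article_urls)]
print(len(sitemaps))  # number of sitemap files needed
```

Each string in sitemaps would then be saved as its own file and checked for load time before submitting.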
-
We break our sitemap up into several different sitemap files, and then use a sitemap index file to make sure Google finds them all.
At the bottom of this post they talk about using an index file to combine multiple sitemaps, and they also specifically say it is fine to have one time-sensitive sitemap (e.g., front page items) and several other less time-sensitive ones (categories in your case).
http://googlewebmastercentral.blogspot.com/2006/10/multiple-sitemaps-in-same-directory.html
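For anyone wanting to see what such an index file looks like, here is a minimal Python sketch that builds one. The file names, the category list, and the function name build_sitemap_index are assumptions made for this example, not our actual setup.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) pointing at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        # Each entry is a <sitemap> element whose <loc> is a sitemap file URL.
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical layout: one time-sensitive sitemap plus one per category.
categories = ["politics", "sports", "business"]
sitemap_files = ["https://www.example.com/sitemap-latest.xml"] + [
    f"https://www.example.com/sitemap-{name}.xml" for name in categories
]
print(build_sitemap_index(sitemap_files))
```

Note that the index lists sitemap files only, not pages, so category and article URLs still belong in the individual sitemaps it points to.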