XML sitemap advice for a website with over 100,000 articles
-
Hi,
I have read numerous articles that support submitting multiple XML sitemaps for websites that have thousands of articles... in our case we have over 100,000. So, I was thinking I should submit one sitemap for each news category.
My question is: how many page levels deep should each sitemap instruct the spiders to go? Would it not be enough to just submit the top-level URL for each category and then let the spiders follow the rest of the links organically?
So, if I have 12 categories, the total number of URLs will be 12?
If this is true, how do you suggest handling our home page, where the latest articles are displayed regardless of their category? That is, the spiders will find links to a given article both on the home page and in the category it belongs to. We are using canonical tags.
Thanks,
Jarrett
-
It's really a process of experimenting over time to find the method that gets the most URLs indexed and, in turn, brings the most relevant traffic. Personally I wouldn't have one for each category, yet without tests there's no conclusive reasoning either way.
-
Thanks for the tip... I will do that.
I'm still unsure if I really need to submit a sitemap with thousands of URLs. I was thinking I should create a sitemap index file that points to individual top-level category sitemaps and leave it at that. If I do this, though, I suppose I don't need individual sitemaps per category, as I will just insert the category URLs in the root sitemap. What do you think?
-
To add to Corey's response, I'll repeat what I just provided on another question here on Pro Q&A. Sitemap.xml files can handle a maximum of 50,000 URLs; however, I've seen them choke with as few as 10,000. It's important to run them through a tool like tools.pingdom.com to ensure they load within just a couple of seconds.
Then submit them through Google/Bing webmaster systems and then see if they succeed in crawling all of them.
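To make the 50,000-URL limit concrete, here is a minimal Python sketch of splitting a long URL list into several sitemap files. The function names (`chunk_urls`, `build_sitemap`) and the example.com URLs are hypothetical, not anything from the original poster's setup, and the chunk size is just the protocol maximum; a smaller value may be the safer choice if large files load slowly.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Protocol limit on the number of URLs in a single sitemap file.
MAX_URLS = 50_000


def chunk_urls(urls, size=MAX_URLS):
    """Yield successive lists of at most `size` URLs."""
    if not 1 <= size <= MAX_URLS:
        raise ValueError("size must be between 1 and 50,000")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return one <urlset> document as UTF-8 encoded XML bytes."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical article URLs, for illustration only.
articles = [f"https://example.com/news/article-{i}" for i in range(120_000)]
sitemaps = [build_sitemap(chunk) for chunk in chunk_urls(articles)]
```

With 120,000 URLs this produces three files, which can then be written to disk and checked for load time one by one.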
-
We break up our sitemap into several different sitemap files, and then use a sitemap index file to make sure Google finds them all.
At the bottom of the post linked below, they talk about using an index file to combine multiple sitemaps, and they also specifically say it is fine to have one time-sensitive sitemap (i.e., front page items) and several other less time-sensitive ones (categories in your case).
http://googlewebmastercentral.blogspot.com/2006/10/multiple-sitemaps-in-same-directory.html
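As a rough illustration of the index-file approach, the Python sketch below builds a sitemap index that points to one time-sensitive sitemap and a few category sitemaps. The function name `build_sitemap_index` and all example.com file names are assumptions made for the example; the element names (`sitemapindex`, `sitemap`, `loc`) come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document listing each sitemap file URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical layout: one sitemap for front page items, one per category.
index_xml = build_sitemap_index([
    "https://example.com/sitemap-latest.xml",
    "https://example.com/sitemap-politics.xml",
    "https://example.com/sitemap-sports.xml",
])
```

Only the index file then needs to be submitted through the webmaster tools; the individual sitemaps are discovered from it.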