XML sitemap advice for a website with over 100,000 articles
-
Hi,
I have read numerous articles that support submitting multiple XML sitemaps for websites that have thousands of articles... in our case we have over 100,000. So, I was thinking I should submit one sitemap for each news category.
My question is how many page levels should each sitemap instruct the spiders to go? Would it not be enough to just submit the top level URL for each category and then let the spiders follow the rest of the links organically?
So, if I have 12 categories, the total number of URLs will be 12?
If this is true, how do you suggest handling our home page, where the latest articles are displayed regardless of their category? In other words, the spiders will find links to a given article both on the home page and in the category it belongs to. We are using canonical tags.
Thanks,
Jarrett
-
It's really a process of experimenting over time to find the method that gets the most URLs indexed, which in turn brings the most relevant traffic. Personally, I wouldn't have one for each category, but without tests there's no conclusive reasoning either way.
-
Thanks for the tip... I will do that.
I'm still unsure whether I really need to submit a sitemap with thousands of URLs. I was thinking I should create a sitemap index file that points to individual top-level category sitemaps and leave it at that. If I do this, though, I suppose I don't need individual sitemaps per category, as I will just insert the category URLs in the root sitemap. What do you think?
-
To add to Corey's response, I'll repeat what I just provided on another question here on Pro Q&A. Sitemap.xml files can handle a maximum of 50,000 URLs; however, I've seen them choke with as few as 10,000. It's important to run them through a tool like tools.pingdom.com to ensure they load within just a couple of seconds.
Then submit them through Google/Bing webmaster systems and then see if they succeed in crawling all of them.
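To illustrate the 50,000-URL limit mentioned above, here is a minimal Python sketch that splits a long URL list into several sitemap documents. The example.com URLs and the helper names (chunk_urls, build_urlset) are made up for the example, and this is only one possible way to generate the files, not a recommended tool.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit discussed in the thread


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return one sitemap document (a <urlset>) as a string."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() handles &, < and > so the XML stays well-formed
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


# Hypothetical article URLs, standing in for a site with 100,000+ articles
article_urls = [
    f"https://www.example.com/news/article-{i}" for i in range(120_000)
]

# One XML string per sitemap file, each within the URL limit
sitemaps = [build_urlset(chunk) for chunk in chunk_urls(article_urls)]
```

Given the load-time concern above, a smaller chunk size (for example, passing size=10_000 to chunk_urls) may be worth testing.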
-
We break up our sitemap files into several different site maps, and then use a sitemap index file to make sure Google finds them all.
Near the bottom of the Google Webmaster Central post linked below, they talk about using an index file to combine multiple sitemaps. They also specifically say it is fine to have one time-sensitive sitemap (i.e., front page items) and several other less time-sensitive ones (categories, in your case).
http://googlewebmastercentral.blogspot.com/2006/10/multiple-sitemaps-in-same-directory.html
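As a rough sketch of the index-file approach described above, the following Python snippet builds a sitemap index that points to several child sitemaps. The file names (one "latest" sitemap plus category sitemaps) and the example.com domain are assumptions for illustration only, and build_sitemap_index is a hypothetical helper.

```python
from xml.sax.saxutils import escape


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap files."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in sitemap_urls:
        lines.append(f"  <sitemap><loc>{escape(url)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


# Hypothetical layout: one time-sensitive sitemap plus per-category sitemaps
sitemap_files = [
    "https://www.example.com/sitemap-latest.xml",
    "https://www.example.com/sitemap-politics.xml",
    "https://www.example.com/sitemap-sports.xml",
]

index_xml = build_sitemap_index(sitemap_files)
print(index_xml)
```

Only the index file then needs to be submitted through the webmaster tools; the child sitemaps are discovered from it.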