Sitemap use for very large forum-based community site
-
I work on a very large site with two main types of content: static landing pages for products, and forums & blogs (user created) under each product. The site has maybe 500k - 1 million pages. We do not have a sitemap at this time.
Currently our SEO discoverability in general is good; Google is indexing new forum threads within roughly 1-5 days. However, some of the "static" landing pages for our smaller, less visited products do not have great SEO.
Question is, could our SEO be improved by creating a sitemap, and if so, how could it be implemented? I see a few ways to go about it:
- Option 1: Sitemap includes "static" product category landing pages only - i.e., the product home pages, the forum landing pages, and blog list pages. This would probably end up being 100-200 URLs.
- Option 2: Sitemap contains the above but is also dynamically updated with new threads & blog posts.
Option 2 seems like it would mean the sitemap is unmanageably long (hundreds of thousands of forum URLs). Would a crawler even parse something that size? Or with Option 1, could it cause our organically ranked pages to change ranking due to Google re-prioritizing the pages within the sitemap?
Not a lot of information out there on this topic; I'd appreciate any input. Thanks in advance.
-
Agreed, you'll likely want to go with option #2. Dynamic sitemaps are a must when you're dealing with large sites like this. We recommend them to all of our clients with larger sites. If your forum content is important for search, then it is definitely worth including, as the content likely changes often and might naturally sit deeper in the site architecture.
In general, I'd think of sitemaps from a discoverability perspective instead of a ranking one. The primary goal is to give Googlebot an avenue to crawl your site's content regardless of internal linking structure.
-
Hi
Go with option 2; there is no scaling issue here. I have worked with and for sites that submit many times that number of sitemaps and pages, in some cases up to 100M pages. In all cases, Google was totally fine crawling and processing the data that was there. As long as you follow the guidelines (max 50K URLs per sitemap file), you're fine, as you're just providing another file that usually doesn't exceed the 50MB uncompressed size limit (depending on whether you also add images to the sitemap). If you have an engineering team build the right infrastructure, you can easily deal with thousands of these files and regenerate them automatically every day or week.
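To make the 50K-per-file guideline concrete, here is a minimal sketch in Python of splitting a large URL list into several sitemap files plus one sitemap index that points at them. The function names, the `sitemap-N.xml` file naming, and the `base_url` parameter are assumptions made for illustration; a real setup would pull URLs from your database and would probably also add `<lastmod>` values.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_sitemap(urls):
    """Render one <urlset> document for a single group of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</urlset>\n"
    )


def render_index(sitemap_urls):
    """Render a <sitemapindex> document listing every sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )


def build_sitemaps(urls, base_url):
    """Return a dict of {filename: xml_text}.

    `base_url` is the (hypothetical) public location where the files
    will be served, e.g. "https://example.com/sitemaps".
    """
    files = {}
    for n, group in enumerate(chunk(urls), start=1):
        files[f"sitemap-{n}.xml"] = render_sitemap(group)
    index_entries = [f"{base_url}/{name}" for name in files]
    files["sitemap-index.xml"] = render_index(index_entries)
    return files
```

With something like this running on a schedule, only the index file needs to be submitted in Search Console; the individual files can be regenerated as threads and posts are added.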
My main focus on big sites is also to streamline their sitemaps: one sitemap with just the last 50,000 pages that were added, and another with the last 50,000 pages that were updated. This way you're also able to monitor the indexation level of these pages. If you are able to combine this with, for example, data from log file analysis, you can say: we added 50K pages, and in the last few days Google was able to crawl X percent of them.
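The "X percent" figure can be approximated with a small script. The sketch below assumes access logs in the common/combined log format and treats any line containing the token `Googlebot` as a Googlebot request; it does not verify that the request really came from Google, and it compares URL paths only (query strings are ignored). The function names and sample data are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Matches the request part of a common/combined log line: "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def crawled_paths(log_lines, bot_token="Googlebot"):
    """Collect the URL paths requested by lines that mention `bot_token`."""
    paths = set()
    for line in log_lines:
        if bot_token not in line:
            continue  # not a (claimed) Googlebot request
        match = REQUEST_RE.search(line)
        if match:
            # Drop any query string so paths compare cleanly.
            paths.add(urlsplit(match.group(1)).path)
    return paths


def crawl_coverage(sitemap_urls, log_lines):
    """Percentage of sitemap URLs whose path appears in the bot's requests."""
    wanted = {urlsplit(u).path or "/" for u in sitemap_urls}
    if not wanted:
        return 0.0
    seen = crawled_paths(log_lines)
    return 100.0 * len(wanted & seen) / len(wanted)


if __name__ == "__main__":
    sample_logs = [
        '66.249.66.1 - - [01/Jan/2024:00:00:01 +0000] '
        '"GET /forum/thread-1 HTTP/1.1" 200 512 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '10.0.0.5 - - [01/Jan/2024:00:00:02 +0000] '
        '"GET /forum/thread-2 HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    ]
    sample_urls = [
        "https://example.com/forum/thread-1",
        "https://example.com/forum/thread-2",
    ]
    print(f"{crawl_coverage(sample_urls, sample_logs):.1f}% crawled")
```

Running this against the "last 50,000 added" sitemap each day gives a rough trend line for how quickly new content is being picked up.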
Hope this gives you some extra insights.
Martijn.