Sitemap use for very large forum-based community site
-
I work on a very large site with two main types of content, static landing pages for products, and a forum & blogs (user created) under each product. Site has maybe 500k - 1 million pages. We do not have a sitemap at this time.
Currently our SEO discoverability in general is good, Google is indexing new forum threads within 1-5 days roughly. Some of the "static" landing pages for our smaller, less visited products however do not have great SEO.
Question is, could our SEO be improved by creating a sitemap, and if so, how could it be implemented? I see a few ways to go about it:- Sitemap includes "static" product category landing pages only - i.e., the product home pages, the forum landing pages, and blog list pages. This would probably end up being 100-200 URLs.
- Sitemap contains the above but is also dynamically updated with new threads & blog posts.
Option 2 seems like it would mean the sitemap is unmanageably long (hundreds of thousands of forum URLs). Would a crawler even parse something that size? Or with Option 1, could it cause our organically ranked pages to change ranking due to Google re-prioritizing the pages within the sitemap?
Not a lot of information out there on this topic, appreciate any input. Thanks in advance. -
Agreed, you'll likely want to go with option #2. Dynamic sitemaps are a must when you're dealing with large sites like this. We advise them on all of our clients with larger sites. If your forum content is important for search then these are definitely important to include as the content likely changes often and might be naturally deeper in the architecture.
In general, I'd think of sitemaps from a discoverability perspective instead of a ranking one. The primary goal is to give Googlebot an avenue to crawl your sites content regardless of internal linking structure.
-
Hi
Go with option 2, there is no scaling issue here. I have worked with and for sites that have a high multiplier on the number of sitemaps and pages that they're submitting, in some cases up to 100M pages. In all cases, Google was totally fine in crawling and processing the data that was there. As long as you follow the guidelines (max 50K URLs in a sitemap) you're fine as you're just providing another file that usually doesn't exceed about 50MB (depending on if you also add images to the sitemap). If you have an engineering team build the right infrastructure you can easily deal with thousands of these files and run them automated every day/week.
My main focus on big sites is also to streamline their sitemaps to have sitemaps with just the last 50.000 pages and the same for the last 50.000 pages that were updated. This way you're able to also monitor the indexation level of these pages. If you are able to, for example, combine the data from log file analysis you can say: we added 50K pages and Google in the last days were able to crawl X percentage of that.
Hope this gives you some extra insights.
Martijn.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site dropped from SERP
Hello, I've been ranking a site for the last 5 months with good success, ranking on the first page for a high traffic keyword. In the beginning of September however, my site completely dropped out of the SERPs for several of those keywords yet my site was still indexed and there was no penalty applied to my site via search console. I would assume this maybe because of the update during the time.My site came back again a week later and it was ranking much higher on the first page (#2). Today, I just checked the SERPs and my site is now gone again. It was there this morning but now as of two hours ago it is gone, as well as one of my main competitors. My site is still indexed and no penalties via search console. Does anyone know what causes these types of issues? Im assuming my site will come back in a week or so with hopefully the same or better ranking, but when I have disruptions like this it really hurts my organic traffic. Any input is appreciated. Thanks!
Technical SEO | | KathleenDC0 -
Google not using redirect
We have a GEO-IP redirect in place for our domain, so that users are pointed to the subfolder relevant for their region, e.g: Visit example.com from the UK and you will be redirected to example.com/uk This works fine when you manually type the domain into your browser, however if you search for the site and come to example.com, you end up at example.com I didn't think this was too much of an issue but our subfolders /uk and /au are not getting ranked at all in Google, even for branded keywords. I'm wondering if the fact that Google isn't picking up the redirect means that the pages aren't being indexed properly? Conversely our US region (example.com/us) is being ranked well. Has anyone encountered a similar issue?
Technical SEO | | ahyde0 -
Sitemap and crawl impact
If I have two links in the sitemap (for example: page1.html and page2.html) but the web-site contains more pages (page1.html, page2.html and page3.html) is this a sign for Google to not to crawl other pages? I.e. Will Google index page3.html? Consider that any page can be accessed.
Technical SEO | | ditoroin0 -
Does anyone use Ning (Custom social site)?
We just got a custom Ning site developed and the second we moved it to our domain, we lost almost all of our rankings in Google. So, we sign up for a SEOMoz Pro account and within a week we see that we have over 6,000 errors and warnings on our site. Duplicate content, duplicate page titles and pages with too many links. Is anyone using Ning and successfully maintaining 1st page rankings in Google? Seems to me that we need hundreds of fixes and modifications in the code of our site to fix all of these errors and get Google to take our penalizations off.
Technical SEO | | danlok0 -
How can I best find out which URLs from large sitemaps aren't indexed?
I have about a dozen sitemaps with a total of just over 300,000 urls in them. These have been carefully created to only select the content that I feel is above a certain threshold. However, Google says they have only indexed 230,000 of these urls. Now I'm wondering, how can I best go about working out which URLs they haven't indexed? No errors are showing in WMT related to these pages. I can obviously manually start hitting it, but surely there's a better way?
Technical SEO | | rango0 -
Trying to get google to know my site is a magazine site is this wrong
Hi, i have put a line to describe what my site is at the top of my site and i want to know if this is wrong or not. We have dropped frok being number one in google for lifestyle magazine to now number seven. Before we had to redo our site we were number one and then we dropepd to around number four when we finished the site and now we are number seven and i need to try and get back up there. To help google know we are a lifestyle magazine i have put a line at the top of the site and i want to know if this looks out of place and if i should take it down. i need advice on how to get google to know we are a lifestyle magazine and get back in the top five of google my site is www.in2town.co.uk any help would be great
Technical SEO | | ClaireH-1848860 -
How do i use Linkedin to promote my site
Hi i have heard so much about Linkedin but i am not sure how to use it. I would like to start off using it for free and have opened an account but i am not sure how to use it to promote my site and to gain traffic before i go to the paid service. Can anyone please give me some basic advice on how to use the free service to gain traffic to my site and also the best way to use it. I want to try and use it to promote my articles on www.in2town.co.uk
Technical SEO | | ClaireH-1848860 -
Switching Site to a Domain Name that's in Use
I'm comfortable with the steps of moving a site to a new domain name as recommended by Google. However, in this case, the domain name I'm asked to move to is not really "new" ... meaning it's currently hosting a website and has been for a long time. So my question is, do I do this in steps and take the old website down first in order to "free up" the domain name in they eyes of search engines to avoid large numbers of 404s and then (in step 2) switch to the "new" domain in a few months? Thanks.
Technical SEO | | R2iSEO0