Sitemap use for very large forum-based community site
-
I work on a very large site with two main types of content: static landing pages for products, and forums & blogs (user created) under each product. The site has maybe 500k - 1 million pages. We do not have a sitemap at this time.
Currently our SEO discoverability in general is good; Google is indexing new forum threads within roughly 1-5 days. However, some of the "static" landing pages for our smaller, less visited products do not have great SEO.
Question is, could our SEO be improved by creating a sitemap, and if so, how could it be implemented? I see a few ways to go about it:
- Sitemap includes "static" product category landing pages only - i.e., the product home pages, the forum landing pages, and blog list pages. This would probably end up being 100-200 URLs.
- Sitemap contains the above but is also dynamically updated with new threads & blog posts.
Option 2 seems like it would mean the sitemap is unmanageably long (hundreds of thousands of forum URLs). Would a crawler even parse something that size? Or with Option 1, could it cause our organically ranked pages to change ranking due to Google re-prioritizing the pages within the sitemap?
Not a lot of information out there on this topic, appreciate any input. Thanks in advance.
-
Agreed, you'll likely want to go with option #2. Dynamic sitemaps are a must when you're dealing with large sites like this. We recommend them for all of our clients with larger sites. If your forum content is important for search, then it should definitely be included, as the content likely changes often and might sit naturally deeper in the architecture.
In general, I'd think of sitemaps from a discoverability perspective instead of a ranking one. The primary goal is to give Googlebot an avenue to crawl your site's content regardless of internal linking structure.
-
Hi
Go with option 2; there is no scaling issue here. I have worked with and for sites that submit far more sitemaps and pages than this, in some cases up to 100M pages. In all cases, Google had no trouble crawling and processing the data. As long as you follow the guidelines (max 50K URLs per sitemap file) you're fine, since each sitemap is just another file that usually stays under the 50MB size limit (depending on whether you also add images to the sitemap). If you have an engineering team build the right infrastructure, you can easily manage thousands of these files and regenerate them automatically every day or week.
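To make the 50K-per-file guideline concrete, here is a minimal sketch of how a generator job might split a URL list into sitemap files and tie them together with a sitemap index. The function names and the `https://www.example.com/sitemaps/` location are hypothetical; a real job would stream URLs from the forum database and write the files to disk rather than hold everything in memory.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Render one <urlset> document for a single chunk of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</urlset>\n"
    )


def build_sitemap_index(sitemap_urls):
    """Render the <sitemapindex> document that points at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )


if __name__ == "__main__":
    # Hypothetical thread URLs standing in for a database query.
    threads = [f"https://www.example.com/forum/thread-{i}" for i in range(120_001)]
    chunks = chunk_urls(threads)
    # Hypothetical public location of the generated files.
    locations = [
        f"https://www.example.com/sitemaps/sitemap-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    index_xml = build_sitemap_index(locations)
    print(len(chunks), "sitemap files listed in the index")
```

Only the index file needs to be submitted; each chunk stays within the per-file URL limit no matter how large the forum grows.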
My main focus on big sites is also to streamline their sitemaps: keep one sitemap with just the last 50,000 pages added and another with the last 50,000 pages updated. This way you can also monitor the indexation level of these pages. If you can combine this with data from log file analysis, for example, you can say: we added 50K pages and Google was able to crawl X percent of them in the last few days.
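As a rough illustration of that log-file check, the sketch below computes what share of the paths in a "latest pages" sitemap were requested by a client identifying as Googlebot. It assumes the common "combined" access log format, and the function names are hypothetical. Matching on the user-agent string alone can be spoofed, so a stricter setup would also verify the requesting IP.

```python
import re

# Assumes the "combined" access log format; adjust the pattern for your server.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def crawled_paths(log_lines, agent_token="Googlebot"):
    """Collect the request paths fetched by user agents containing `agent_token`."""
    paths = set()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and agent_token in match.group("agent"):
            paths.add(match.group("path"))
    return paths


def crawl_coverage(sitemap_paths, log_lines):
    """Return the percentage of sitemap paths that appear in the crawler's requests."""
    wanted = set(sitemap_paths)
    if not wanted:
        return 0.0
    hit = wanted & crawled_paths(log_lines)
    return 100.0 * len(hit) / len(wanted)


if __name__ == "__main__":
    # Hypothetical sample data: four new threads, two of them fetched by Googlebot.
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    sample_log = [
        f'66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /forum/thread-1 HTTP/1.1" 200 5123 "-" "{googlebot}"',
        f'66.249.66.1 - - [10/Oct/2023:13:56:02 +0000] "GET /forum/thread-2 HTTP/1.1" 200 4810 "-" "{googlebot}"',
        '203.0.113.7 - - [10/Oct/2023:13:57:10 +0000] "GET /forum/thread-3 HTTP/1.1" 200 4999 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    ]
    new_threads = ["/forum/thread-1", "/forum/thread-2", "/forum/thread-3", "/forum/thread-4"]
    print(f"{crawl_coverage(new_threads, sample_log):.1f}% of new pages crawled")
```

Running this daily against the most recent sitemap gives a simple trend line for how quickly new pages get picked up.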
Hope this gives you some extra insights.
Martijn.