Best way to "Prune" bad content from large sites?
-
I am in the process of pruning my sites for low-quality/thin content. The issue is that I have multiple sites with 40k+ pages and need a more efficient way of finding the low-quality content than looking at each page individually. Is there an ideal way to find the pages worth noindexing that will speed up the process without potentially harming any valuable pages?
The current plan of action is to pull data from analytics, and if a URL hasn't brought any traffic in the last 12 months, then it is safe to assume it is a page that is not beneficial to the site. My concern is that some of these pages might have links pointing to them, and I want to make sure we don't lose that link juice. But assuming we just noindex the pages, the authority should still pass along, and in theory the pages that haven't brought any traffic to the site in a year probably don't have much authority to begin with.
Any recommendations on the best way to prune content efficiently on sites with hundreds of thousands of pages? Also, is there a benefit to noindexing the pages vs. deleting them? What is the preferred method, and why?
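For illustration, the plan described above (flag URLs with no traffic in the last 12 months, but set aside any that have links pointing to them) could be sketched roughly as follows. This is only a sketch under assumptions: the function name `find_prune_candidates` and the three inputs are hypothetical, and it assumes you have already exported a full URL list, a per-URL session count for the period, and a set of URLs with known backlinks from whatever tools you use.

```python
def find_prune_candidates(all_urls, sessions_by_url, linked_urls):
    """Split zero-traffic URLs into prune candidates and manual-review URLs.

    all_urls        -- iterable of every URL on the site (e.g. from a crawl)
    sessions_by_url -- dict of URL -> sessions over the last 12 months
    linked_urls     -- set of URLs known to have external links

    All three inputs are assumed exports; nothing here calls a real API.
    """
    candidates, needs_review = [], []
    for url in all_urls:
        # URLs missing from the analytics export count as zero traffic.
        if sessions_by_url.get(url, 0) > 0:
            continue  # the page earned traffic in the period, so keep it
        if url in linked_urls:
            needs_review.append(url)  # no traffic, but links point here
        else:
            candidates.append(url)    # no traffic and no known links
    return candidates, needs_review


if __name__ == "__main__":
    # Hypothetical data, only to show the shape of the inputs.
    urls = ["/guide", "/old-post", "/thin-page"]
    sessions = {"/guide": 120}
    linked = {"/old-post"}
    print(find_prune_candidates(urls, sessions, linked))
```

Keeping the linked pages in a separate list means nothing with backlinks gets noindexed or deleted automatically; those are the pages worth a quick look by hand.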
-
I have a section of my website where I heavily use embedded content: embeds from YouTube, SlideShare, Twitter, Quora, etc. Google thinks they're thin, and they don't show up in my analytics because you can read the content without clicking on the page.
http://getonthemap.us/twitter/blog
But I like them, and I think they're helpful. So I noindexed all but one of the blog posts in that section. That retains the backlinks to the posts, but cleans me up with Google.
If you're deleting, can't you do that quickly from your console?
-
It's hard to say exactly without seeing your site since there are so many potential variables (e.g. are most of your blog posts low quality or just a minority? etc) that would define the best way to go about it.
What I can say though is that you're on the right track as far as using analytics data to determine which ones are providing value right now. There is a danger in losing some rankings if you go removing a huge volume of these posts. Unless they're utter rubbish posts, they'll likely be providing relevance signals to Google on what your site is about. That said, I do think it's a necessary evil and I'd expect you'll be rewarded for it in the long run provided you start replacing the trash with high quality posts in the future.
As for the benefits, if they really are low quality then user engagement is going to be terrible, which is obviously not what you should be aiming for. It's also going to be chewing up your crawl budget for no good reason, so the leaner your site is, the better base you have to start rebuilding with quality instead of quantity. For the same reason, I generally suggest removing tags and categories that aren't providing any actual benefit too. In most cases I see, they're just there either "for good SEO" or because the site owner thinks that's how users are browsing their site, but in almost all cases that's not true. As always, check your own data on this to be sure.
As for removing vs noindex, this one is always contentious but I lean toward removing simply because it's going to clean things up for the user too and ultimately they should be your primary focus. Having 40,000+ pages of trash on your website is a fantastic indicator to them that your site may not be somewhere they want to be and noindexing them won't do anything to change the user's experience.
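To make the difference between the two options concrete, here is a rough sketch of what each one looks like at the HTTP level. The function `prune_response` and its action names are hypothetical, not part of any framework. The `X-Robots-Tag: noindex` header and the 410 and 301 status codes are standard, but the redirect branch is an extra option not discussed above, included on the assumption that you may want to keep link equity for removed pages that have backlinks.

```python
def prune_response(action, redirect_to=None):
    """Return (status_code, headers) for a page being pruned.

    action is one of:
      "noindex"  -- keep the page for users, ask search engines not to index it
      "remove"   -- delete the page and signal that it is gone for good
      "redirect" -- delete the page but send visitors and links elsewhere
                    (an assumption beyond the thread, for pages with backlinks)
    """
    if action == "noindex":
        # The page still loads normally; the header carries the same
        # directive as a <meta name="robots" content="noindex"> tag.
        return 200, {"X-Robots-Tag": "noindex"}
    if action == "remove":
        # 410 Gone tells crawlers the removal is deliberate and permanent.
        return 410, {}
    if action == "redirect":
        if not redirect_to:
            raise ValueError("redirect requires a target URL")
        return 301, {"Location": redirect_to}
    raise ValueError(f"unknown action: {action!r}")
```

Note that only the "remove" and "redirect" branches change anything for the visitor; with "noindex" the page stays on the site exactly as it was.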
Hope that helps!