Best way to "Prune" bad content from large sites?
-
I am in the process of pruning my sites for low-quality/thin content. The issue is that I have multiple sites with 40k+ pages and need a more efficient way of finding the low-quality content than looking at each page individually. Is there an ideal way to find the pages that are worth noindexing that will speed up the process without potentially harming any valuable pages?
My current plan of action is to pull data from analytics: if a URL hasn't brought in any traffic in the last 12 months, it is probably safe to assume the page is not beneficial to the site. My concern is that some of these pages might have links pointing to them, and I want to make sure we don't lose that link juice. But assuming we just noindex the pages, the authority should still be passed along, and in theory the pages that haven't brought any traffic to the site in a year probably don't have much authority to begin with.
Any recommendations on the best way to prune content efficiently on sites with hundreds of thousands of pages? Also, is there a benefit to noindexing the pages vs. deleting them? What is the preferred method, and why?
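The analytics-based plan described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive workflow: the function name `split_prune_candidates`, the `sessions_by_url` mapping, and the `urls_with_backlinks` set are all hypothetical, and you would fill them from your own analytics and backlink exports. It also assumes the mapping covers every URL on the site (for example from a crawl or sitemap), with 0 for pages that never appear in analytics, since analytics exports generally only list pages that received visits.

```python
from typing import Dict, List, Set, Tuple


def split_prune_candidates(
    sessions_by_url: Dict[str, int],
    urls_with_backlinks: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split zero-traffic URLs into (safe_to_prune, needs_review).

    sessions_by_url: hypothetical mapping of every site URL to its
        session count over the last 12 months (0 if it had no visits).
    urls_with_backlinks: hypothetical set of URLs that have external
        links pointing at them, taken from a backlink export.
    """
    safe_to_prune: List[str] = []
    needs_review: List[str] = []
    for url, sessions in sorted(sessions_by_url.items()):
        if sessions > 0:
            continue  # the page earned traffic in the window, so keep it
        if url in urls_with_backlinks:
            # No traffic, but it has links: check by hand before touching it.
            needs_review.append(url)
        else:
            safe_to_prune.append(url)
    return safe_to_prune, needs_review


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sessions = {"/guide": 120, "/old-post": 0, "/linked-post": 0}
    linked = {"/linked-post"}
    prune, review = split_prune_candidates(sessions, linked)
    print("prune:", prune)
    print("review:", review)
```

Keeping the linked pages in a separate review list addresses the link concern directly: nothing with backlinks is pruned automatically.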
-
I have a section of my website where I make heavy use of embedded content: embeds from YouTube, Slideshare, Twitter, Quora, etc. Google thinks those pages are thin, and they don't show up in my analytics because you can read the content without clicking through to the page.
http://getonthemap.us/twitter/blog
But I like them, and I think they're helpful. So I noindexed all but one of the blog posts in that section. That keeps the backlinks to the posts while cleaning things up in Google's eyes.
If you're deleting, can't you do that quickly from your console?
-
It's hard to say exactly without seeing your site, since there are so many potential variables that would define the best way to go about it (e.g. are most of your blog posts low quality, or just a minority?).
What I can say, though, is that you're on the right track in using analytics data to determine which pages are providing value right now. There is a danger of losing some rankings if you remove a huge volume of these posts. Unless they're utter rubbish, they'll likely be providing relevance signals to Google on what your site is about. That said, I do think it's a necessary evil, and I'd expect you'll be rewarded for it in the long run provided you start replacing the trash with high-quality posts in the future.
As for the benefits, if the pages really are low quality then user engagement is going to be terrible, which is obviously not what you should be aiming for. They're also going to be chewing up your crawl budget for no good reason, so the leaner your site is, the better the base you have to start rebuilding with quality instead of quantity. For the same reason, I generally suggest removing tags and categories that aren't providing any actual benefit too. In most cases I see, they're there either "for good SEO" or because the site owner thinks that's how users are browsing the site, but in almost all cases that's not true. As always, check your own data on this to be sure.
As for removing vs. noindexing, this one is always contentious, but I lean toward removing simply because it's going to clean things up for the user too, and ultimately users should be your primary focus. Having 40,000+ pages of trash on your website is a fantastic indicator to them that your site may not be somewhere they want to be, and noindexing those pages won't do anything to change the user's experience.
Hope that helps!
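To make the noindex vs. removal choice discussed above concrete, here is a minimal sketch of what a server could send back for a pruned URL under each option. The function name `pruned_page_response` and the action labels are hypothetical. The sketch assumes noindex is delivered with the `X-Robots-Tag: noindex` response header (a `<meta name="robots" content="noindex">` tag in the page is the equivalent in-page form) and that removal is signalled with a 410 Gone status; a plain 404 would be another option.

```python
from typing import Dict, Tuple


def pruned_page_response(action: str) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, extra_headers) for a pruned URL.

    action: hypothetical label, either "noindex" or "remove".
    """
    if action == "noindex":
        # The page still loads for visitors and for anyone following a
        # link, but search engines are asked to keep it out of the index.
        return 200, {"X-Robots-Tag": "noindex"}
    if action == "remove":
        # The page is gone for users and crawlers alike.
        return 410, {}
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    for choice in ("noindex", "remove"):
        print(choice, pruned_page_response(choice))
```

The difference in the two return values mirrors the trade-off in this thread: noindex changes what search engines see but leaves the user's experience untouched, while removal changes both.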