Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best way to "Prune" bad content from large sites?
-
I am in process of pruning my sites for low quality/thin content. The issue is that I have multiple sites with 40k + pages and need a more efficient way of finding the low quality content than looking at each page individually. Is there an ideal way to find the pages that are worth no indexing that will speed up the process but not potentially harm any valuable pages?
Current plan of action is to pull data from analytics and if the url hasn't brought any traffic in the last 12 months then it is safe to assume it is a page that is not beneficial to the site. My concern is that some of these pages might have links pointing to them and I want to make sure we don't lose that link juice. But, assuming we just no index the pages we should still have the authority pass along...and in theory, the pages that haven't brought any traffic to the site in a year probably don't have much authority to begin with.
Recommendations on best way to prune content on sites with hundreds of thousands of pages efficiently? Also, is there a benefit to no indexing the pages vs deleting them? What is the preferred method, and why?
-
I have a section of my website where I heavily use embedded content. Embeds from Youtube, Slideshare, Twitter, Quora etc. Google thinks they're thin, and they don't show up in my analytics because you can read the content without clicking on the page.
http://getonthemap.us/twitter/blog
But I like them, and I think they're helpful. So I no-indexed all but one of the blog posts in that section. It retains the backlinks to the posts, but cleans me up with Google.
If you're deleting, can't you do that quickly from your console?
-
It's hard to say exactly without seeing your site since there are so many potential variables (e.g. are most of your blog posts low quality or just a minority? etc) that would define the best way to go about it.
What I can say though is that you're on the right track as far as using analytics data to determine which ones are providing value right now. There is a danger in losing some rankings if you go removing a huge volume of these posts. Unless they're utter rubbish posts, they'll likely be providing relevance signals to Google on what your site is about. That said, I do think it's a necessary evil and I'd expect you'll be rewarded for it in the long run provided you start replacing the trash with high quality posts in the future.
As for the benefits, if they really are low quality then user engagement is going to be terrible which is obviously not what you should be aiming for. It's also going to be chewing up your crawl budget for no good reason so the leaner your site is, the better base you have to start rebuilding with quality instead of quantity. For the same reason, I generally suggest removing tags and categories that aren't providing any actual benefit too - in most cases I see they're just there either "for good SEO" or because the site owners things that's how users are browsing their site but in almost all cases, that's not true. As always, check your own data on this to be sure.
As for removing vs noindex, this one is always contentious but I lean toward removing simply because it's going to clean things up for the user too and ultimately they should be your primary focus. Having 40,000+ pages of trash on your website is a fantastic indicator to them that your site may not be somewhere they want to be and noindexing them won't do anything to change the user's experience.
Hope that helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Rel="prev" / "next"
Hi guys, The tech department implemented rel="prev" and rel="next" on this website a long time ago.
Intermediate & Advanced SEO | | AdenaSEO
We also added a canonical tag to the 'own' page. We're talking about the following situation: https://bit.ly/2H3HpRD However we still see a situation where a lot of paginated pages are visible in the SERP.
Is this just a case of rel="prev" and "next" being directives to Google?
And in this specific case, Google deciding to not only show the 1st page in the SERP, but still show most of the paginated pages in the SERP? Please let me know, what you think. Regards,
Tom1 -
Breaking up a site into multiple sites
Hi, I am working on plan to divide up mid-number DA website into multiple sites. So the current site's content will be divided up among these new sites. We can't share anything going forward because each site will be independent. The current homepage will change to just link out to the new sites and have minimal content. I am thinking the websites will take a hit in rankings but I don't know how much and how long the drop will last. I know if you redirect an entire domain to a new domain the impact is negligible but in this case I'm only redirecting parts of a site to a new domain. Say we rank #1 for "blue widget" on the current site. That page is going to be redirected to new site and new domain. How much of a drop can we expect? How hard will it be to rank for other new keywords say "purple widget" that we don't have now? How much link juice can i expect to pass from current website to new websites? Thank you in advance.
Intermediate & Advanced SEO | | timdavis0 -
Could another site copying my content hurt my ranking?
Earlier this week I asked why a page of mine might not be ranking locally. (https://moz.com/community/q/what-could-be-stopping-us-from-ranking-locally). Maybe this might be part of the answer – another firm has copied huge chunks of my website copy: **My company: **https://idearocketanimation.com/video-production-company/ The other company: http://studio3dm.com/studio3dm-com/video/ Could this be causing my page to not rank? And is there anything I can do about it, other than huff and puff to the other firm? (Which I am already doing.)
Intermediate & Advanced SEO | | Wagster0 -
Can Google read content that is hidden under a "Read More" area?
For example, when a person first lands on a given page, they see a collapsed paragraph but if they want to gather more information they press the "read more" and it expands to reveal the full paragraph. Does Google crawl the full paragraph or just the shortened version? In the same vein, what if you have a text box that contains three different tabs. For example, you're selling a product that has a text box with overview, instructions & ingredients tabs all housed under the same URL. Does Google crawl all three tabs? Thanks for your insight!
Intermediate & Advanced SEO | | jlo76130 -
Using hreflang="en" instead of hreflang="en-gb"
Hello, I have a question in regard to international SEO and the hreflang meta tag. We are currently a B2B business in the UK. Our major market is England with some exceptions of sales internationally. We are wanting to increase our ranking into other english speaking countries and regions such as Ireland and the Channel Islands. My research has found regional google search engines for Ireland (google.ie), Jersey (google.je) and Guernsey (google.gg). Now, all the regions have English as one their main language and here is my questions. Because I use hreflang=“en-gb” as my site language, am I regional excluding these countries and islands? If I used hreflang=“en” would it include these english speaking regions and possible increase the ranking on these the regional search engines? Thank you,
Intermediate & Advanced SEO | | SilverStar11 -
Is tabbed content bad for SEO?
I work for a Theater show listings and ticketing website. In our show listings pages (e.g. http://www.theatermania.com/broadway/this-is-our-youth_302998/) we split our content into separate tabs (overview, pricing and show dates, cast, and video). Are we shooting ourselves in the foot by separating the content? Are we better served with keeping it all in a single page? Thanks so much!
Intermediate & Advanced SEO | | TheaterMania0 -
Best way to noindex an image?
Hi all, A client wanted a few pages noindexed, which was no problem using the meta robots noindex tag. However they now want associated images removed, some of which still appear on pages that they still want indexed. I added the images to their robots.txt file a few weeks ago (probably over a month ago actually) but they're all still showing when you do an image search. What's the best way to noindex them for good, and how do I go about implementing it? Many thanks, Steve
Intermediate & Advanced SEO | | steviephil0 -
News sites & Duplicate content
Hi SEOMoz I would like to know, in your opinion and according to 'industry' best practice, how do you get around duplicate content on a news site if all news sites buy their "news" from a central place in the world? Let me give you some more insight to what I am talking about. My client has a website that is purely focuses on news. Local news in one of the African Countries to be specific. Now, what we noticed the past few months is that the site is not ranking to it's full potential. We investigated, checked our keyword research, our site structure, interlinking, site speed, code to html ratio you name it we checked it. What we did pic up when looking at duplicate content is that the site is flagged by Google as duplicated, BUT so is most of the news sites because they all get their content from the same place. News get sold by big companies in the US (no I'm not from the US so cant say specifically where it is from) and they usually have disclaimers with these content pieces that you can't change the headline and story significantly, so we do have quite a few journalists that rewrites the news stories, they try and keep it as close to the original as possible but they still change it to fit our targeted audience - where my second point comes in. Even though the content has been duplicated, our site is more relevant to what our users are searching for than the bigger news related websites in the world because we do hyper local everything. news, jobs, property etc. All we need to do is get off this duplicate content issue, in general we rewrite the content completely to be unique if a site has duplication problems, but on a media site, im a little bit lost. Because I haven't had something like this before. Would like to hear some thoughts on this. Thanks,
Intermediate & Advanced SEO | | 360eight-SEO
Chris Captivate0