Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
What should be done with old news articles?
-
Hello,
We have a portal website that provides information about the industry we work in. It includes articles, tips, reviews and more. We also have a news section that used to be indexed in Google News but has not been for the past few months.
The site was hit by Panda over a year ago, and one of the things we have been considering is removing pages that are irrelevant or do not add value to the site. Some of these pages are old news articles posted 3-4 years ago that have had hardly any traffic.
All the news articles on the site are under an /archive/ folder sorted by month and year, so, for example, the URL for a news item from April 2010 would be /archive/042010/article-name
My question is: do you think removing such news articles would benefit the site and help it get out of Panda (many other things have been done on the site as well)? If not, what is the best way to keep these articles on the site so that Google indexes them and treats them well?
Thanks
-
Basically, I don't see a reason to remove old news articles from a site, as it makes sense to keep an archive available. The only reasons I can think of to remove them are if they duplicate text that was originally published somewhere else, or if the quality is really crap...
-
If the articles are good, there may well be value to the user. Depending on the niche or industry, those old articles could be very important.
Google doesn't like such pages when they get a lot of impressions but no clicks (so essentially no traffic), or when the "score" is bad (bounce rate: not the Google Analytics bounce rate, but Google's own, i.e. whether users bounce back to the SERPs).
Since you got hit by Panda, in my opinion you have two options:
1. Noindex those old pages. Users can still get to them through navigation, site search, etc., but Google won't keep them in its index. Google is fine with a site having content (old, poor, thin, etc.) as long as it's not in the index. I work with a site that has several million pages, 80% of them noindexed, and everything is fine now (it also got hit by Panda).
2. Merge those pages into rich, fresh topic pages (search for the New York Times topic pages as an example; I think there is also an SEOmoz Whiteboard Friday post about it). This is a good approach, and if you manage to merge those old pages with some new content you will be fine. Topic pages are great as an anti-Panda tool!
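Option 1 could be sketched roughly like this in Python. It is only an illustration: the /archive/MMYYYY/article-name pattern comes from the question, but the function names, the 3-year age cutoff, the 10-visits-per-year threshold and the idea of feeding in yearly visit counts are all assumptions, not anything the site or Google prescribes.

```python
import re
from datetime import date

# Matches the /archive/MMYYYY/article-name pattern described in the question.
ARCHIVE_URL = re.compile(r"^/archive/(\d{2})(\d{4})/[^/]+$")


def archive_month(path):
    """Return the first day of the article's archive month, or None if the
    path is not an archive URL."""
    match = ARCHIVE_URL.match(path)
    if not match:
        return None
    month, year = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return date(year, month, 1)


def should_noindex(path, yearly_visits, today, max_age_years=3, min_visits=10):
    """Hypothetical rule: noindex archive pages that are both old and
    barely visited. Both thresholds are made-up defaults."""
    published = archive_month(path)
    if published is None:
        return False  # not an archive page, leave it alone
    age_years = (today.year - published.year) + (today.month - published.month) / 12
    return age_years >= max_age_years and yearly_visits < min_visits


def robots_meta(path, yearly_visits, today):
    """Return the meta tag to place in the page <head>, or an empty string."""
    if should_noindex(path, yearly_visits, today):
        return '<meta name="robots" content="noindex, follow">'
    return ""
```

For example, robots_meta("/archive/042010/article-name", 3, date(2014, 1, 1)) would return the noindex tag, while the same article with 500 visits a year would stay indexable.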
If you merge the pages into topic pages, follow a simple flow:
1. Identify a group of pages that cover the same topic.
2. Identify the page in that group with the highest authority.
3. Turn this page into the topic page and keep its URL.
4. Merge the others into this page (based on your new topic page structure and flow).
5. 301 redirect the others to this one.
6. Build a separate XML sitemap with all those pages and upload it to WMT (Webmaster Tools). Monitor it.
7. Build some links to some of those landing pages and get some minimal social signals to a few of them (depending on the number). Build an index-type page listing those topic pages, or some of them (a user-friendly one or ones), and use it as a target for links to pass the 'love' along.
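Steps 2, 5 and 6 of this flow could be sketched as below. This is a minimal illustration under assumptions: the authority scores are whatever metric you trust (the numbers and example.com URLs in the usage note are invented), plan_merge and build_sitemap are hypothetical names, and the sitemap is assumed to list the resulting topic pages.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def plan_merge(group):
    """group maps each URL in one topic group to an authority score
    (hypothetical metric). Returns the URL to keep as the topic page and a
    dict of 301 redirects from every other URL to it."""
    topic_url = max(group, key=group.get)
    redirects = {url: topic_url for url in group if url != topic_url}
    return topic_url, redirects


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

Usage might look like: topic, redirects = plan_merge({"https://example.com/archive/042010/a": 12, "https://example.com/archive/052010/b": 31}), then build_sitemap([topic]) for the separate sitemap, with the redirects dict handed to whatever configures 301s on your server.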
Hope it helps - just some ideas.
-
I do think that any site should remove pages that are not valuable to users.
I would look for the articles that have external links pointed at them and 301 those to something relevant. The rest, you could simply remove and let them return a 404 status. Just make sure all internal links pointing at them are gone. You don't want to lead people to a 404 page.
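As a rough sketch of that triage in Python: the input format (path, external link count, a relevant URL picked by hand) and the function names are assumptions for illustration only, and the link counts would have to come from whichever link tool you use.

```python
def triage(articles):
    """articles is a list of (path, external_links, relevant_url) tuples.
    Pages with external links get a 301 to their relevant page; the rest
    are removed and return a 404. Returns {path: (status, location)}."""
    plan = {}
    for path, external_links, relevant_url in articles:
        if external_links > 0 and relevant_url:
            plan[path] = (301, relevant_url)
        else:
            plan[path] = (404, None)
    return plan


def stale_internal_links(internal_links, plan):
    """internal_links is a list of (source_page, target_path) pairs.
    Returns the links that still point at a page planned to 404, i.e. the
    ones to clean up before removing anything."""
    return [
        (source, target)
        for source, target in internal_links
        if plan.get(target, (None, None))[0] == 404
    ]
```

Running stale_internal_links over your internal link list before removing pages gives you the links that would otherwise lead people to a 404.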
You could consider disallowing /archive/ in your robots.txt file if you think the pages have some value to users but not to the engines, or putting a noindex tag on each page in that section.
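If you go the robots.txt route, a quick way to check the rule does what you expect is Python's standard urllib.robotparser. The robots.txt content and the www.example.com URLs below are assumptions for illustration; only the /archive/ folder comes from the question.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content that blocks the whole archive section.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/archive/042010/article-name",
        "https://www.example.com/tips/some-tip",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Keep in mind that a Disallow rule stops crawling rather than guaranteeing removal from the index, and a crawler that is blocked cannot read a noindex tag on the page, so treat the two as alternatives rather than something to combine.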
If you want to keep the articles on the site, available to both Google and users, you have to make sure they meet these basic criteria:
- Mostly unique content.
- Moderate length.
- Good content-to-ad ratio.
- Content is the focus of the page (top/center).