To delete or not to delete outdated content?
-
Hi there!
We run a website about a region in Italy, the Langhe area, where we write about wine and food, local culture, and we give tourist information. The website also sports a nice events calendar: in 4 years we (and our users) have loaded more than 5700 events. Now we're starting to have some trouble managing this database.
The database related to events is huge both in file size and number of rows. There are a lot of images that eat up disk space, and also it's becoming difficult to manage all the data in our backend. Also, a lot of users are entering the website by landing on outdated events.
I was wondering if it could be a good idea to delete events older than 6 months: the idea was to keep only the most important and yearly recurring events (which we can update each year with fresh information), and trash everything else.
This of course means that 404 errors will increase, and also that our content will get thinner, but at the same time we'll have a more manageable database, and the content will be more relevant and "clean".
What do you think?
Thank you
Best
-
Thank you Donna. We have seen a lot of success with the pruning method for outdated content. I'm glad the article has helped you.
-
I love this post by Everett Sizemore from last year and refer to it often. It's a step-by-step how-to for auditing content. Chapter 8 talks about considerations when deciding whether to rewrite / remove / redirect / consolidate content. Give it a read and see if it helps clarify matters for you.
-
For old content that has expired, just let it 404 or redirect it to a newer version of the page (if available).
For new content that is going to expire, you could use the unavailable_after robots meta tag. See also this advice from Matt Cutts on content that expires (it's more about products for e-commerce, but the general principle is the same).
Dirk
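To illustrate the unavailable_after suggestion above, here is a minimal Python sketch that builds the robots meta tag for an event page. The function name `unavailable_after_tag` and the `grace_days` parameter are hypothetical, and the sketch assumes your CMS lets you inject a tag into each event page's head; it is not taken from any particular platform.

```python
from datetime import datetime, timedelta, timezone


def unavailable_after_tag(event_end: datetime, grace_days: int = 1) -> str:
    """Build a robots meta tag asking crawlers to drop the page after the event.

    event_end  -- when the event finishes (naive values are treated as UTC,
                  which is an assumption of this sketch)
    grace_days -- hypothetical buffer so the page stays indexed a bit longer
    """
    if event_end.tzinfo is None:
        event_end = event_end.replace(tzinfo=timezone.utc)
    cutoff = event_end + timedelta(days=grace_days)
    # ISO 8601 is one of the widely adopted date formats for this directive.
    return f'<meta name="robots" content="unavailable_after: {cutoff.isoformat()}">'


if __name__ == "__main__":
    # Example with a made-up event end time.
    print(unavailable_after_tag(datetime(2025, 10, 12, 23, 0, tzinfo=timezone.utc)))
```

The tag only helps for events that have not expired yet; pages that are already outdated still need the 404 or redirect treatment.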
-
Especially with recurring events, duplicate content might also cause you issues down the line. You could always delete the events that are older than 12 months, and 301 the old event URLs to the current or upcoming ones. This way, you won't have duplicate content issues with recurring events, and searchers won't be landing on outdated events.
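The approach described above could be planned with something like the following Python sketch. It assumes each event record carries a `series` key shared by all editions of a recurring event; that field, the `Event` class, the `plan_cleanup` function, and the example URLs are all hypothetical. Old events with a recent edition in the same series get a 301 target, and everything else that is old is listed to be removed and left to 404.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    url: str
    series: Optional[str]  # hypothetical key shared by editions of a recurring event
    day: date


def plan_cleanup(
    events: List[Event], today: date, max_age_days: int = 365
) -> Tuple[Dict[str, str], List[str]]:
    """Return (redirects, gone) for events older than max_age_days.

    redirects maps an old URL to the newest URL in the same series.
    gone lists old URLs with no recent edition to redirect to.
    """
    cutoff = today - timedelta(days=max_age_days)

    # Find the newest edition of every recurring series.
    newest: Dict[str, Event] = {}
    for ev in events:
        if ev.series is not None:
            current = newest.get(ev.series)
            if current is None or ev.day > current.day:
                newest[ev.series] = ev

    redirects: Dict[str, str] = {}
    gone: List[str] = []
    for ev in events:
        if ev.day >= cutoff:
            continue  # recent enough, keep the page as it is
        target = newest.get(ev.series) if ev.series is not None else None
        # Only redirect to an edition that is itself being kept.
        if target is not None and target.day >= cutoff:
            redirects[ev.url] = target.url
        else:
            gone.append(ev.url)
    return redirects, gone


if __name__ == "__main__":
    sample = [
        Event("/events/truffle-fair-2022", "truffle-fair", date(2022, 10, 8)),
        Event("/events/truffle-fair-2025", "truffle-fair", date(2025, 10, 11)),
        Event("/events/one-off-concert-2021", None, date(2021, 6, 1)),
    ]
    redirects, gone = plan_cleanup(sample, today=date(2025, 9, 1))
    print(redirects)
    print(gone)
```

The resulting mapping can then be turned into 301 rules in whatever form your server or CMS expects. Checking that the target is itself recent avoids redirecting visitors to a page that is also about to be deleted.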