What's the best way to eliminate duplicate page content caused by blog archives?
-
I (obviously) can't delete the archived pages regardless of how much traffic they do/don't receive.
Would you recommend a meta robots tag or a robots.txt file? I'm not sure I'll have access to the root directory, so I could be stuck with using a meta robots tag, correct?
Any other suggestions to alleviate this pesky duplicate page content issue?
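For reference on the robots.txt option, here is a minimal sketch using Python's standard `urllib.robotparser` of how a Disallow rule would be evaluated. The `/blog/archive/` path and the `is_crawl_allowed` helper are hypothetical, chosen only for illustration; the real archive URLs on this blog may look different. Keep in mind that robots.txt controls crawling, not indexing, so a blocked URL can still show up in results if other pages link to it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the archive path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/archive/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/blog/archive/2011/10",  # hypothetical archive page
        "http://www.example.com/blog/2011/10/19",       # the post to keep indexed
    ):
        print(url, is_crawl_allowed(ROBOTS_TXT, url))
```

Since robots.txt lives in the root directory, this option only applies if that access is available; a meta robots tag is set per page instead.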
-
I think I understand better now.
Use the noindex,follow meta robots tag on the pages you don't want included in the search index.
If you are using WordPress, you should check out http://yoast.com/wordpress/seo/
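As a rough sketch of what the noindex,follow suggestion means in practice, the snippet below builds the tag that would go in the `<head>` of each archive page. The helper names (`robots_meta_tag`, `tag_for_path`) and the `/blog/archive/` prefix are assumptions for illustration only; how the tag actually gets into a template depends on the blog platform.

```python
from html import escape


def robots_meta_tag(index: bool, follow: bool = True) -> str:
    """Build a meta robots tag from two boolean choices."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ",".join(directives)
    return f'<meta name="robots" content="{escape(content)}">'


def tag_for_path(path: str, archive_prefix: str = "/blog/archive/") -> str:
    """Pick the tag for a URL path; archive_prefix is a hypothetical path."""
    is_archive = path.startswith(archive_prefix)
    # Archive pages: keep them out of the index but let links be followed.
    return robots_meta_tag(index=not is_archive)


if __name__ == "__main__":
    print(tag_for_path("/blog/archive/2011/10"))
    print(tag_for_path("/blog/2011/10/19"))
```

With this, the archive pages stay on the site and their links are still followed, but they are asked to be left out of the index.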
-
The hypothetical blog posting I want to have indexed is...
www.example.com/blog/2011/10/19
The first sentence of this blog posting is: "Jim and Janice jumped joyfully to Jackson."
I go to Google and search for "Jim and Janice jumped joyfully to Jackson." There are 7 results. The first result is the blog posting I want indexed. The 2nd through 7th results are archive pages from my blog. Let's call one of those archive pages...
So, residing on this archive page are all of my postings from October 2011 including Jim and Janice's. Thus, there appears to be a ton of duplicate content on my site.
If I implement a canonical tag on the archive page, won't the archive page then point to the blog posting I want indexed?
If so, that won't work. I need the blog posting and all the archive pages to remain as is, but I don't want the archive pages to be indexed or to show up as duplicate content.
Thoughts?
-
I agree with James; it's best to implement canonical tags.
-
The best way would be to implement canonical tags on these pages.
Example from Google:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
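For anyone who wants to check what a page currently declares, here is a minimal sketch using Python's standard `html.parser` that reads the canonical link and the meta robots tag out of a page's HTML. The `HeadSignals` class, the `read_head_signals` helper, and the sample markup are hypothetical and only illustrate the `<link rel="canonical" href="...">` form. Note that a canonical on an archive page asks search engines to treat that page as a copy of the target URL, which is the concern raised earlier in this thread.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical URL and meta robots value from HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots = attr.get("content")


def read_head_signals(html: str):
    """Return (canonical_url, robots_content); either may be None."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.robots


if __name__ == "__main__":
    # Hypothetical markup for the blog posting from the question.
    sample = (
        "<html><head>"
        '<link rel="canonical" href="http://www.example.com/blog/2011/10/19">'
        "</head><body></body></html>"
    )
    print(read_head_signals(sample))
```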