Using robots.txt to deal with duplicate content
-
I have 2 sites with duplicate content issues.
One is a WordPress blog.
The other is a store (Pinnacle Cart).
I cannot edit the canonical tag on either site. In this case, should I use robots.txt to eliminate the duplicate content?
-
The parameters to ignore are any parts of the URL that don't handle navigation, so look at what you can delete from the URL without breaking the link to the product page.
Take a look at this: http://googlewebmastercentral.blogspot.com/2009/10/new-parameter-handling-tool-helps-with.html
Remember, this will only work with Google!
This is another interesting video from Matt Cutts about removing content from Google: http://googlewebmastercentral.blogspot.com/2008/01/remove-your-content-from-google.html
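As a rough sketch of the "what can you delete off the URL" check, something like the following Python could strip candidate parameters so you can load the shortened URL and see whether the page still works. The choice of CatalogSetSortBy as the ignorable parameter is only an assumption for illustration (it is taken from the example URLs elsewhere in this thread), and strip_params is a made-up helper name, not part of Pinnacle Cart or any Google tool.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption for illustration only: parameters believed not to change
# which products the page shows. Verify each one by hand before relying on it.
IGNORABLE_PARAMS = {"CatalogSetSortBy"}


def strip_params(url, ignorable=IGNORABLE_PARAMS):
    """Return url with the ignorable query parameters removed."""
    parts = urlsplit(url)
    # Keep every parameter that is not in the ignorable set, in original order.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignorable
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    sorted_url = (
        "http://www.domain.com/accessories/"
        "?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=date"
    )
    # Open the printed URL in a browser: if it shows the same products,
    # the removed parameter is a candidate for the "ignore" list.
    print(strip_params(sorted_url))
```

If the shortened URL still reaches the same page, that parameter is probably safe to ignore; if the page breaks or shows different products, leave it alone.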
-
If the URLs look like this...
Would I tell Google to ignore p, mode, parent, or CatalogSetSortBy? Just one of those or all of those?
Thanks!!!
-
For WordPress, try: http://wordpress.org/extend/plugins/canonical/
Also look at Yoast's WordPress SEO plugin referenced on that page. I love it!
For the duplicate content caused by the dynamic content on Pinnacle Cart, you can use Google Webmaster Tools to tell Google to ignore certain parameters: go to Site configuration - Settings - Parameter handling and add the variables you wish to ignore to the list.
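To help decide which variables belong on that list, here is a small Python sketch that reports which query parameters actually change between URLs flagged as duplicates of each other. It uses the three sorted accessories URLs quoted in this thread; varying_params is a hypothetical helper name, and the result is only a hint, since a parameter that varies may still change the content (pagination, for example).

```python
from urllib.parse import urlsplit, parse_qsl


def varying_params(urls):
    """Return the sorted names of query parameters whose values differ across urls."""
    seen = {}
    for url in urls:
        for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            # Collect every distinct value observed for each parameter name.
            seen.setdefault(name, set()).add(value)
    # A parameter with more than one distinct value is what separates the duplicates.
    return sorted(name for name, values in seen.items() if len(values) > 1)


if __name__ == "__main__":
    duplicate_urls = [
        "http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=date",
        "http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=name",
        "http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=price",
    ]
    print(varying_params(duplicate_urls))  # → ['CatalogSetSortBy']
```

For these URLs only the sort order differs, which suggests CatalogSetSortBy is the one to ignore, while p, mode, parent and pg are constant here and may well be needed for navigation. Note that this sketch does not flag a parameter that is present in some URLs and missing from others.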
-
Hi,
The two sites are unrelated to each other, so my concern is not duplicate content between the two; there is none.
However, each site has its own duplicate content issues. I do have admin privileges on both sites.
If there is a WordPress plugin, that would be great. Do you have one that you would recommend?
For my ecommerce site using Pinnacle Cart, I have duplicates because of the way people can search and sort on the site. For example:
http://www.domain.com/accessories/
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=date
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=name
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=price
These all show as duplicate content in my Webmaster Tools reports. On this site I can't edit the head tag of each page to add a canonical link.
-
What are your intentions here? Do you intend to leave both sites running? Can you give us more information on the sites? Are they aged domains, is one/any/both of them currently attracting any inbound links, are they ranking? What is the purpose of the duplicate content?
Are you looking to redirect traffic from one of the sites to the other using 301 redirect?
Or do you want both sites visible - using the Canonical link tag?
I am concerned that you say you 'cannot edit the tag'. Do you not have full admin access to either site?
There are dedicated canonical management plugins for WordPress (if you have access to the wp-admin area).
You are going to need some admin privileges to make any alterations to the site so that you can correct this.
Let us know a bit more please!
These articles may be useful as they provide detailed best practice info on redirects:
http://www.google.com/support/webmasters/bin/answer.py?answer=66359
http://www.seomoz.org/blog/duplicate-content-block-redirect-or-canonical
Check out this article on redirects