Duplicate content across multiple domains
-
I have come across a situation where we have discovered duplicate content between multiple domains. We have access to each domain and have recently within the past 2 weeks added a 301 redirect to redirect each page dynamically to the proper page on the desired domain.
My question relates to the removal of these pages. There are thousands of these duplicate pages.
I have gone back and looked at a number of these cached pages in google and have found that the cached pages that are roughly 30 days old or older. Will these pages ever get removed from google's index? Will the 301 redirect even be read by google to be redirected to the proper domain and page? If so when will that happen?
Are we better off submitting a full site removal request of the sites that carries the duplicate content at this point? These smaller sites do bring traffic on their own but I'd rather not wait 3 months for the content to be removed since my assumption is that this content is competing with the main site.
I suppose another option would be to include no cache meta tag for these pages.
Any thoughts or comments would be appreciated.
-
I went ahead and added the links to the sitemap, however when google crawled the links I receieve this message.
When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.
However I do not understand how adding the redirected links to the sitemap will remove the old links.
-
Worth a shot. Crawl bots usually work by following links from page to the next. If links links no longer exist to those pages, then Google will have a tough time finding those pages and de-indexing them in favor or the correct pages.
Good luck!
-
One of the previous developers left a hole that caused this issue. The system shares code between sites.
-
Andrew,
The links were removed from the offending sites, but If I understand the gist of your suggestion Google won't remove them as quickly if they are no longer linked and yes I am using canonical tags. So I should create a sitemap with the previous links and once Google follows these links to the main site remove the sitemap. Is that your recommendation?
I suppose I can try this first before filing a request to remove the entire site.
-
Ah, I thought he was saying the dupe content does still exists but no more duplication is taking place after the fix. That's where I was going wrong then lol.
-
As long as the duplicate content pages no longer exist and you've set up the 301 redirects properly, this shouldn't be a long term problem. It can sometimes take Google a while to crawl through 1000's of pages to index the correct pages. You might want to include these pages in a Sitemap to speed up the process, particularly if there are no longer any links to these pages from anywhere else. Are you using canonical tags? They might also help point Google in the right direction.
I don't think a no cache meta tag would help. This is assuming the page will be crawled and by that point Google should follow the 301 and cace that page.
Hope this helps! Let me know how the situation progresses.
Andrew
-
Do you want the smaller sites to still exist? If they don't matter at all then you could always take them offline though that's not recommended for obvious reasons (but it would get them out of the index fairly quick).
If they still need to exist then we're just back to the same thing, changing the content on them. If the problem has been fixed to stop further duplication then that's fine... you could limit the damage by having all of those smaller sites be dupes of each other but not of the main site by rewriting the smaller ones with one lot of content, or the main one. At least that way they will only be competing with each other and not the main site any more.
Or have I still got the wrong end of the stick?
-
I am referring to an e-commerce site, so yes its dynamic. The hole has been plugged (so to speak) but the content still exists in the google cache.
-
Ah I see, so it's a CMS which pumps out content then?
But it pumps it to other sites?
-
Steve, Maybe I haven't explained the issue in enough detail. The duplicate content issue is related to a technical issue with the site causing the content to be duplicated when it should not have been. Its not a matter of rewriting content. My issue deals with purging this content from these other domains so that the main domain can be indexed with this content.
-
You could always just rewrite the content so it's not duplicate, that way you get to keep them cached and maybe focus on some different but still targeted long tail traffic... turn a negative into a positive. I accept thousands of pages is a lot of work, but there's a million and one online copywriters who are pretty good (and cheap) that you could assign projects to for it. Google copywriters for hire or freelance copywriters... could have it done in no time and not spend that much
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content for Locations on my Directory Site
I have a pretty big directory site using Wordpress with lots of "locations", "features", "listing-category" etc.... Duplicate Content: https://www.thecbd.co/location/california/ https://www.thecbd.co/location/canada/ referring URL is www.thecbd.co is it a matter of just putting a canonical URL on each location, or just on the main page? Would this be the correct code to put: on the main page? Thanks Everyone!
Technical SEO | | kay_nguyen0 -
Decline in traffic and duplicate content in different domains
Hi, 6 months ago my customer purchased their US supplier and moved the supplier's website to their e-commerce platform. When moving to the new platform they copied the descriptions of the products from their site to the supplier's site so now both sites have the same content in the product pages. Since then they have experienced decrease in traffic in about 80%. They didn't implement canonical tag or hreflang. My customer's domain format is https://www.xxx.biz and the supplier's domain is https://www.zzz.com The last one is targeting the US and when someone from outside of the US wants to purchase a product they get a message that they need to move to the first website, the www.xxx.biz. Both sites are in English. The old site version of www.zzz.com, before the shit to the new platform, contained different product descriptions, and BTW, the old website version is still live and indexed under a subdomain of www.zzz.com. My question is what's the best thing to do in this case so that the rankings will be back to higher positions and they'll get back their traffic. Thanks!
Technical SEO | | digital19740 -
SEOMOZ and non-duplicate duplicate content
Hi all, Looking through the lovely SEOMOZ report, by far its biggest complaint is that of perceived duplicate content. Its hard to avoid given the nature of eCommerce sites that oestensibly list products in a consistent framework. Most advice about duplicate content is about canonicalisation, but thats not really relevant when you have two different products being perceived as the same. Thing is, I might have ignored it but google ignores about 40% of our site map for I suspect the same reason. Basically I dont want us to appear "Spammy". Actually we do go to a lot of time to photograph and put a little flavour text for each product (in progress). I guess my question is, that given over 700 products, why 300ish of them would be considered duplicates and the remaning not? Here is a URL and one of its "duplicates" according to the SEOMOZ report: http://www.1010direct.com/DGV-DD1165-970-53/details.aspx
Technical SEO | | fretts
http://www.1010direct.com/TDV-019-GOLD-50/details.aspx Thanks for any help people0 -
Duplicate Content on SEO Pages
I'm trying to create a bunch of content pages, and I want to know if the shortcut I took is going to penalize me for duplicate content. Some background: we are an airport ground transportation search engine(www.mozio.com), and we constructed several airport transportation pages with the providers in a particular area listed. However, the problem is, sometimes in a certain region multiple of the same providers serve the same places. For instance, NYAS serves both JFK and LGA, and obviously SuperShuttle serves ~200 airports. So this means for every airport's page, they have the super shuttle box. All the provider info is stored in a database with tags for the airports they serve, and then we dynamically create the page. A good example follows: http://www.mozio.com/lga_airport_transportation/ http://www.mozio.com/jfk_airport_transportation/ http://www.mozio.com/ewr_airport_transportation/ All 3 of those pages have a lot in common. Now, I'm not sure, but they started out working decently, but as I added more and more pages the efficacy of them went down on the whole. Is what I've done qualify as "duplicate content", and would I be better off getting rid of some of the pages or somehow consolidating the info into a master page? Thanks!
Technical SEO | | moziodavid0 -
Duplicate Content on Navigation Structures
Hello SEOMoz Team, My organization is making a push to have a seamless navigation across all of its domains. Each of the domains publishes distinctly different content about various subjects. We want each of the domains to have its own separate identity as viewed by Google. It has been suggested internally that we keep the exact same navigation structure (40-50 links in the header) across the header of each of our 15 domains to ensure "unity" among all of the sites. Will this create a problem with duplicate content in the form of the menu structure, and will this cause Google to not consider the domains as being separate from each other? Thanks, Richard Robbins
Technical SEO | | LDS-SEO0 -
Duplicate Content Issue
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors! I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues. Ex: Normal URL - www.example.com/testing.php Duplicate URL - www.example.com/testing.php/helloworld The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page. I Also found that many of my PDFs seemed to be getting duplicated burried in directories after directories, which I never ever put in place. Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ... when the pdfs are only located in a pdfs directory! I am very confused on how to fix this problem. Maybe with some sort of redirect?
Technical SEO | | hfranz0 -
Noindex duplicate content penalty?
We know that google now gives a penalty to a whole duplicate if it finds content it doesn't like or is duplicate content, but has anyone experienced a penalty from having duplicate content on their site which they have added noindex to? Would google still apply the penalty to the overall quality of the site even though they have been told to basically ignore the duplicate bit. Reason for asking is that I am looking to add a forum to one of my websites and no one likes a new forum. I have a script which can populate it with thousands of questions and answers pulled direct from Yahoo Answers. Obviously the forum wil be 100% duplicate content but I do not want it to rank for anyway anyway so if I noindex the forum pages hopefully it will not damage the rest of the site. In time, as the forum grows, all the duplicate posts will be deleted but it's hard to get people to use an empty forum so need to 'trick' them into thinking the section is very busy.
Technical SEO | | Grumpy_Carl0 -
Duplicate Content -->?ss=facebook
Hi there, When searching site:mysite.com my keyword I found the "same page" twice in the SERP's. The URL's look like this: Page 1: www.example.com/category/productpage.htm Page 2: www.example.com/category/productpage.htm**?ss=facebook** The ?ss=facebook is caused by a bookmark button inserted in some of our product pages. My question is... will the canonical tag do to solve this? Thanks!
Technical SEO | | Nobody15565529539090