Duplicate content across multiple domains
-
I've run into a situation where we discovered duplicate content across multiple domains. We have access to each domain and, within the past two weeks, added 301 redirects that dynamically send each page to the proper page on the desired domain.
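For reference, the per-page mapping behind that kind of redirect can be sketched roughly as below. This is only an illustration: the host names, the `redirect_target` function, the use of `https`, and the assumption that paths are identical on every domain are all hypothetical, not our actual setup.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names, used only for illustration.
MAIN_HOST = "www.main-store.example"
DUPLICATE_HOSTS = {"www.small-site-a.example", "www.small-site-b.example"}


def redirect_target(url):
    """Return the matching URL on the main domain, or None if no redirect applies."""
    parts = urlsplit(url)
    if parts.hostname not in DUPLICATE_HOSTS:
        return None  # not one of the duplicate domains, so serve the page normally
    # Keep the path and query string so each page maps to its counterpart
    # (assumes both domains use the same URL structure).
    return urlunsplit(("https", MAIN_HOST, parts.path or "/", parts.query, ""))
```

The web server or application would then answer requests for a duplicate URL with status 301 and a `Location` header set to the returned value.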
My question relates to the removal of these pages. There are thousands of these duplicate pages.
I went back and looked at a number of these cached pages in Google and found that the cached copies are roughly 30 days old or older. Will these pages ever be removed from Google's index? Will Google even read the 301 redirect and follow it to the proper domain and page? If so, when will that happen?
Are we better off submitting a full site removal request for the sites that carry the duplicate content at this point? These smaller sites do bring in traffic on their own, but I'd rather not wait 3 months for the content to be removed, since my assumption is that this content is competing with the main site.
I suppose another option would be to add a no-cache meta tag to these pages.
Any thoughts or comments would be appreciated.
-
I went ahead and added the links to the sitemap; however, when Google crawled the links I received this message:
When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.
However, I do not understand how adding the redirected links to the sitemap will remove the old links.
-
Worth a shot. Crawl bots usually work by following links from one page to the next. If links to those pages no longer exist, then Google will have a tough time finding those pages and de-indexing them in favor of the correct pages.
Good luck!
-
One of the previous developers left a hole that caused this issue. The system shares code between sites.
-
Andrew,
The links were removed from the offending sites, but if I understand the gist of your suggestion, Google won't remove them as quickly if they are no longer linked. And yes, I am using canonical tags. So I should create a sitemap with the previous links and, once Google follows these links to the main site, remove the sitemap. Is that your recommendation?
I suppose I can try this first before filing a request to remove the entire site.
-
Ah, I thought he was saying the dupe content still exists but no more duplication is taking place after the fix. That's where I was going wrong then, lol.
-
As long as the duplicate content pages no longer exist and you've set up the 301 redirects properly, this shouldn't be a long-term problem. It can sometimes take Google a while to crawl through thousands of pages and index the correct ones. You might want to include these pages in a Sitemap to speed up the process, particularly if there are no longer any links to these pages from anywhere else. Are you using canonical tags? They might also help point Google in the right direction.
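If it helps, a Sitemap listing the old URLs could be generated with something like the sketch below. The `build_sitemap` name and the example URLs are hypothetical; the only fixed piece is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url  # ElementTree escapes characters such as & on output
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical old URLs that now redirect to the main site.
old_urls = [
    "http://www.small-site-a.example/shop/item?id=38&ref=home",
    "http://www.small-site-a.example/shop/category/rings",
]
sitemap_xml = build_sitemap(old_urls)
```

Because every URL in such a Sitemap redirects, Google may warn that the Sitemap contains redirecting URLs, as reported elsewhere in this thread. That is expected for a temporary Sitemap whose only purpose is to get the old URLs recrawled.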
I don't think a no-cache meta tag would help. For Google to see the tag, the page would have to be crawled, and by that point Google should follow the 301 and cache the destination page instead.
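To confirm the redirects are set up properly, you could spot-check a few old URLs with a rough script like this one. Both function names and the host name in the comment are made up for illustration, and it only looks at the first response rather than following a redirect chain.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(url):
    """Send one HEAD request and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, main_host):
    """True only for a 301 whose Location is an absolute URL on main_host."""
    if status != 301 or not location:
        return False
    # A relative Location has no hostname and is treated as a failure here.
    return urlsplit(location).hostname == main_host


# Example use (needs network access; the host name is hypothetical):
# status, location = fetch_status_and_location("http://www.small-site-a.example/shop/item")
# print(is_permanent_redirect_to(status, location, "www.main-store.example"))
```

A 302 or a redirect to the wrong host would show up as `False`, which is worth ruling out before waiting on Google to recrawl.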
Hope this helps! Let me know how the situation progresses.
Andrew
-
Do you want the smaller sites to still exist? If they don't matter at all, then you could always take them offline, though that's not recommended for obvious reasons (but it would get them out of the index fairly quickly).
If they still need to exist then we're just back to the same thing, changing the content on them. If the problem has been fixed to stop further duplication then that's fine... you could limit the damage by having all of those smaller sites be dupes of each other but not of the main site by rewriting the smaller ones with one lot of content, or the main one. At least that way they will only be competing with each other and not the main site any more.
Or have I still got the wrong end of the stick?
-
I am referring to an e-commerce site, so yes, it's dynamic. The hole has been plugged (so to speak), but the content still exists in the Google cache.
-
Ah I see, so it's a CMS which pumps out content then?
But it pumps it to other sites?
-
Steve, maybe I haven't explained the issue in enough detail. The duplicate content stems from a technical problem with the site that caused content to be duplicated when it should not have been. It's not a matter of rewriting content. My issue is purging this content from the other domains so that the main domain can be indexed with it.
-
You could always just rewrite the content so it's not duplicate. That way you get to keep the pages cached and maybe focus on some different but still targeted long-tail traffic... turn a negative into a positive. I accept thousands of pages is a lot of work, but there are a million and one online copywriters who are pretty good (and cheap) that you could assign projects to. Google "copywriters for hire" or "freelance copywriters"... you could have it done in no time and not spend that much.