Duplicate content across multiple domains
-
I have come across a situation where we have discovered duplicate content between multiple domains. We have access to each domain and have recently within the past 2 weeks added a 301 redirect to redirect each page dynamically to the proper page on the desired domain.
My question relates to the removal of these pages. There are thousands of these duplicate pages.
I have gone back and looked at a number of these cached pages in google and have found that the cached pages that are roughly 30 days old or older. Will these pages ever get removed from google's index? Will the 301 redirect even be read by google to be redirected to the proper domain and page? If so when will that happen?
Are we better off submitting a full site removal request of the sites that carries the duplicate content at this point? These smaller sites do bring traffic on their own but I'd rather not wait 3 months for the content to be removed since my assumption is that this content is competing with the main site.
I suppose another option would be to include no cache meta tag for these pages.
Any thoughts or comments would be appreciated.
-
I went ahead and added the links to the sitemap, however when google crawled the links I receieve this message.
When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.
However I do not understand how adding the redirected links to the sitemap will remove the old links.
-
Worth a shot. Crawl bots usually work by following links from page to the next. If links links no longer exist to those pages, then Google will have a tough time finding those pages and de-indexing them in favor or the correct pages.
Good luck!
-
One of the previous developers left a hole that caused this issue. The system shares code between sites.
-
Andrew,
The links were removed from the offending sites, but If I understand the gist of your suggestion Google won't remove them as quickly if they are no longer linked and yes I am using canonical tags. So I should create a sitemap with the previous links and once Google follows these links to the main site remove the sitemap. Is that your recommendation?
I suppose I can try this first before filing a request to remove the entire site.
-
Ah, I thought he was saying the dupe content does still exists but no more duplication is taking place after the fix. That's where I was going wrong then lol.
-
As long as the duplicate content pages no longer exist and you've set up the 301 redirects properly, this shouldn't be a long term problem. It can sometimes take Google a while to crawl through 1000's of pages to index the correct pages. You might want to include these pages in a Sitemap to speed up the process, particularly if there are no longer any links to these pages from anywhere else. Are you using canonical tags? They might also help point Google in the right direction.
I don't think a no cache meta tag would help. This is assuming the page will be crawled and by that point Google should follow the 301 and cace that page.
Hope this helps! Let me know how the situation progresses.
Andrew
-
Do you want the smaller sites to still exist? If they don't matter at all then you could always take them offline though that's not recommended for obvious reasons (but it would get them out of the index fairly quick).
If they still need to exist then we're just back to the same thing, changing the content on them. If the problem has been fixed to stop further duplication then that's fine... you could limit the damage by having all of those smaller sites be dupes of each other but not of the main site by rewriting the smaller ones with one lot of content, or the main one. At least that way they will only be competing with each other and not the main site any more.
Or have I still got the wrong end of the stick?
-
I am referring to an e-commerce site, so yes its dynamic. The hole has been plugged (so to speak) but the content still exists in the google cache.
-
Ah I see, so it's a CMS which pumps out content then?
But it pumps it to other sites?
-
Steve, Maybe I haven't explained the issue in enough detail. The duplicate content issue is related to a technical issue with the site causing the content to be duplicated when it should not have been. Its not a matter of rewriting content. My issue deals with purging this content from these other domains so that the main domain can be indexed with this content.
-
You could always just rewrite the content so it's not duplicate, that way you get to keep them cached and maybe focus on some different but still targeted long tail traffic... turn a negative into a positive. I accept thousands of pages is a lot of work, but there's a million and one online copywriters who are pretty good (and cheap) that you could assign projects to for it. Google copywriters for hire or freelance copywriters... could have it done in no time and not spend that much
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Page Content Issue
Hello, I recently solved www / no www duplicate issue for my website, but now I am in trouble with duplicate content again. This time something that I cannot understand happens: In Crawl Issues Report, I received Duplicate Page Content for http://yourappliancerepairla.com (DA 19) http://yourappliancerepairla.com/index.html (DA 1) Could you please help me figure out what is happenning here? By default, index.html is being loaded, but this is the only index.html I have in the folder. And it looks like the crawler sees two different pages with different DA... What should I do to handle this issue?
Technical SEO | | kirupa0 -
How to deal with 80 websites and duplicated content
Consider the following: A client of ours has a Job boards website. They then have 80 domains all in different job sectors. They pull in the jobs based on the sectors they were tagged in on the back end. Everything is identical across these websites apart from the brand name and some content. whats the best way to deal with this?
Technical SEO | | jasondexter0 -
Duplicate content. Wordpress and Website
Hi All, Will Google punish me for having duplicate blog posts on my website's blog and wordpress? Thanks
Technical SEO | | Mike.NW0 -
Duplicate Page Content
Hi, I just had my site crawled by the seomoz robot and it came back with some errors. Basically it seems the categories and dates are not crawling directly. I'm a SEO newbie here Below is a capture of the video of what I am talking about. Any ideas on how to fix this? Hkpekchp
Technical SEO | | mcardenal0 -
Duplicate content in Magento
Hi all We got some serious issues with duplicate content on a Magento site that we are marketing. For example: http://www.citcop.se/varmepumpar-luft-luft/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic-nordic-ce9nke-5-0kw All of the above seem to work just fine as it is now but since they are excatly the same product they should ofcourse do a 301 redirect to the main page. Any ideas on how to sort this out in Magnto without having to resort to manual work in .htaccess? Have a great day Fredrik
Technical SEO | | Resultify0 -
Duplicate Content Issue
Hi Everyone, I ran into a problem I didn't know I had (Thanks to the seomoz tool) regarding duplicate content. my site is oxford ms homes.net and when I built the site, the web developer used php to build it. After he was done I saw that the URL's looking like this "/blake_listings.php?page=0" and I wanted them like this "/blakes-listings" He changed them with no problem and he did the same with all 300 pages or so that I have on the site. I just found using the crawl diagnostics tool that I have like 3,000 duplicate content issues. Is there an easy fix to this at all or does he have to go in and 301 Redirect EVERY SINGLE URL? Thanks for any help you can give.
Technical SEO | | blake-766240 -
Duplicate content, Original source?
Hi there, say i have two websites with identicle content. website a had content on before website b - so will be seen as the original source? If the content was intended for website b, would taking it off a then make the orinal source to google then go to website b? I want website b to get the value of the content but it was put on website a first - would taking it off website a then give website b the full power of the content? Any help of advice much appreciated. Kind Regards,
Technical SEO | | pauledwards0 -
Duplicate Content and Canonical use
We have a pagination issue, which the developers seem reluctant (or incapable) to fix whereby we have 3 of the same page (slightly differing URLs) coming up in different pages in the archived article index. The indexing convention was very poorly thought up by the developers and has left us with the same article on, for example, page 1, 2 and 3 of the article index, hence the duplications. Is this a clear cut case of using a canonical tag? Quite concerned this is going to have a negative impact on ranking, of course. Cheers Martin
Technical SEO | | Martin_S0