Purchasing duplicate content
-
Morning all,
I have a client who is planning to expand their product range (online dictionary sites) to new markets and are considering the acquisition of data sets from low ranked competitors to supplement their own original data. They are quite large content sets and would mean a very high percentage of the site (hosted on a new sub domain) would be made up of duplicate content. Just to clarify, the competitor's content would stay online as well.
I need to lay out the pros and cons of taking this approach so that they can move forward knowing the full facts. As I see it, this approach would mean forgoing ranking for most of the site and would need a heavy dose of original content as well as supplementing the data on page to build around the data. My main concern would be that launching with this level of duplicate data would end up damaging the authority of the site and subsequently the overall domain.
I'd love to hear your thoughts!
-
Thanks for the great response, some really useful thoughts.
To address your final point, the site is considerably stronger than the content creator's so it's reassuring to hear that this could be the case. Of course we'll be recommending that as much of the data as possible is curated and that the pages are improved with original content/
-
Wow, this is a loaded question. The way I see it we can break this up into two parts.
First, subdomains vs. domains vs. subpages. There has been a lot of discussion surrounding which structure should be used for SEO friendliness and to keep it really simple, if you're concerned about SEO then using a subpage structure is going to be the most beneficial. If you create a separate domain, that will be duplicate content and it does impact rankings. Subdomains are a little more complex, and I don't recommend them for SEO. In some cases, Google views subdomains as spam (think of all the PBNs created with blogspot.com) and in other cases it's viewed as a separate website. By structuring something as a subdomain you're indicating that the content is different enough from the main content of the root domain that you don't feel it should be included together. An example of this being used in the wild appropriately might be different language versions of a website, which especially makes sense in countries where the TLD doesn't represent multiple languages (like Switzerland - they have four national languages).
Next, the concept of duplicate content is different depending on whether it's duplicate internally, or duplicate externally. It's common for websites to have a certain amount of duplicate or common content within their own website. The number that has been repeated for years as a "safe" threshold is 30%, which is a stat that Matt Cutts threw out there before he retired. I use siteliner.com to discover how much common content has been replicated internally. Externally, if you have the same content as another website, this can pretty dramatically impact your rankings. Google does a decent job of assigning content to the correct website (who had it first, etc.) but they have a long way to go.
If you could assimilate the new content and have the pages redirected on a 1:1 basis to the new location then it's probably safe enough to do, and hopefully you will have it structured in a way that makes it useful to users. If you can't perform the redirect, I think you're more likely to struggle with achieving SEO goals for those new pages. In that case, take the time to set realistic expectations and track something like user engagement between new and old content so you have a realistic understanding of your success and challenges.
-
I would be thinking about these topics....
** How many other companies are purchasing or have purchased this data? Is it out there on lots of sites and the number is growing?
** Since this is a low-ranking competitor, how much additional money would be required to simply buy the entire company (provided that the data is not already out there on a ton of other websites.)
** Rather than purchasing this content, what would be the cost of original authorship for just those words that produce a big bulk of the traffic. Certainly 10% of the content produces over 50% of the traffic on most reference sites.
** With knowledge that in most duplicate content situations, a significantly stronger site will crush the same content on the original publisher.... where do I sit in this comparison of power?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Car Dealership website - Duplicate Page Content Issues
Hi, I am currently working on a large car dealership website. I have just had a Moz crawl through and its flagging a lot of duplicate page content issues, these are mostly for used car pages. How can I get round this as the site stocks many of the same car, model, colour, age, millage etc. Only unique thing about them is the reg plate. How do I get past this duplicate issue if all the info is relatively the same? Anyone experienced this issue when working on a car dealership website? Thank you.
Technical SEO | | karl621 -
Finding a specific link - Duplicating my own content
Hi Mozzers, This may be a bit of a n00b question and i feel i should know the answer but alas, here i am asking. I have a page www.website.co.uk/page/ and im getting a duplicate page report of www.website.co.uk/Page/ i know this is because somewhere on my website a link will exists using the capitalised version. I have tried everything i can think of to find it but with no luck, any little tricks? I could always rewrite the urls to lowercase, but I have downloadable software etc also on the website that i dont want to take the capitals out of. So the best solution seems to be finding the link and remove it. Most link checkers I use treat the capitalised and non capitalised as the same thing so really arent helping lol.
Technical SEO | | ATP0 -
Despite canonical duplicate content in WMT
Hi, 2 weeks ago we've made big changes in title and meta descriptions. To solve the missing title and descriptions. Also set the right canonical. Now i see that in WMT despite the canonical it shows duplicates in meta descriptions and titles. i've setup the canonical like this:
Technical SEO | | Leonie-Kramer
1. url: www.domainname.com/category/listing-family/productname
2. url: www.domainname.com/category/listing-family/productname-more-info The canonical on both pages is like this: I'm aware of creating duplicate titles and descriptions, caused by the cms we use and also caused by wrong structure of category/products (we'll solve that nest year) that's why i wanted the canonical, but now it's not going any better, did i do something wrong with the canonical?0 -
Duplicate Content
Hello guys, After fixing the rel tag on similar pages on the site I thought that duplicate content issue were resolved. I checked HTML Improvements on GWT and instead of going down as I expected, it went up. The duplicate issues affect identical product pages which differ from each other just for one detail, let's say length or colour. I could write different meta tags as the duplicate is the meta description, and I did it for some products but still didn't have any effects and they are still showing as duplicates. What would the problem be? Cheers
Technical SEO | | PremioOscar0 -
Duplicate Content - Captcha on Contact Form
I am going to be working on a site where the contact form is being flagged as duplicate content the URL is the same apart from having: /contact/10119 contact/31010 ...at the end of it. The only difference in the content of the page that I can see is the Captcha numbers? Is there a way to overcome this to stop duplicate content? Thanks in advance
Technical SEO | | J_Sinclair0 -
Duplicate homepage content
Hi, I recently did a site crawl using seomoz crawl test My homepage seems to have 3 cases of duplicate content.. These are the urls www.example.ie/ www.example..ie/%5B%7E19%7E%5D www.example..ie/index.htm Does anyone have any advise on this? What impact does this have on my seo?
Technical SEO | | Socialdude0 -
Are recipes excluded from duplicate content?
Does anyone know how recipes are treated by search engines? For example, I know press releases are expected to have lots of duplicates out there so they aren't penalized. Does anyone know if recipes are treated the same way. For example, if you Google "three cheese beef pasta shells" you get the first two results with identical content.
Technical SEO | | RiseSEO0 -
Mapping Internal Links (Which are causing duplicate content)
I'm working on a site that is throwing off a -lot- of duplicate content for its size. A lot of it appears to be coming from bad links within the site itself, which were caused when it was ported over from static HTML to Expression Engine (by someone else). I'm finding EE an incredibly frustrating platform to work with, as it appears to be directing 404's on sub-pages to the page directly above that subpage, without actually providing a 404 response. It's very weird. Does anyone have any recommendations on software to clearly map out a site's internal link structure so that I can find what bad links are pointing to the wrong pages?
Technical SEO | | BedeFahey0