Purchasing duplicate content
-
Morning all,
I have a client who is planning to expand their product range (online dictionary sites) into new markets and is considering acquiring data sets from low-ranked competitors to supplement their own original data. The content sets are quite large, so a very high percentage of the new site (hosted on a new subdomain) would be made up of duplicate content. Just to clarify, the competitor's content would stay online as well.
I need to lay out the pros and cons of this approach so that they can move forward knowing the full facts. As I see it, this approach would mean forgoing rankings for most of the site, and each page would need a heavy dose of original content built around the purchased data to compensate. My main concern is that launching with this level of duplicate data would end up damaging the authority of the site and subsequently the overall domain.
I'd love to hear your thoughts!
-
Thanks for the great response, some really useful thoughts.
To address your final point, the site is considerably stronger than the content creator's, so it's reassuring to hear that this could be the case. Of course, we'll be recommending that as much of the data as possible is curated and that the pages are improved with original content.
-
Wow, this is a loaded question. The way I see it we can break this up into two parts.
First, subdomains vs. domains vs. subpages. There has been a lot of discussion about which structure is the most SEO friendly. To keep it really simple: if you're concerned about SEO, a subpage structure is going to be the most beneficial. If you create a separate domain, that will be duplicate content, and it does impact rankings. Subdomains are a little more complex, and I don't recommend them for SEO. In some cases Google views subdomains as spam (think of all the PBNs created with blogspot.com), and in other cases it treats a subdomain as a separate website. By structuring something as a subdomain you're indicating that the content is different enough from the main content of the root domain that you don't feel it should be included together. An example of this being used appropriately in the wild might be different language versions of a website, which especially makes sense in countries where the TLD doesn't map to a single language (like Switzerland - they have four national languages).
Next, the concept of duplicate content is different depending on whether it's duplicated internally or externally. It's common for websites to have a certain amount of duplicate or common content within their own website. The number that has been repeated for years as a "safe" threshold is 30%, a figure widely attributed to Matt Cutts from his time at Google. I use siteliner.com to discover how much common content has been replicated internally. Externally, if you have the same content as another website, this can pretty dramatically impact your rankings. Google does a decent job of assigning content to the correct website (who had it first, etc.), but they have a long way to go.
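If you want a rough number for how much of a page overlaps with another page before you commit, something like the sketch below can help. To be clear, this is not how Google or Siteliner measure duplication (I don't know their methods). It just compares overlapping runs of words ("shingles"), and the function names and the default run length of 5 words are my own assumptions.

```python
from __future__ import annotations


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word runs ("shingles") found in text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single run.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_fraction(page: str, other: str, size: int = 5) -> float:
    """Fraction (0.0 to 1.0) of page's shingles that also appear in other."""
    page_shingles = shingles(page, size)
    if not page_shingles:
        return 0.0
    return len(page_shingles & shingles(other, size)) / len(page_shingles)
```

Run it with one of your own entries as `page` and the purchased entry as `other`. A result close to 1.0 means nearly everything on your page also exists on the other page, which is a reasonable signal that the page needs original content before launch.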
If you could assimilate the new content and have the pages redirected on a 1:1 basis to the new location then it's probably safe enough to do, and hopefully you will have it structured in a way that makes it useful to users. If you can't perform the redirect, I think you're more likely to struggle with achieving SEO goals for those new pages. In that case, take the time to set realistic expectations and track something like user engagement between new and old content so you have a realistic understanding of your success and challenges.
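For the 1:1 redirect case, the main thing to verify is that every old URL gets its own target and no two old pages collapse onto the same new page. Here is a minimal sketch of that check. The function name and the rule "keep the same path under the new host" are assumptions for illustration, so adjust the mapping rule to your real URL structure.

```python
from urllib.parse import urlsplit


def build_redirect_map(old_urls, new_base):
    """Map each old URL to the same path under new_base, one target per source.

    Raises ValueError if two old URLs would land on the same new URL,
    because the mapping would then no longer be 1:1.
    """
    redirects = {}
    seen_targets = {}
    for old in old_urls:
        # Keep only the path; query strings and fragments are dropped.
        path = urlsplit(old).path or "/"
        target = new_base.rstrip("/") + path
        if target in seen_targets:
            raise ValueError(
                f"{old} and {seen_targets[target]} both map to {target}"
            )
        redirects[old] = target
        seen_targets[target] = old
    return redirects
```

The resulting dictionary can be exported into whatever redirect format your server uses. Because query strings are dropped, two old URLs that differ only by a parameter will raise an error, which is deliberate: you then decide by hand which one is the real page.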
-
I would be thinking about these topics...
** How many other companies are purchasing or have purchased this data? Is it out there on lots of sites, and is the number growing?
** Since this is a low-ranking competitor, how much additional money would be required to simply buy the entire company (provided that the data is not already out there on a ton of other websites)?
** Rather than purchasing this content, what would be the cost of original authorship for just those words that produce the bulk of the traffic? Certainly 10% of the content produces over 50% of the traffic on most reference sites.
** Knowing that in most duplicate content situations a significantly stronger site will crush the same content on the original publisher... where do I sit in this comparison of power?
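On the third point, you can check how concentrated the traffic really is before pricing up original authorship. A quick sketch, assuming you can export visits per entry from analytics as a simple mapping (the function name and the data shape are mine, not from any particular tool):

```python
from __future__ import annotations


def top_entries_for_share(visits: dict[str, int], share: float = 0.5) -> list[str]:
    """Return the fewest entries, busiest first, whose visits reach `share` of the total."""
    total = sum(visits.values())
    if total <= 0:
        return []
    chosen = []
    running = 0
    # Walk entries from most to least visited until the target share is covered.
    for entry, count in sorted(visits.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * total:
            break
        chosen.append(entry)
        running += count
    return chosen
```

Divide the length of the returned list by the total number of entries to see what percentage of the content you would have to write yourself to cover half of the traffic.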