How do I combat content theft?
-
A new site has popped up that completely replicates a site owned by my client. This site is literally a copycat: it scraped all the content and copied the design down to the colors.
I've already reported the site to the hosting provider and filed a spam report with Google. I noticed that the author changed some of the text and the internal links so that they don't link to our site anymore. Some of these were missed.
I'm also going to take a couple of preventative actions, such as changing some rules in .htaccess, but that doesn't help me now; it only helps if this happens again in the future.
I'm wondering what else I can or should be doing?
-
One of our sites has already been scraped quite thoroughly, and because we use absolute linking throughout the site, we are getting a few links from the sites in question. I don't anticipate the links being worth a great deal, but they may be helpful.
Provided that you're using absolute linking and your content is getting crawled first, it shouldn't be a problem.
People will always copy good content, and it probably takes less time for them to scrape it and set a site up than it does for you to do something about it.
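To see whether a scraped copy really does carry links back to the original, something like the following could be run against the copied page's HTML. It is only a rough sketch: `links_back_to` and `LinkCollector` are hypothetical helper names, `example.com` stands in for the real domain, and fetching the page is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html, our_domain):
    """Return the links in `html` whose host is `our_domain` or a subdomain of it.

    Relative links have no host, so they are never counted; on a scraped
    copy they point at the scraper's own site.
    """
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == our_domain or host.endswith("." + our_domain):
            matches.append(href)
    return matches


# Hypothetical snippet of a scraped page, for illustration only.
scraped = (
    '<a href="https://www.example.com/a">kept absolute link</a>'
    '<a href="/b">relative link</a>'
    '<a href="http://copycat.test/c">rewritten link</a>'
)
found = links_back_to(scraped, "example.com")
```

Only the absolute link survives as a link to the original site, which is why relative internal links give a scraper nothing to clean up.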
-
Hi Pashmina!
The best recommendation would be to initiate a DMCA takedown procedure. Essentially, that means filing a notice with their host stating that the site is in violation of the Digital Millennium Copyright Act, which should get the host to take action more readily than simply contacting them would. It says "we're serious and will pursue this". The same notice should go to the site owner if you can get valid contact info for them. You can read more about this and see a sample notice here: http://ipwatchdog.com/2009/07/06/sample-dmca-take-down-letter/id=4501/
Note that the infringing party does not have to have made a 100% complete copy; they only have to go beyond "fair use" of a minor portion of the content. If we're talking about major swaths of content plus look and feel, it's pretty cut and dried.
Beyond that, it's really best to contact an attorney who specializes in digital law.
For individual articles, that's generally much more challenging because there's so much scraping going on. My take, which is now common thinking among some notable people in the industry, is that it's not worth going after sites that scrape from multiple sources. It's usually impossible to find valid contact info for them, and Google does a "fair" job of discerning the original source.
To help with that, Google's got their new article origin tag, but the best thing to do is to ensure your content links to other pages within the site (most scrapers fail to strip those links out) and to include a standard paragraph at the close of every page's content stating that it is original information located on Domain.com (without making it a link, so it's harder for scrapers to strip out). Even better, include the company name as well.
And finally, theory has it that scraper links might actually not be a bad thing when scrapers leave them in, since a lot of scraper content actually does rank.
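As a rough illustration of the internal-link and attribution advice above, here is a minimal sketch of a publishing step that rewrites relative links as absolute ones and appends a plain-text attribution paragraph. The function names, the `example.com` URLs, the company name, and the attribution wording are all placeholders, and the regex only handles double-quoted `href` attributes, so treat it as a starting point, not a finished tool.

```python
import re
from urllib.parse import urljoin

# Placeholder attribution text: plain text on purpose, not a link.
ATTRIBUTION = (
    "This article is original content from Example Co., "
    "first published on www.example.com."
)

# Matches double-quoted href attributes only (an assumption of this sketch).
HREF_RE = re.compile(r'href="([^"]*)"')


def absolutize_links(html, page_url):
    """Rewrite every href so it is absolute, resolved against page_url."""

    def repl(match):
        return 'href="{}"'.format(urljoin(page_url, match.group(1)))

    return HREF_RE.sub(repl, html)


def add_attribution(html, text=ATTRIBUTION):
    """Append the attribution as an ordinary paragraph at the end of the content."""
    return html + "\n<p>" + text + "</p>"


def prepare_for_publishing(html, page_url):
    """Apply both steps to a page's content before it goes live."""
    return add_attribution(absolutize_links(html, page_url))


page = '<p>See our <a href="/guide/">guide</a>.</p>'
published = prepare_for_publishing(page, "https://www.example.com/blog/post/")
```

A scraper that copies the published HTML verbatim then carries both the absolute link and the attribution sentence along with it.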
Related Questions
-
Content spamming risk
Some websites that provide information about apps in a particular niche are publishing the same content we use in our app's description when they refer to our app. Would that be treated as spam? Our website is getting a backlink from one such website, so are we at any sort of risk? What should we do about it without losing that backlink?
Technical SEO | | Reema240 -
Duplicate content or an update ???
Buying guide and product category page competing for the same keyword? We have a “nuts and bolts” website selling basic stuff. Imagine selling simple nuts, bolts and washers (the little ring that goes in between) in different metals. Imagine a website with a very wide and deep line of these simple products. For long-tail keywords we rank well (example: 0.25 inch bolts). For the keyword “Nuts bolts”, our main category page used to rank from low on the 1st page to the 2nd page, up against the big guys (Amazon, Walmart, Target, Costco, and some drug store that may have a mixed pack of nuts and bolts, yet Google doesn’t see the difference and lists 2 pages each for these guys). But then in mid-February there was an update, and suddenly our “Buying guide for nuts and bolts” ranked higher and started to compete with our own product category page. That was never our intention. These two pages now compete for ranking on page 4. Clearly there were more words on the buying guide page, but no changes had been made to it for months or years. To make up for it, some more words were added to the category page, but of course there are only so many ways you can phrase words about “nuts and bolts” without it sounding like duplicated or rewritten text. So what do I do now? Clearly the product category page is the one we would like to rank highest, with the guide a close 2nd. Most customers don’t need the buying guide, but it is good to have and great support, as we got a lot of good comments from customers who read it. We made a link to the buying guide from the category page and vice versa. The category page got an embedded video. Moz lists the page authority as 16 for the category page and 1 for the buying guide, but clearly Google sees it differently. We already tried changing the meta title and description a little, but that is hard to do if the words “Nuts Bolts” have to appear in the description so that people know what to expect.
We could just add a “do not index” to the buying guide, but that is not a good long-term solution. Unfortunately, I am out of ideas at this point. Any good suggestions? Thanks, Kim
Technical SEO | | KimX0 -
Cloud Hosting and Duplicate content
Hi, I have an ecommerce client who has all their images cloud hosted (Amazon CDN) to speed up the site. Somehow, maybe because they pinned the images on Pinterest, the CDN got indexed, and now about 50% of the site seems to be duplicated (about 2500 pages, e.g. http://d2rf6flfy1l.cloudfront.net..). Is this a problem with duplicate content? How come Moz doesn't show it as crawl errors? Why is this not a problem that loads of people have? I only found a couple of mentions of such a problem when I googled it. Any suggestions would be gratefully received!
Technical SEO | | henya0 -
Duplicate Content Problems
Hi, I am new to the seomoz community, though I have been browsing for a while now. I put my new website into the seomoz dashboard, and out of 250 crawls I have 120 errors! The main problem is duplicate content. We are a website that finds free content sources for popular songs/artists. While SEO is not our main focus for driving traffic, I wanted to spend a little time making sure our site is up to standards. With that said, you can see the issue when two songs by an artist are loaded: for http://viromusic.com/song/125642 & http://viromusic.com/song/5433265, seomoz is saying that it is duplicate content even though they are two completely different songs. I am not exactly sure what to do about this situation. We will be adding more content to our site, such as a blog, artist biographies and commenting; maybe this will help? Although if someone was playing multiple Bob Marley songs, the biography that is loaded would also be the same for both songs. Also, when a playlist is loaded (http://viromusic.com/playlist/sldvjg), on the larger playlists I'm getting an error for too many links on the page (some of the playlists have over 100 songs). Any suggestions? Thanks in advance, and any tips or suggestions for my new site would be greatly appreciated!
Technical SEO | | mikecrib10 -
Duplicate Footer Content
A client I just took over is having some duplicate content issues. At the top of each page he has about 200 words of unique content. Below this are three big tables of text that talk about his services, history, etc. These tables are pulled into the middle of every page using PHP, so he has the exact same three big tables of text across every page. What should I do to eliminate the duplicate content? I thought about removing the script and then just rewriting the table text on every page... Is there a better solution? Any ideas would be greatly appreciated. Thanks!
Technical SEO | | BigStereo0 -
Does turning website content into PDFs for document sharing sites cause duplicate content?
The website content is 9 tutorials published to unique URLs, with a contents page linking to each lesson. If I make a PDF version for distribution on document sharing websites, will it create a duplicate content issue? The objective is to get a half-decent link and traffic to supplementary opt-in downloads.
Technical SEO | | designquotes0 -
Noindex duplicate content penalty?
We know that Google now gives a penalty to a whole site if it finds content it doesn't like or duplicate content, but has anyone experienced a penalty from having duplicate content on their site which they have added noindex to? Would Google still apply the penalty to the overall quality of the site even though it has been told to basically ignore the duplicate bit? The reason for asking is that I am looking to add a forum to one of my websites, and no one likes a new forum. I have a script which can populate it with thousands of questions and answers pulled directly from Yahoo Answers. Obviously the forum will be 100% duplicate content, but I do not want it to rank for anything anyway, so if I noindex the forum pages, hopefully it will not damage the rest of the site. In time, as the forum grows, all the duplicate posts will be deleted, but it's hard to get people to use an empty forum, so I need to 'trick' them into thinking the section is very busy.
Technical SEO | | Grumpy_Carl0 -
About duplicate content
Hi, I'm a new guy around here, and I'm having this problem with my website. Using the SEOmoz tools I ran a campaign for my website, and in the results I get too many errors for duplicate content, for example http://www.mysite/blue/ and http://www.mysite/blue/index.html. So my question is: what is the best way to resolve this problem, a 301 or the rel canonical tag? Which URL will be considered the main URL? Thanks for your help.
Technical SEO | | NorbertoMM0