How do I combat content theft?
-
A new site popped up that has completely replicated a site owned by my client. This site is literally a copycat: it scraped all the content and copied the design down to the colors.
I've already reported the site to the hosting provider and filed a spam report with Google. I noticed that the author changed some of the text and the internal links so that they don't link to our site anymore, but some of these were missed.
I'm also going to take a couple of preventative actions, like changing some things in .htaccess, but that doesn't help me now; it only helps if this happens again in the future.
I'm wondering what else I can or should be doing?
-
One of our sites has already been scraped quite thoroughly, and because we use absolute linking throughout the site, we are getting a few links from the sites in question. I don't anticipate the links being worth a great deal, but they may be helpful.
Provided that you're using absolute linking and your content is getting crawled first, it shouldn't be a problem.
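If your templates currently emit relative links, here is a minimal sketch of converting them to absolute ones before publishing. It assumes double-quoted href attributes and uses a made-up base URL (www.example.com); the function name absolutize_links is hypothetical, and a real site would more likely do this in the CMS or template layer.

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href attributes only; a simplification for this sketch.
HREF_RE = re.compile(r'href="([^"]*)"')


def absolutize_links(html: str, base_url: str) -> str:
    """Rewrite each double-quoted href in html as an absolute URL."""

    def _replace(match: re.Match) -> str:
        # urljoin resolves relative paths against the page's own URL and
        # leaves links that are already absolute unchanged.
        return 'href="{}"'.format(urljoin(base_url, match.group(1)))

    return HREF_RE.sub(_replace, html)


if __name__ == "__main__":
    page = '<a href="/pricing">Pricing</a> <a href="about.html">About</a>'
    print(absolutize_links(page, "https://www.example.com/blog/post.html"))
```

A scraper that copies the markup as-is then carries links that still point at your domain.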
People will always copy good content, and it probably takes less time for them to scrape it and set up a site than it does for you to do something about it.
-
Hi Pashmina!
The best recommendation would be to initiate a DMCA takedown procedure. Essentially, that means filing a notice with their host stating that the site is in violation of the Digital Millennium Copyright Act, which should get the host to take action more readily than simply contacting them. It says "we're serious and will pursue this". The same notice should go to the site owner if you can get valid contact info for them. You can read more about this and see a sample notice here: http://ipwatchdog.com/2009/07/06/sample-dmca-take-down-letter/id=4501/
Note that the infringing party does not have to have a 100% complete copy; they only have to go beyond "fair use" of a minor portion of the content. If we're talking about major swaths of content plus the look and feel, it's pretty cut and dried.
Beyond that, it's really best to contact an attorney who specializes in digital law.
For individual articles, it's generally much more challenging because there's so much scraping going on. My take, which is now common thinking among some notable people in the industry, is that it's not worth going after sites that scrape from multiple sources. It's usually impossible to find valid contact info for them, and Google does a "fair" job of discerning the original source.
To help with that, Google's got their new article origin tag, but the best thing to do is to ensure your content links to other pages within the site (most scrapers fail to strip those links out), and to include a standard paragraph at the close of every page's content stating that the content is original information located on Domain.com (without making it a link, so it's harder for scrapers to strip out). Even better, include the company name as well.
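As a rough sketch of that closing paragraph, something like the following could run wherever your pages are generated. The function name add_attribution, the wording of the notice, and "Example Company" are all made up for illustration; Domain.com is just the placeholder from above.

```python
def add_attribution(content_html: str, domain: str, company: str) -> str:
    """Append a plain-text origin notice to the end of a page's content."""
    # The domain is deliberately written as plain text, not wrapped in an
    # anchor tag, so a scraper that strips links still carries it along.
    notice = (
        "<p>This content is original information from {} "
        "and is located on {}.</p>".format(company, domain)
    )
    return content_html.rstrip() + "\n" + notice


if __name__ == "__main__":
    article = "<p>How to choose a widget.</p>\n"
    print(add_attribution(article, "Domain.com", "Example Company"))
```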
And finally, theory has it that scraper links might not actually be a bad thing, at least from those scrapers that leave them in, since a lot of scraper content actually does rank.
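If you want to see which links a scraper left in, here is a minimal sketch that lists the anchors in a saved copy of a scraped page that still point back to your domain. It uses only the Python standard library; the names LinkCollector and links_back_to, and the example domains, are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html: str, our_domain: str) -> list:
    """Return hrefs whose host is our_domain or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    our_domain = our_domain.lower()
    matches = []
    for href in collector.hrefs:
        # Relative links have no hostname, so they are skipped here.
        host = (urlparse(href).hostname or "").lower()
        if host == our_domain or host.endswith("." + our_domain):
            matches.append(href)
    return matches


if __name__ == "__main__":
    scraped = (
        '<a href="https://www.example.com/guide">Guide</a>'
        '<a href="/local">Local</a>'
        '<a href="https://copycat.test/guide">Copy</a>'
    )
    print(links_back_to(scraped, "example.com"))
```

The same list is also handy evidence of where the content came from if you do file a takedown notice.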