What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site) but the hacked site is ranking for our brand name. Our homepage has lost all rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and what we use first (plus I have a bookmarklet for it to make it faster & easy to use.)
Also use Linda's method of taking a bit of content in quotes. Easiest way to show an ecommerce client how much work they're going to require - take three product descriptions into Google, watch the magic, and explain that would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a wordpress plugin that uses the copyscape API and just check my main content every month or so with a simple click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Technical : Duplicate content and domain name change
Hi guys, So, this is a tricky one. My server team just made quite a big mistake :We are a big We are a big magento ecommerce website, selling well, with about 6000 products. And we are about to change our domaine name for administrative reasons. Let's call the current site : current.com and the future one : future.com Right, here is the issue Connecting to the search console, I saw future.com sending 11.000 links to current.com. At the same time DA was hit by 7 points. I realized future.com was uncorrectly redirected and showed a duplicated site or current.com. We corrected this, and future.com now shows a landing page until we make the domain name change. I was wondering what is the best way to avoid the penalty now and what can be the consequences when changing domain name. Should I set an alias on search console or something ? Thanks
White Hat / Black Hat SEO | | Kepass0 -
Duplicate Content issue in Magento
I am getting duplicate content issue because of the following product URL in my Magento store. http://www.sitename.com/index.php/sports-nutritions/carbohydrates http://www.sitename.com/sports-nutritions/carbohydrates Please can someone guide me on how to solve it. Thanks Guys
White Hat / Black Hat SEO | | webteamBlackburn0 -
Duplicate content or not? If you're using abstracts from external sources you link to
I was wondering if a page (a blog post, for example) that offers links to external web pages along with abstracts from these pages would be considered duplicate content page and therefore penalized by Google. For example, I have a page that has very little original content (just two or three sentences that summarize or sometimes frame the topic) followed by five references to different external sources. Each reference contains a title, which is a link, and a short abstract, which basically is the first few sentences copied from the page it links to. So, except from a few sentences in the beginning everything is copied from other pages. Such a page would be very helpful for people interested in the topic as the sources it links to had been analyzed before, handpicked and were placed there to enhance user experience. But will this format be considered duplicate or near-duplicate content?
White Hat / Black Hat SEO | | romanbond0 -
Anyone used clicksubmit.co.uk?
As title, anyone used them? their reviews all sound really positive (if they're real). The system sounds like an auto submitting back link generator - which can't be good?
White Hat / Black Hat SEO | | FDFPres0 -
What happens when content on your website (and blog) is an exact match to multiple sites?
In general, I understand that having duplicate content on your website is a bad thing. But I see a lot of small businesses (specifically dentists in this example) who hire the same company to provide content to their site. They end up with the EXACT same content as other dentists. Here is a good example: http://www.hodnettortho.com/blog/2013/02/valentine’s-day-and-your-teeth-2/ http://www.braces2000.com/blog/2013/02/valentine’s-day-and-your-teeth-2/ http://www.gentledentalak.com/blog/2013/02/valentine’s-day-and-your-teeth/ If you google the title of that blog article you find tons of the same article all over the place. So, overall, doesn't this make the content on these blogs irrelevant? Does this hurt the SEO on these sites at all? What is the value of having completely unique content on your site/blog vs having duplicate content like this?
White Hat / Black Hat SEO | | MorganPorter0 -
How to Not Scrap Content, but still Being a Hub
Hello Seomoz members. I'm relatively new to SEO, so please forgive me if my questions are a little basic. One of the sites I manage is GoldSilver.com. We sell gold and silver coins and bars, but we also have a very important news aspect to our site. For about 2-3 years now we have been a major hub as a gold and silver news aggregator. At 1.5 years ago (before we knew much about SEO), we switched from linking to the original news site to scraping their content and putting it on our site. The chief reason for this was users would click outbound to read an article, see an ad for a competitor, then buy elsewhere. We were trying to avoid this (a relatively stupid decision with hindsight). We have realized that the Search Engines are penalizing us, which I don't blame them for, for having this scraped content on our site. So I'm trying to figure out how to move forward from here. We would like to remain a hub for news related to Gold and Silver and not be penalized by SEs, but we also need to sell bullion and would like to avoid loosing clients to competitors through ads on the news articles. One of the solutions we are thinking about is perhaps using an iFrame to display the original url, but within our experience. An example is how trap.it does this (see attached picture). This way we can still control the experience some what, but are still remaining a hub. Thoughts? Thank you, nick 3dLVv
White Hat / Black Hat SEO | | nwright0 -
Shadow Pages for Flash Content
Hello. I am curious to better understand what I've been told are "shadow pages" for Flash experiences. So for example, go here:
White Hat / Black Hat SEO | | mozcrush
http://instoresnow.walmart.com/Kraft.aspx#/home View the page as Googlebot and you'll see an HTML page. It is completely different than the Flash page. 1. Is this ok?
2. If I make my shadow page mirror the Flash page, can I put links in it that lead the user to the same places that the Flash experience does?
3. Can I put "Pinterest" Pin-able images in my shadow page?
3. Can a create a shadow page for a video that has the transcript in it? Is this the same as closed captioning? Thanks so much in advance, -GoogleCrush0 -
Possibly a dumb question - 301 from a banned domain to new domain with NEW content
I was wondering if banned domains pass any page rank, link love, etc. My domain got banned and I AM working to get it unbanned, but in the mean time, would buying a new domain, and creating NEW content that DOES adhere to the google quality guidelines, help at all? Would this force an 'auto-evaluation' or 're-evaluation' of the site by google? or would the new domain simply have ZERO effect from the 301 unless that old domain got into google's good graces again.
White Hat / Black Hat SEO | | ilyaelbert0