What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site) but the hacked site is ranking for our brand name. Our homepage has lost all rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and what we use first (plus I have a bookmarklet for it to make it faster & easy to use.)
Also use Linda's method of taking a bit of content in quotes. Easiest way to show an ecommerce client how much work they're going to require - take three product descriptions into Google, watch the magic, and explain that would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a wordpress plugin that uses the copyscape API and just check my main content every month or so with a simple click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unnatural inbound links message from Google Webmaster Tools!
Hi Everyone, I just got this message from GWT(image below) This is probably a penguin Penalty. What is clear is I have to find the best and most efficient way to tackle this issue. We will probably lose tons of traffic in the next couple of weeks so I would like to get the best suggestions and maybe a guideline on how to do this in the most effective way! Thank you! 1a0X2M2a1h0A
White Hat / Black Hat SEO | | Ideas-Money-Art0 -
Content optimized for old keywords and G Updates
Hi, We've got some old content, about 50 pages worth in an Ecommerce site, that is optimized for keywords that aren't the subject of the page - these keywords occur about 8 times (2 keywords per page) in the old content. We are going through these 50 pages and changing the title, H1, and meta description tag to match the exact subject of the page - so that we will increase in rankings again - the updates have been lowering our rankings. Do we need to completely rewrite the content for these 50 pages, or can we just sprinkle it with any needed additions of the one keyword that is the subject of the page? The reason I'm asking is that our rankings keep dropping and these 50 pages seem to be part of the problem. We're in the process of updating these 50 pages Thanks.
White Hat / Black Hat SEO | | BobGW0 -
Negative SEO and when to use to Dissavow tool?
Hi guys I was hoping someone could help me on a problem that has arisen on the site I look after. This is my first SEO job and I’ve had it about 6 months now. I think I’ve been doing the right things so far building quality links from reputable sites with good DA and working with bloggers to push our products as well as only signing up to directories in our niche. So our backlink profile is very specific with few spammy links. Over the last week however we have received a huge increase in backlinks which has almost doubled our linking domains total. I’ve checked the links out from webmaster tools and they are mainly directories or webstat websites like the ones below | siteinfo.org.uk deperu.com alestat.com domaintools.com detroitwebdirectory.com ukdata.com stuffgate.com | We’ve also just launched a new initiative where we will be producing totally new and good quality content 4-5 times a week and many of these new links are pointing to that page which looks very suspicious to me. Does this look like negative Seo to anyone? I’ve read a lot about the disavow tool and it seems people’s opinions are split on when to use it so I was wondering if anyone had any advice on whether to use it or not? It’s easy for me to identify what these new links are, yet some of them have decent DA so will they do any harm anyway? I’ve also checked the referring anchors on Ahrefs and now over 50% of my anchor term cloud are totally unrelated terms to my site and this has happened over the last week which also worries me. I haven’t seen any negative impact on rankings yet but if this carries on it will destroy my link profile. So would it be wise to disavow all these links as they come through or wait to see if they actually have an impact? It should be obvious to Google that there has been a huge spike in links so then the question is would they be ignored or will I be penalised. Any ideas? Thanks in advance Richard
White Hat / Black Hat SEO | | Rich_9950 -
Schema.org tricking and duplicate content across domains
I've found the following abuse, and Im curious what could I do about it. Basically the scheme is: own some content only once (pictures, description, reviews etc) use different domain names (no problem if you use the same IP or IP-C address) have a different layout (this is basically the key) use schema.org tricking, meaning show (the very same) reviews on different scale, show a little bit less reviews on one site than on an another Quick example: http://bit.ly/18rKd2Q
White Hat / Black Hat SEO | | Sved
#2: budapesthotelstart.com/budapest-hotels/hotel-erkel/szalloda-attekintes.hu.html (217.113.62.21), 328 reviews, 8.6 / 10
#6: szallasvadasz.hu/hotel-erkel/ (217.113.62.201), 323 reviews, 4.29 / 5
#7: xn--szlls-gyula-l7ac.hu/szallodak/erkel-hotel/ (217.113.62.201), no reviews shown It turns out that this tactic even without the 4th step can be quite beneficial to rank with several domains. Here is a little investigation I've done (not really extensive, took around 1 and a half hour, but quite shocking nonetheless):
https://docs.google.com/spreadsheet/ccc?key=0Aqbt1cVFlhXbdENGenFsME5vSldldTl3WWh4cVVHQXc#gid=0 Kaspar Szymanski from Google Webspam team said that they have looked into it, and will do something, but honestly I don't know whether I could believe it or not. What do you suggest? should I leave it, and try to copy this tactic to rank with the very same content multiple times? should I deliberately cheat with markups? should I play nice and hope that these guys sooner or later will be dealt with? (honestly can't see this one working out) should I write a case study for this, so maybe if the tactics get bigger attention, then google will deal with it? Does anybody could push this towards Matt Cutts, or anybody else who is responsible for these things?0 -
Competitor using "unatural inbound links" not penalized??!
Since Google's latest updates, I think it would be safe to say that building links is harder. But i also read that Google applies their latest guidelines retro-actively. In other words, if you have built your ilnking profile on a lot of unnatural links, with spammy anchor text, you will get noticed and penalized. In the past, I used to use SEO friendly directories and "suggest URL's" to build back links, with keyword/phrase anchor text. But I thought that this technique was frowned upon by Google these days. So, what is safe to do? Why is Google not penalizing the competitor? And bottom line what is considered to be "unnatural link building" ?
White Hat / Black Hat SEO | | bjs20101 -
Moving content to a clean URL
Greetings My site was seriously punished in the recent penguin update. I foolishly got some bad out sourced spammy links built and I am now paying for it 😞 I am now thinking it best to start fresh on a new url, but I am wondering if I can use the content from the flagged site on the new url. Would this be flagged as duplicate content, even if i took the old site down? your help is greatly appreciated Silas
White Hat / Black Hat SEO | | Silasrose0 -
Duplicate content showing on local pages
I have several pages which are showing duplicate content on my site for web design. As its a very competitive market I had create some local pages so I rank high if someone is searching locally i.e web design birmingham, web design tamworth etc.. http://www.cocoonfxmedia.co.uk/web-design.html http://www.cocoonfxmedia.co.uk/web-design-tamworth.html http://www.cocoonfxmedia.co.uk/web-design-lichfield.html I am trying to work out what is the best way reduce the duplicate content. What would be the best way to remove the duplicate content? 1. 301 redirect (will I lose the existing page) to my main web design page with the geographic areas mentioned. 2. Re write the wording on each page and make it unique? Any assistance is much appreciated.
White Hat / Black Hat SEO | | Cocoonfxmedia0 -
Ways to find private - non-indexed forums in a niche
I would wondering if there were ways to find non-indexed content in private forums/discussion boards. Is there a scalable 'foot-print' that suggests the forum has a private section?
White Hat / Black Hat SEO | | ilyaelbert0