What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which now redirects to the offending site), and the hacked site is ranking for our brand name. Our homepage has lost all the rankings it had (our category and product pages seem fine) and has essentially disappeared from the results.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and is what we use first (plus I have a bookmarklet for it to make it faster and easier to use).
We also use Linda's method of searching for a bit of content in quotes. It's the easiest way to show an ecommerce client how much work they're going to need: paste three product descriptions into Google, watch the magic, and explain that the same would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
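For anyone who wants to speed up this kind of spot check, here is a minimal sketch of the quoted-search idea in Python. It only builds the search URL to open in a browser; it does not query Google or parse results. The function names (`pick_chunk`, `exact_match_search_url`) and the word-count thresholds are assumptions made up for illustration, not part of any tool mentioned in this thread.

```python
import re
from urllib.parse import quote_plus


def pick_chunk(text, min_words=8, max_words=12):
    """Pick a short run of words from the first sufficiently long sentence.

    Hypothetical helper: the 8 and 12 word thresholds are arbitrary defaults.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        words = sentence.split()
        if len(words) >= min_words:
            # Trim trailing punctuation so the quoted phrase stays clean.
            return " ".join(words[:max_words]).rstrip(".,;:!?")
    # Fall back to the first few words if no sentence is long enough.
    return " ".join(text.split()[:max_words])


def exact_match_search_url(chunk):
    """Wrap the chunk in double quotes and build a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{chunk}"')


if __name__ == "__main__":
    post = (
        "Short intro. This is a considerably longer sentence that contains "
        "more than enough words to search. End."
    )
    print(exact_match_search_url(pick_chunk(post)))
```

Paste the printed URL into a browser and look for results on domains other than your own. As noted above, this is a spot check, not a comprehensive scan.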
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a WordPress plugin that uses the Copyscape API, and I just check my main content every month or so with a simple click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.