What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site) but the hacked site is ranking for our brand name. Our homepage has lost all rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and what we use first (plus I have a bookmarklet for it to make it faster & easy to use.)
Also use Linda's method of taking a bit of content in quotes. Easiest way to show an ecommerce client how much work they're going to require - take three product descriptions into Google, watch the magic, and explain that would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a wordpress plugin that uses the copyscape API and just check my main content every month or so with a simple click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplication content management across a subdir based multisite where subsites are projects of the main site and naturally adopt some ideas and goals from it
Hi, I have the following problem and would like which would be the best solution for it: I have a site codex21.gal that is actually part of a subdirectories based multisite (galike.net). It has a domain mapping setup, but it is hosted on a folder of galike.net multisite (galike.net/codex21). My main site (galike.net) works as a frame-brand for a series of projects aimed to promote the cultural & natural heritage of a region in NW Spain through creative projects focused on the entertainment, tourism and educational areas. The projects themselves will be a concretion (put into practice) of the general views of the brand, that acts more like a company brand. CodeX21 is one of those projects, it has its own logo, etc, and is actually like a child brand, yet more focused on a particular theme. I don't want to hide that it makes part of the GALIKE brand (in fact, I am planning to add the Galike logo to it, and a link to the main site on the menu). I will be making other projects, each of them with their own brand, hosted in subsites (subfolders) of galike.net multisites. Not all of them might have their own TLD mapped, some could simply be www.galike.net/projectname. The project codex21.gal subsite might become galike.net/codex21 if it would be better for SEO. Now, the problem is that my subsite codex21.gal re-states some principles, concepts and goals that have been defined (in other words) in the main site. Thus, there are some ideas (such as my particular vision on the possibilities of sustainable exploitation of that heritage, concepts I have developed myself as "narrative tourism" "geographical map as a non lineal story" and so on) that need to be present here and there on the subsite, since it is also philosophy of the project. BUT it seems that Google can penalise overlapping content in subdirectories based multisites, since they can seem a collection of doorways to access the same product (*) I have considered the possibility to substitute those overlapping ideas with links to the main page of the site, thought it seems unnatural from the user point of view to be brought off the page to read a piece of info that actually makes part of the project description (every other child project of Galike might have the same problem). I have considered also taking the subsite codex21 out of the network and host it as a single site in other server, but the problem of duplicated content might persist, and anyway, I should link it to my brand Galike somewhere, because that's kind of the "production house" of it. So which would be the best (white hat) strategy, from a SEO point of view, to arrange this brand-project philosophy overlapping? (*) “All the same IP address — that’s really not a problem for us. It’s really common for sites to be on the same IP address. That’s kind of the way the internet works. A lot of CDNs (content delivery networks) use the same IP address as well for different sites, and that’s also perfectly fine. I think the bigger issue that he might be running into is that all these sites are very similar. So, from our point of view, our algorithms might look at that and say “this is kind of a collection of doorway sites” — in that essentially they’re being funnelled toward the same product. The content on the sites is probably very similar. Then, from our point of view, what might happen is we will say we’ll pick one of these pages and index that and show that in the search results. That might be one variation that we could look at. In practice that wouldn’t be so problematic because one of these sites would be showing up in the search results. On the other hand, our algorithm might also be looking at this and saying this is clearly someone trying to overdo things with a collection of doorway sites and we’ll demote all of them. So what I recommend doing here is really trying to take a step back and focus on fewer sites and making those really strong, and really good and unique. So that they have unique content, unique products that they’re selling. So then you don’t have this collection of a lot of different sites that are essentially doing the same thing.” (John Mueller, Senior Webmaster Trend Analyst at Google. https://www.youtube.com/watch?time_continue=1&v=kQIyk-2-wRg&feature=emb_logo)
White Hat / Black Hat SEO | | PabloCulebras0 -
Having a Size Chart and Personalization Descriptions on each page - Duplicate Content?
Hi everyone, I am coding a Shopify Store theme currently and we want to show customers the size comparisons and personalization options for each product. It will be a great UX addition since it is the number one & two things asked via customer support. But my only concern is that Google might flag it as duplicate content since it will be visible on each product page. What are your thoughts and/or suggestions? Thank you so much in advance.
White Hat / Black Hat SEO | | MadeByBrew0 -
Can the disavow tool INCREASE rankings?
Hi Mozzers, I have a new client who has some bad links in their profile that are spammy and should be disavowed. They rank on the first page for some longer tail keywords. However, we're aiming at shorter, well-known keywords where they aren't ranking. Will the disavow tool, alone, have the ability to increase rankings (assuming on-site / off-site signals are better than competition)? Thanks, Cole
White Hat / Black Hat SEO | | ColeLusby0 -
Best Location to find High Page Authority/ Domain Authority Expired Domains?
Hi, I've been looking online for the best locations to purchase expired domains with existing Page Authority/ Domain Authority attached to them. So far I've found: http://www.expireddomains.net
White Hat / Black Hat SEO | | VelasquezEF
http://www.domainauthoritylinks.com
http://moonsy.com/expired_domains/ These site's are great but I'm wondering if I'm potentially missing other locations? Any other recommendations? Thanks.1 -
Using Yext - Opinions? Thoughts? Harmful effects on SEO?
Hi All, Does anyone have any experience using Yext? We have 36 locations across the US and I think it would be great to get our local listings knocked out efficiently. Can anyone provide information on the directories they list you in (perhaps Google places listings, for example)? Also, can anyone provide feedback as to whether or not there is any harm with blasting so many directories at once? I don't want to do anything that might harm our SEO rankings or provide low quality links. Thanks in advance!
White Hat / Black Hat SEO | | CSawatzky0 -
Why do websites use different URLS for mobile and desktop
Although Google and Bing have recommended that the same URL be used for serving desktop and mobile websites, portals like airbnb are using different URLS to serve mobile and web users. Does anyone know why this is being done even though it is not GOOD for SEO?
White Hat / Black Hat SEO | | razasaeed0 -
What happens when content on your website (and blog) is an exact match to multiple sites?
In general, I understand that having duplicate content on your website is a bad thing. But I see a lot of small businesses (specifically dentists in this example) who hire the same company to provide content to their site. They end up with the EXACT same content as other dentists. Here is a good example: http://www.hodnettortho.com/blog/2013/02/valentine’s-day-and-your-teeth-2/ http://www.braces2000.com/blog/2013/02/valentine’s-day-and-your-teeth-2/ http://www.gentledentalak.com/blog/2013/02/valentine’s-day-and-your-teeth/ If you google the title of that blog article you find tons of the same article all over the place. So, overall, doesn't this make the content on these blogs irrelevant? Does this hurt the SEO on these sites at all? What is the value of having completely unique content on your site/blog vs having duplicate content like this?
White Hat / Black Hat SEO | | MorganPorter0 -
Are there tools out there to determine when a link linked to your site? I want to know when a link farm was done a site.
In Webmaster Tools I discovered that a client of mine with signed up for or hired another company to get links. The links are poor quality and from other countries, so it looks like a link farm was done. I want to know when they links were linked to the site, and not sure how to find that information out. Does anyone know how to find this out?
White Hat / Black Hat SEO | | StrategicEdgePartners0