Website content has been scraped - recommended action
-
So whilst searching for link opportunities, I found a website that has scraped content from one of our websites. The website looks pretty low quality and doesn't link back. What would be the recommended course of action?
-
Email them and ask for a link back. I've got a feeling this might not be the best idea. The website does not have much authority (yet) and a link might look a bit dodgy considering the duplicate content
-
Ask them to remove the content. It is duplicate content and could hurt our website.
-
Do nothing. I don't think our website will get penalised for it since it was here first and is in the better quality website. Possibly report them to google for scraping?
What do you guys think?
-
-
It's good to be aware of the scrapers to see what they are trying to do with your content, and it can't hurt to ask them to remove it.
Don't ask for a link, you never want links for sites that rely on bad practices like that, it can hurt you.
This is most likely not effect you if left alone. If the scraper is grabbing from source code, then implementing a canonical tag in your content will help Google know where the content came from (but they probably already know).
-
Most of the time, contacting them is a waste of time. Being a weasel is their business model. Weasels usually have hidden domain registration data so finding their contact information is really hard.
If they have republished my content on blogspot, youtube, facebook or other community sites, I simply file a DMCA and the content is usually taken down quickly.
I don't want duplicates of my content on the web, especially not on powerful sites. Powerful sites are generally more responsible than Joe Schmoe working in his basement. Often just an email to them with "copyright infringement on yourdamndomain.com will get your content taken down. I've called people on the phone to tell them that they have my stuff on their site and that is faster than filling out forms. Be nice, not threatening and they usually comply if you get them on the phone.
I don't ask for links because I don't want weasels linking to me.
-
Was your site's scraped content already indexed in Google?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content from Wordpress Template
Hi Wondering if anyone can help, my site has flagged up with duplicate content on almost every page, i think this is because the person who set up the site created a lot of template pages which are using the same code but have slightly different features on. How would I go about resolving this? Would I need to recode every template page they have created?
Technical SEO | | Alix_SEO0 -
Many Errors on E-commerce website mainly Duplicate Content - Advice needed please!
Hi Mozzers, I would need some advice on how to tackle one of my client’s websites. We have just started doing SEO for them and after moz crawled the e-commerce it has detected: 36 329 Errors – 37496 warnings and 2589 Notices all going up! Most of the errors are due to duplicate titles and page content but I cannot identify where the duplicate pages come from, these are the links moz detected of the Duplicate pages (unfortunately I cannot add the website for confidentiality reasons) : • www.thewebsite.com/index.php?dispatch=categories.view&category_id=233&products_per_00&products_per_2&products_per_2&products_per_2&page=2 • www.thewebsite.com/index.php?dispatch=categories.view&category_id=233&products_per_00=&products_per_00&products_per_2&products_per_2&products_per_2&page=2 • www.thewebsite.com/index.php?dispatch=categories.view&category_id=233&products_per_00=&products_per_00&products_per_2&page=2 • www.thewebsite.com/index.php?dispatch=categories.view&category_id=233&products_per_2=&products_per_00&page=2 • www.thewebsite.com/index.php?dispatch=categories.view&category_id=233&products_per_00&products_per_00&products_per_00&products_per_00&page=2 With these URLs it is quite hard to identify which pages need to be canonicalize. And this is jsut an example out of thousands on this website. If anyone would have any advice on how to fix this and how to tackle 37496 errors on a website like this that would be great. Thank you for your time, Lyam
Technical SEO | | AlphaDigital0 -
Duplicate Content in Wordpress.com
Hi Mozers! I have a client with a blog on wordpress.com. http://newsfromtshirts.wordpress.com/ It just had a ranking drop because of a new Panda Update, and I know it's a Dupe Content problem. There are 3900 duplicate pages, basically because there is no use of noindex or canonical tag, so archives, categories pages are totally indexed by Google. If I could install my usual SEO plugin, that would be a piece of cake, but since Wordpress.com is a closed environment I can't. How can I put a noindex into all category, archive and author peges in wordpress.com? I think this could be done by writing a nice robot.txt, but I am not sure about the syntax I shoud use to achieve that. Thank you very much, DoMiSol Rossini
Technical SEO | | DoMiSoL0 -
Duplicate Content
Hi, we need some help on resolving this duplicate content issue,. We have redirected both domains to this magento website. I guess now Google considered this as duplicate content. Our client wants both domain name to go to the same magento store. What is the safe way of letting Google know these are same company? Or this is not ideal to do this? thanks
Technical SEO | | solution.advisor0 -
Optimizing Flash websites
Does anyone know the latest on how well Google is indexing flash sites? I'm talking to a prospect now whose site is 100% flash, but looking at it all the content within the site is indexed and returning in the results. This is promising enough, but does anyone know exactly how well Google is indexing flash content these days and are there still any problems with that? Thanks!
Technical SEO | | MattBarker0 -
Duplicate Content - That Old Chestnut!!!
Hi Guys, Hope all is well, I have a question if I may? I have several articles which we have written and I want to try and find out the best way to post these but I have a number of concerns. I am hoping to use the content in an attempt to increase the Kudos of our site by providing quality content and hopefully receiving decent back links. 1. In terms of duplicate content should I only post it on one place or should I post the article in several places? Also where would you say the top 5 or 10 places would be? These are articles on XML, Social Media & Back Links. 2. Can I post the article on another blog or article directory and post it on my websites blog or is this a bad idea? A million thanks for any guidance. Kind Regards, C
Technical SEO | | fenwaymedia0 -
Lots of duplicate content warnings
I have a site that says that I have 2,500 warnings. It is a real estate website and of course we use feeds. it says I have a lot of duplicate content. One thing is a page called "Request an appointment" and that is a url for each listing. Since there are 800 listings on my site. How could I solve this problem so that this doesn't show up as duplicate content since I use the same "Request an Appointment" verbeage on each of those? I guess my developer who used php to do it, created a dedicated url to each. Any help would be greatly appreciated.
Technical SEO | | SeaC0 -
Duplicate Content issue
I have been asked to review an old website to an identify opportunities for increasing search engine traffic. Whilst reviewing the site I came across a strange loop. On each page there is a link to printer friendly version: http://www.websitename.co.uk/index.php?pageid=7&printfriendly=yes That page also has a link to a printer friendly version http://www.websitename.co.uk/index.php?pageid=7&printfriendly=yes&printfriendly=yes and so on and so on....... Some of these pages are being included in Google's index. I appreciate that this can't be a good thing, however, I am not 100% sure as to the extent to which it is a bad thing and the priority that should be given to getting it sorted. Just wandering what views people have on the issues this may cause?
Technical SEO | | CPLDistribution0