Remove Scraped Content?
-
I work for a site that is not the top result in Google when you search for a snippet of text from its own content. I believe what happened is that they wrote blog posts and articles and added them to their site and to article directories at the same time, and the article directories got cached first.
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Should I remove all content from our site where this is happening, even though we actually did create these articles?
-
I explained the answer to this in the second part of my original post.
-
I would hope you had a link back to your site wherever possible. If not, the page should carry creation and last-updated dates that Google can see. I would not leave anything to guesswork, though: make sure you have links, and I would even show the posting date on the post on your site, as news articles do. It's just another indicator.
I would not remove the content if in fact, it did originate from you.
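As a rough sketch of how you could check that a syndicated copy links back to you, assuming you already have the directory page's HTML as a string (the `links_back_to` helper, the example.com domain, and the sample markup are all made up for illustration):

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html, domain):
    """Return True if any link in `html` points at `domain` or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        # Relative links have no hostname, so they never count as a link back.
        host = urlparse(href).hostname or ""
        if host == domain or host.endswith("." + domain):
            return True
    return False


# Hypothetical article-directory page; example.com stands in for your own site.
sample = '<p>By Jane. <a href="https://www.example.com/blog/post">Original article</a></p>'
print(links_back_to(sample, "example.com"))
```

This only tells you whether a link exists in the copy. It says nothing about how Google weighs that link or which page it treats as the original.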
-
Yes, it was intentionally distributed. I would like to know whether Google sees the duplicate content on our site as copied rather than original: scraped, pulled from another source because we're so lazy we can't come up with any material of our own.
If this is the case, I will be removing the content, as the quality of the content sucks and there is quite a bit of it. Please, do not respond "if the content sucks, then why have it on your site..."
-
The term "scraped content" is most often used for content that has been grabbed from your website by a visiting robot.
Based upon your posting, the duplicate content that you are talking about was intentionally distributed.
-
Then how do you determine if Google is seeing content as scraped? As you know, Google has made it very clear recently how they feel about scraped content.
-
"If we're not coming up first for our article, that means we are not believed to be the original author, correct?"
Search engines cannot identify original authors (unless you use the rel="author" attribute, and then they are merely taking your word for it). They only know which page with the content was discovered first. The content could have been on other pages first, or it could have been published offline first. Search engines don't have divine powers.
The page that ranks first in the SERPs is the one with the best combination of relevance, domain authority and other ranking factors. It has nothing to do with authorship.
"Should I remove all content from our site where this is happening, even though we actually did create these articles?"
I would not do that if the content is valuable for your visitors, has acquired links from other sites or if the content is pulling traffic from search.
The take-away from this is not to give your content away if you want to rank for it in search. Giving it away can create strong competitors and feed existing competitors.