Remove Scraped Content?
-
There is a site I work for whose content does not come up as the top result when you search Google for a snippet of its text. I believe what happened is that they wrote blog posts and articles, added them to their own site and to article directories at the same time, and the article directories got cached first.
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Should I remove all content from our site where this is happening, even though we actually did create these articles?
-
I explained the answer to this in the second part of my original post.
-
I would hope you had a link back to your site, where possible. If not, the page should be dated by creation and last update, which Google can see. I would not leave anything to guesswork, though: make sure you have links, and I would even put the posting date on the post on your site, the way news articles do. It's just another indicator.
I would not remove the content if, in fact, it did originate from you.
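If you want to check whether a given directory copy actually links back to you, a rough sketch like the one below might help. It only uses Python's standard library; the function name `links_back_to`, the sample HTML, and the domain `example.com` are made up for illustration, and it says nothing about how Google itself treats the page.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkFinder(HTMLParser):
    """Collects the href of every <a> tag that points at a given domain."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Accept the bare domain and any subdomain such as www.
        if host == self.domain or host.endswith("." + self.domain):
            self.matches.append(href)


def links_back_to(html, domain):
    """Return all links in `html` whose host is `domain` or a subdomain of it."""
    finder = BacklinkFinder(domain)
    finder.feed(html)
    return finder.matches


# Hypothetical HTML of a syndicated copy of one of your articles.
sample = (
    '<p>Originally published at '
    '<a href="http://www.example.com/blog/post.html">Example</a>.</p>'
)
print(links_back_to(sample, "example.com"))
```

An empty list for a directory page would mean that copy carries no link back to your domain, at least none this simple check can see.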
-
Yes, it was intentionally distributed. I would like to know whether the duplicate content on our site is being seen (by Google) as copied, not original, scraped, pulled from another source because we're so lazy we can't come up with any material of our own??
If this is the case, I will be removing the content, as the quality of the content sucks and there is quite a bit of it. Please, do not respond "if the content sucks, then why have it on your site..."
-
The term "scraped content" is most often used for content that has been grabbed from your website by a visiting robot.
Based upon your posting, the duplicate content that you are talking about was intentionally distributed.
-
Then how do you determine if Google is seeing content as scraped? As you know, Google has made it very clear recently how they feel about scraped content.
-
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Search engines cannot identify original authors (unless you use the rel="author" attribute, and then they are merely taking your word for it). They only know which page with the content was discovered first. The content could have been on other pages first, or it could have been published first offline. Search engines don't have divine powers.
The page that ranks first in the SERPs is the one that has the best combination of relevance, domain authority, and other ranking factors. It has nothing to do with authorship.
Should I remove all content from our site where this is happening, even though we actually did create these articles?
I would not do that if the content is valuable for your visitors, has acquired links from other sites, or is pulling traffic from search.
The take-away from this is not to give your content away if you want to rank for it in search. Giving it away can create strong competitors and feed existing competitors.