Duplicate content
-
I run about 10 sites, and most of them seem to have fallen foul of the Penguin update. Even though I have never sought inorganic links, I have been frantically searching for a link-based answer since April.
However, since asking a question here I have been pointed in another direction by one of your contributors. It seems at least 6 of my sites have duplicate content issues.
If you search Google for "We have selected nearly 200 pictures of short haircuts and hair styles in 16 galleries", which is the first bit of text from the site short-hairstyles.com, about 30000 results appear. I don't know where they're from or why anyone would want to do this. I presume it's automated, since there is so much of it.
I have decided to redo the content, so I guess (hope) that at some point in the future the duplicates will be flushed from Google's index?
But how do I prevent it happening again? It's impractical to redo the content every month or so.
For example, if you search for "This facility is written in Flash to use it you need to have Flash installed." from another of my sites, where I coincidentally uploaded a new page a couple of days ago, only the duplicate content shows up, not my original site. So whoever is doing this is finding new stuff on my site and getting it indexed on Google before Google even sees it on my site!
Thanks,
Ian
-
I don't have any experience with Cloudflare so I can't offer an opinion on their services. And without a proper audit of your site and link profile, there is no honest way to know exactly what the core issues are on the site. Short of a proper audit, it's all a guess. That's the bigger concern.
Maybe it's links. Maybe it's duplicate content perception. Maybe it's a dozen seemingly insignificant issues that accumulated to the breaking point with a trigger event like Penguin.
Unfortunately that's the reality of SEO in 2012.
-
OK, maybe I'm not getting something or not explaining myself properly.
When I say things like "30000 times", "every page" and "it is the majority of the content", in the context that I have in my head I'm saying it's not a trivial thing and I have looked into it at length.
If you thought some verification was needed to answer the question, the information is there for you to have a look.
Complex things are made up of lots of uncomplex things.
How strong is this site? Up until April I'd say very strong. It came in at number 1 for several high-volume keywords (and still does in Bing and Yahoo).
As I said in the original question, I have decided to redo most of the content on this site anyway, so whether this whole issue is an issue or not isn't an issue.
The original question was: how do you prevent it happening again? Are rel-author, rel-publisher and G+ the answer?
Or what about this? http://www.cloudflare.com/plans
-
"it is the majority of my content". that's what I asked originally - if it is the majority of content on individual pages. If that's true, it could be a cause of problems, however SEO is an extremely complex process with multiple algorithms so unfortunately, without a detailed review of the site, it's dangerous to assume that specific issue is the cause of your problems.
How strong is your site in other regards? Do you implement rel-author or rel-publisher code and tie it to a Google+ account to communicate you're the original source? Do you have enough other trust signals in place? There are many other similar questions that need to be answered before anyone can confidently make serious recommendations.
-
1. Google doesn't seem to know this and has penalised my sites for something.
2. It is the majority of the content. It's pretty much all of it, up to 30000 times.
3. I've lost 70% of my traffic via recent Google updates. That is THE overwhelming concern, which is why I came and joined this site.
I arrived at this point by asking this question (http://www.seomoz.org/q/penguin-issues). If you disagree with the track I got sent on, can you suggest a different one?
-
1. You're not generating the duplicate content, so there's nothing you can logically do about it at any kind of scalable frequency, let alone prevent it.
2. If it's not the majority of content on a page, it's not a serious problem. In fact, it's common across the internet.
3. Don't allow non-issues to become an overwhelming concern. Focus on what you can do something about, and on things that are more important, really do have a negative impact on your SEO, and are within your control.
-
OK, but the snippet is an exact match (in speech marks) and there are 30000 of them. That's not just monkeys typing Shakespeare. Every page (300 or so) on that site has unique content, and more or less each page has up to 30000 duplicates: most a lot less than 30000, but a lot more than 1, which is what it should be. If there were a couple of coincidences, fine, but there aren't.
-
Just finding a snippet that's as short as the examples you gave is not a reason to be concerned about duplicate content in itself. A typical page should have hundreds of words and rank for whatever phrase or phrases you care about, not for a single sentence within the content.
If, on the other hand, you have the overwhelming majority of the content from one of your pages duplicated, that's a reason to be concerned.
So - how much content do you have on YOUR site on the page(s) in question? And have you checked to find out if the majority is duplicated? That's where the focus needs to be.
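If it helps, here is a rough way to put a number on "the majority is duplicated". It is only a sketch under my own assumptions: you paste in the visible text of your page and the text of one suspected copy, and it compares overlapping 5-word runs (shingles). The function names and the 5-word window are illustrative choices, not anything Google documents or uses.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def duplicated_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    own = shingles(original, size)
    if not own:
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)


if __name__ == "__main__":
    # Hypothetical inputs: replace with the real text of your page and of the copy.
    page = ("We have selected nearly 200 pictures of short haircuts and hair styles "
            "in 16 galleries")
    copy = "Some scraped intro. " + page + " Some scraped footer."
    print(f"{duplicated_fraction(page, copy):.0%} of the page text appears in the copy")
```

A value near 1.0 for a full page of several hundred words would suggest the whole page has been lifted; a low value would suggest only a sentence or two matches, which is the less worrying case.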