Duplicate content
-
I run about 10 sites and most of them seemed to fall foul of the penguin update and even though I have never sought inorganic links I have been frantically searching for a link based answer since April.
However since asking a question here I have been pointed in another direction by one of your contributors. It seems At least 6 of my sites have duplicate content issues.
If you search Google for "We have selected nearly 200 pictures of short haircuts and hair styles in 16 galleries" which is the first bit of text from the site short-hairstyles.com about 30000 results appear. I don't know where they're from nor why anyone would want to do this. I presume its automated since there is so much of it.
I have decided to redo the content. So I guess (hope) at some point in the future the duplicate nature will be flushed from Google's index?
But how do I prevent it happening again? It's impractical to redo the content every month or so.
For example if you search for "This facility is written in Flash
to use it you need to have Flash
installed." from another of my sites that I coincidently uploaded a new page to a couple of days ago, only the duplicate content shows up not my original site. So whoever is doing this is finding new stuff on my site and getting it indexed on google before even google sees it on my site!
Thanks,
Ian
-
I don't have any experience with Cloudflare so I can't offer an opinion on their services. And without a proper audit of your site and link profile, there is no honest way to know exactly what the core issues are on the site. Short of a proper audit, it's all a guess. That's the bigger concern.
Maybe it's links. Maybe its duplicate content perception. Maybe it's a dozen seemingly insignificant issues that accumulated to the breaking point with a trigger event like Penguin.
Unfortunately that's the reality of SEO in 2012.
-
ok, maybe I'm not getting something or not explaining myself properly.
When I say things like "30000 times", "every page" and "it is the majority of the content" in the context that I have in my head I'm saying its not a trivial thing and I have looked into it at length.
If you thought there was some verification needed to answer the question the information is there to have a look.
Complex things are made up of lots of uncomplex things.
How strong is this site? Up until April I'd say very strong, it came in at number 1 for several high volume keywords (still does in bing and yahoo)
As I said in the original question I have decided to redo most of the content on this site anyway so whether this whole issue is an issue or not isn't an issue.
The original question was how do you prevent it happening again? Is rel author rel-publisher and g+ the answer?
or what about this? http://www.cloudflare.com/plans
-
"it is the majority of my content". that's what I asked originally - if it is the majority of content on individual pages. If that's true, it could be a cause of problems, however SEO is an extremely complex process with multiple algorithms so unfortunately, without a detailed review of the site, it's dangerous to assume that specific issue is the cause of your problems.
How strong is your site in other regards? Do you implement rel-author or rel-publisher code and tie it to a Google+ account to communicate you're the original source? Do you have enough other trust signals in place? There are many other similar questions that need to be answered before anyone can confidently make serious recommendations.
-
1. Google doesn't seem to know this and has penalised my sites for something.
2. It is the majority of the content. Its pretty much all of it, upto 30000 times.
3. I've lost 70% of my traffic via recent Google updates. That is THE over whelming concern which is why I came and joined this site.
I arrived at this point by asking this question http://www.seomoz.org/q/penguin-issues if you disagree with the track I got sent on can you suggest a different one?
-
1. you're not generating the duplicate content so there's nothing you can logically do about on any kind of a scalable frequency, let alone prevent.
2. If it's not the majority of content on a page, it's not a serious problem. In fact, it's common to the internet.
3. Don't allow non-issues become an overwhelming concern. Focus on what you can do something about, and things that are more important and really do have a negative impact on your SEO that are within you control.
-
OK but the snippet is an exact match (in speech marks) and there's 30000 of them that's not just monkeys typing Shakespeare. Every page (300 or so) on that site has unique content and more or less each page has upto 30000 duplicates, most a lot less that 30000 but a lot more that 1, which it should be. If there was a couple of coincidences, fine, but there's not.
-
Just finding a snippet that's as short as the examples you gave is not a reason to be concerned about duplicate content in itself. A typical page should have hundreds of words and rank for whatever phrase or phrases you care about, not for a single sentence within the content.
If, on the other hand, you have the overwhelming majority of the content from one of your pages duplicated, that's a reason to be concerned.
So - how much content do you have on YOUR site on the page(s) in question? And have you checked to find out if the majority is duplicated? That's where the focus needs to be.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
SEO: How to change page content + shift its original content to other page at the same time?
Hello, I want to replace the content of one page of our website (already indexeed) and shift its original content to another page. How can I do this without problems like penalizations etc? Current situation: Page A
Intermediate & Advanced SEO | | daimpa
URL: example.com/formula-1
Content: ContentPageA Desired situation: Page A
URL: example.com/formula-1
Content: NEW CONTENT! Page B
URL: example.com/formula-1-news
Content: ContentPageA (The content that was in Page A!) Content of the two pages will be about the same argument (& same keyword) but non-duplicate. The new content in page A is more optimized for search engines. How long will it take for the page to rank better?0 -
Duplicate Content with URL Parameters
Moz is picking up a large quantity of duplicate content, consists mainly of URL parameters like ,pricehigh & ,pricelow etc (for page sorting). Google has indexed a large number of the pages (not sure how many), not sure how many of them are ranking for search terms we need. I have added the parameters into Google Webmaster tools And set to 'let google decide', However Google still sees it as duplicate content. Is it a problem that we need to address? Or could it do more harm than good in trying to fix it? Has anyone had any experience? Thanks
Intermediate & Advanced SEO | | seoman100 -
Duplicate content based on filters
Hi Community, There have probably been a few answers to this and I have more or less made up my mind about it but would like to pose the question or as that you post a link to the correct article for this please. I have a travel site with multiple accommodations (for example), obviously there are many filter to try find exactly what you want, youcan sort by region, city, rating, price, type of accommodation (hotel, guest house, etc.). This all leads to one invevitable conclusion, many of the results would be the same. My question is how would you handle this? Via a rel canonical to the main categories (such as region or town) thus making it the successor, or no follow all the sub-category pages, thereby not allowing any search to reach deeper in. Thanks for the time and effort.
Intermediate & Advanced SEO | | ProsperoDigital0 -
Duplicate Content Question
Hey Everyone, I have a question regarding duplicate content. If your site is penalized for duplicate content, is it just the pages with the content on it that are affected or is the whole site affected? Thanks 🙂
Intermediate & Advanced SEO | | jhinchcliffe0 -
Duplicate content across internation urls
We have a large site with 1,000+ pages of content to launch in the UK. Much of this content is already being used on a .nz url which is going to stay. Do you see this as an issue or do you thin Google will take localised factoring into consideration. We could add a link from the NZ pages to the UK. We cant noindex the pages as this is not an option. Thanks
Intermediate & Advanced SEO | | jazavide0 -
How to prevent duplicate content within this complex website?
I have a complex SEO issue I've been wrestling with and I'd appreciate your views on this very much. I have a sports website and most visitors are looking for the games that are played in the current week (I've studied this - it's true). We're creating a new website from scratch and I want to do this is as best as possible. We want to use the most elegant and best way to do this. We do not want to use work-arounds such as iframes, hiding text using AJAX etc. We need a solid solution for both users and search engines. Therefor I have written down three options: Using a canonical URL; Using 301-redirects; Using 302-redirects. Introduction The page 'website.com/competition/season/week-8' shows the soccer games that are played in game week 8 of the season. The next week users are interested in the games that are played in that week (game week 9). So the content a visitor is interested in, is constantly shifting because of the way competitions and tournaments are organized. After a season the same goes for the season of course. The website we're building has the following structure: Competition (e.g. 'premier league') Season (e.g. '2011-2012') Playweek (e.g. 'week 8') Game (e.g. 'Manchester United - Arsenal') This is the most logical structure one can think of. This is what users expect. Now we're facing the following challenge: when a user goes to http://website.com/premier-league he expects to see a) the games that are played in the current week and b) the current standings. When someone goes to http://website.com/premier-league/2011-2012/ he expects to see the same: the games that are played in the current week and the current standings. When someone goes to http://website.com/premier-league/2011-2012/week-8/ he expects to the same: the games that are played in the current week and the current standings. So essentially there's three places, within every active season within a competition, within the website where logically the same information has to be shown. To deal with this from a UX and SEO perspective, we have the following options: Option A - Use a canonical URL Using a canonical URL could solve this problem. You could use a canonical URL from the current week page and the Season page to the competition page: So: the page on 'website.com/$competition/$season/playweek-8' would have a canonical tag that points to 'website.com/$competition/' the page on 'website.com/$competition/$season/' would have a canonical tag that points to 'website.com/$competition/' The next week however, you want to have the canonical tag on 'website.com/$competition/$season/playweek-9' and the canonical tag from 'website.com/$competition/$season/playweek-8' should be removed. So then you have: the page on 'website.com/$competition/$season/playweek-9' would have a canonical tag that points to 'website.com/$competition/' the page on 'website.com/$competition/$season/' would still have a canonical tag that points to 'website.com/$competition/' In essence the canonical tag is constantly traveling through the pages. Advantages: UX: for a user this is a very neat solution. Wherever a user goes, he sees the information he expects. So that's all good. SEO: the search engines get very clear guidelines as to how the website functions and we prevent duplicate content. Disavantages: I have some concerns regarding the weekly changing canonical tag from a SEO perspective. Every week, within every competition the canonical tags are updated. How often do Search Engines update their index for canonical tags? I mean, say it takes a Search Engine a week to visit a page, crawl a page and process a canonical tag correctly, then the Search Engines will be a week behind on figuring out the actual structure of the hierarchy. On top of that: what do the changing canonical URLs to the 'quality' of the website? In theory this should be working all but I have some reservations on this. If there is a canonical tag from 'website.com/$competition/$season/week-8', what does this do to the indexation and ranking of it's subpages (the actual match pages) Option B - Using 301-redirects Using 301-redirects essentially the user and the Search Engine are treated the same. When the Season page or competition page are requested both are redirected to game week page. The same applies here as applies for the canonical URL: every week there are changes in the redirects. So in game week 8: the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-8' the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-8' A week goes by, so then you have: the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-9' the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-9' Advantages There is no loss of link authority. Disadvantages Before a playweek starts the playweek in question can be indexed. However, in the current playweek the playweek page 301-redirects to the competition page. After that week the page's 301-redirect is removed again and it's indexable. What do all the (changing) 301-redirects do to the overall quality of the website for Search Engines (and users)? Option C - Using 302-redirects Most SEO's will refrain from using 302-redirects. However, 302-redirect can be put to good use: for serving a temporary redirect. Within my website there's the content that's most important to the users (and therefor search engines) is constantly moving. In most cases after a week a different piece of the website is most interesting for a user. So let's take our example above. We're in playweek 8. If you want 'website.com/$competition/' to be redirecting to 'website.com/$competition/$season/week-8/' you can use a 302-redirect. Because the redirect is temporary The next week the 302-redirect on 'website.com/$competition/' will be adjusted. It'll be pointing to 'website.com/$competition/$season/week-9'. Advantages We're putting the 302-redirect to its actual use. The pages that 302-redirect (for instance 'website.com/$competition' and 'website.com/$competition/$season') will remain indexed. Disadvantages Not quite sure how Google will handle this, they're not very clear on how they exactly handle a 302-redirect and in which cases a 302-redirect might be useful. In most cases they advise webmasters not to use it. I'd very much like your opinion on this. Thanks in advance guys and galls!
Intermediate & Advanced SEO | | StevenvanVessum0 -
Mobile Site - Same Content, Same subdomain, Different URL - Duplicate Content?
I'm trying to determine the best way to handle my mobile commerce site. I have a desktop version and a mobile version using a 3rd party product called CS-Cart. Let's say I have a product page. The URLs are... mobile:
Intermediate & Advanced SEO | | grayloon
store.domain.com/index.php?dispatch=categories.catalog#products.view&product_id=857 desktop:
store.domain.com/two-toned-tee.html I've been trying to get information regarding how to handle mobile sites with different URLs in regards to duplicate content. However, most of these results have the assumption that the different URL means m.domain.com rather than the same subdomain with a different address. I am leaning towards using a canonical URL, if possible, on the mobile store pages. I see quite a few suggesting to not do this, but again, I believe it's because they assume we are just talking about m.domain.com vs www.domain.com. Any additional thoughts on this would be great!0 -
SEOMoz mistaking image pages as duplicate content
I'm getting duplicate content errors, but it's for pages with high-res images on them. Each page has a different, high-res image on it. But SEOMoz keeps telling me it's duplicate content, even though the images are different (and named different). Is this something I can ignore or will Google see it the same way too?
Intermediate & Advanced SEO | | JHT0