How are they avoiding duplicate content?
-
One of the largest soccer stores in the USA runs a number of whitelabel sites for major partners such as Fox and ESPN. The effect of this is that they are creating duplicate content for their products (and even the overall site structure is very similar). Take a look at:
http://www.worldsoccershop.com/23147.html
http://www.foxsoccershop.com/23147.html
http://www.soccernetstore.com/23147.html
You can see that practically everything is the same, including:
- product URL
- product title
- product description
My question is: why is Google not classing this as duplicate content? Have they coded for it in a certain way, or is there something I'm missing that is helping them achieve rankings for all three sites?
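To see how close the overlap really is, a rough sketch like the one below (Python standard library only) can fetch the three product URLs and score how similar the visible text is. The tag stripping is deliberately crude and the pages may have changed or gone away, so treat it as an illustration rather than a tool:

```python
import re
import urllib.request
from difflib import SequenceMatcher
from itertools import combinations

URLS = [
    "http://www.worldsoccershop.com/23147.html",
    "http://www.foxsoccershop.com/23147.html",
    "http://www.soccernetstore.com/23147.html",
]

def visible_text(url):
    """Fetch a page and return its visible text (very rough tag stripping)."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    html = re.sub(r"(?is)<(script|style).*?</\1>", " ", html)  # drop scripts and styles
    text = re.sub(r"(?s)<[^>]+>", " ", html)                    # drop remaining tags
    return re.sub(r"\s+", " ", text).strip()

pages = {url: visible_text(url) for url in URLS}

# Compare every pair of pages and print a similarity score.
for a, b in combinations(URLS, 2):
    ratio = SequenceMatcher(None, pages[a], pages[b]).ratio()
    print(f"{a}\n{b}\n  text similarity: {ratio:.0%}\n")
```

Shared navigation and boilerplate will inflate the score, so for a stricter check you would compare just the product description blocks rather than whole pages.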
-
The answer is right in your question: "runs a number of whitelabel sites". As mentioned, it is largely down to the original publisher publishing the content first and getting indexed. From there, any time Googlebot stumbles across the same content, it works out that it has seen that content before and attributes the ranking to the original, something Google themselves covered last year here (although more specifically for news at the time).
Duplicate content unfortunately isn't simply "not shown" by the search engines (imagine how "clean" the SERPs would be if that were the case!); it is just ranked lower than the original publisher that Google is aware of. Occasionally you will get the odd page from a different domain that ranks, but that is usually down to being fresh content. I have seen this myself with my own content being aggregated by a large news site: they might outrank me on occasion for a day on one or two pieces, but my original URL comes out on top in the end.
-
They rank as #1 for the relevant terms. It is very clear Google feels they are the original source of the content, and the other sites are duplicates.
I don't have a crystal ball to see the future, but based on current information, the original source site is not suffering in any manner.
-
Interesting feedback - are worldsoccershop (the original source) likely to suffer any penalties as a result of the whitelabel sites carrying the duplicate content?
-
Hey
I just did a search for a long phrase I found on one of their product pages, wrapping the whole query in double quotes:
"Large graffiti print on front that illustrates the club's famous players and history. The traditional blue jersey has gold details including team badge, adidas logo and sponsor design"
The results that come back show worldsoccershop.com in positions one and two, so they certainly seem to be treated as the authority on this product description.
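If you want to repeat that check across a batch of descriptions rather than one at a time, a quick sketch along these lines will build the exact-phrase query URLs for you. The snippets are the ones quoted above; the filter=0 parameter has historically asked Google to include results it would otherwise hide as "similar", though there is no guarantee it is still honoured:

```python
from urllib.parse import quote_plus

# Snippets taken from the product page copy quoted above.
snippets = [
    "Large graffiti print on front that illustrates the club's famous players and history",
    "The traditional blue jersey has gold details including team badge, adidas logo and sponsor design",
]

for snippet in snippets:
    # Wrap the snippet in double quotes to force an exact-phrase match,
    # and append filter=0 to (historically) surface results folded away as "similar".
    query = quote_plus(f'"{snippet}"')
    print(f"https://www.google.com/search?q={query}&filter=0")
```

Paste the printed URLs into a browser and note which domain holds the top spots for each phrase.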
I have a client that is setting up a store to take on some rather big players like notonthehighstreet.com, and in this industry, where each product has several established competitors, the big authority stores seem to rank with the generic product descriptions with no real issues.
This is ultimately difficult for the smaller stores: not only do they have fewer resources, but pages on my client's site that use these duplicate descriptions are simply getting filtered out of the results. We can see this filtering in action with very specific searches like the one above, where we get the 'we have filtered out similar results' message in the search results and, lo and behold, my client's results are among those filtered.
So, to answer your original question:
They have not 'coded' anything in a specific way, and there is nothing you are missing as such. They are simply an authority site and, as such, are 'getting away with it', which, for the smaller players, kind of sucks. That said, only the worldsoccershop pages are returned, so the other sites could well be filtered out.
Still, as I am coaching our client, see this not as a problem but as an opportunity. By creating unique content we can hopefully get ahead of other, more authoritative sites that are all returning exactly the same product description, and whilst I don't expect us to reach 1st place, we can work towards the first page and out of that filter.
Duplicate content is a massive problem: on the site we are working on, there is one product description that Copyscape tells us appears on 300 other sites. Google wants to return rich result sets (some shops, some information, some pictures, etc.), not just ten copies of the same thing, so dare to be different and give Google a reason to display your page.
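A web-wide check still needs something like Copyscape, but you can at least flag which of your own descriptions are near-identical stock copy before they go live. A rough sketch, with a made-up catalogue and an arbitrary 0.8 threshold, might look like this:

```python
from difflib import SequenceMatcher

# Hypothetical catalogue: product ID -> description as supplied by the manufacturer.
catalogue = {
    "23147": "Large graffiti print on front that illustrates the club's famous players...",
    "23148": "Large graffiti print on the front illustrating the club's famous players...",
    "23200": "A completely rewritten, unique description for this jersey.",
}

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two descriptions."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.8  # arbitrary cut-off; tune it against descriptions you know are stock copy

ids = sorted(catalogue)
for i, a in enumerate(ids):
    for b in ids[i + 1:]:
        score = similarity(catalogue[a], catalogue[b])
        if score >= THRESHOLD:
            print(f"Products {a} and {b} share near-identical copy ({score:.0%}) - rewrite one of them.")
```

Shingle or TF-IDF comparisons scale better on a very large catalogue, but for a few hundred products a pairwise check like this is enough to prioritise which descriptions to rewrite first.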
Hope it helps
Marcus -
My question is, why is Google not classing this as duplicate content?
Why do you feel this content has not been flagged as duplicate content?
The reasonable search for these pages is Barcelona Soccer Jersey. Only one of the three sites has results for this term in the top 50, and it holds the #1 and #2 positions. If this were not duplicate content, you would expect to find the other two sites listed on the first page of Google results as well.
The perfect search for the page (very longtail and unrealistic) is Barcelona 11/12 home soccer jersey. For this query, the worldsoccershop.com site ranks #1 and #3, the foxsoccershop.com site ranks #8, which is a big drop considering the content is identical, and the soccernetstore.com site is not in the top 50 results at all.
The other two sites have clearly been identified as duplicate content or are otherwise being penalized quite severely.