Internal Duplicate Content - Classifieds (Panda)
-
I've been wondering for a while now, how Google treats internal duplicate content within classified sites.
It's quite a big issue, with customers creating their ads twice.. I'd guess to avoid the price of renewing, or perhaps to put themselves back to the top of the results. Out of 10,000 pages crawled and tested, 250 (2.5%) were duplicate adverts.
Similarly, in terms of the search results pages, where the site structure allows the same advert(s) to appear under several unique URLs. A prime example would be in this example. Notice, on this page we have already filtered down to 1 result, but the left hand side filters all return that same 1 advert.
Using tools like Siteliner and Moz Analytics just highlights these as urgent high priority issues, but I've always been sceptical.
On a large scale, would this count as Panda food in your opinion, or does Google understand the nature of classifieds is different, and treat it as such?
Appreciate thoughts.
Thanks.
-
TL;DR: You're right to be skeptical that this is an urgent issue (in my opinion), but it is something worth fixing at some point for several reasons.
I was far more concerned by search results, but I see you've added those to noindex/disallow in robots.txt, which is great. Not many people know that works!
I think it's very possible that Google understands the difference between a classified ad and an editorial content piece. They definitely treat products and content differently. That said, it's generally a good idea to avoid relying on Google's intelligence, as many have been let down by Google's failure to understand.
Duplicate content is generally something SEOs are overly-concerned with. More often than not it triggers a filter - not a "penalty." I don't see it as the most dangerous thing you could be doing by any stretch of the imagination. That said, I've seen several classified sites do the following, which I'd recommend as a "best practice" approach. At one time Craigslist did this, and may still be doing it.
- Accept non-spam ads with a pending status
- Check against listings in a given period of time for duplicates. This happens even if the ad is changed slightly, so there's some kind of semantic+image analysis going on.
- If a duplicate is found under the same user name, inform them that they've already posted the ad. From here the rules are up to you. Many sites say the ad can't be posted again for 7 days (if the old ad is deleted) or 30 days (if not). They then encourage users to buy a featured listing that shows up higher than others.
- If duplicates are found under different user names, give a warning that it's against your terms of service (make sure it is) to post duplicate ads from multiple accounts, that accounts can be banned, and have them certify the post is not the same.
You don't need to follow this exactly, but it's here to give you some ideas on having your users prevent duplicate content for you. Given the general positive architecture I've seen on the site it looks like you know what to do with the site better than I would.
Now I don't think 250 out of 10k is bad. Having consulted with a few local classified sites that's actually quite low. But I do think there's something to be gained by detecting duplicates to prevent users from gaining an unfair advantage over those playing by the rules. And if you sell featured listings this is an excellent way to help those who are most desparate to sell while increasing revenue.
I hope that helps.
Obligatory disclaimer: This is merely free advice for your consideration, and not the Moz official stance. The consequences of any changes you do or don't make are ultimately your responsibility.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to fix Duplicate Content Warnings on Pagination? Indexed Pagination?
Hi all! So we have a Wordpress blog that properly has pagination tags of rel="prev" and rel="next" set up for pages, but we're still getting crawl errors with MOZ for duplicate content on all of our pagination pages. Also, we are having all of our pages indexed as well. I'm talking pages as deep as page 89 for the home page. Is this something I should ignore? Is it hurting my SEO potentially? If so, how can I start tackling it for a fix? Would "noindex" or "nofollow" be a good idea? Any help would be greatly appreciated!
Intermediate & Advanced SEO | | jampaper0 -
Unique content for international SEO?
Hi Guys, We have a e-commerce store on generic top-level domain which has 1000s of products in US. We are looking to expand to aus, uk and canda using subfolders. We are going to implement hreflang tags. I was told by our SEO agency we need to make all the content between each page unique. This should be fine for cateogry/product listing pages. But they said we need to make content unique on product pages. If we have 1000 products, thats 4000 pages, which is a big job in terms of creating content. Is this necessary? What is the correct way to approach this, won't the hreflang tag be sufficent to prevent any duplicate content issues with product pages? Cheers.
Intermediate & Advanced SEO | | geekyseotools0 -
Duplicate Content... Really?
Hi all, My site is www.actronics.eu Moz reports virtually every product page as duplicate content, flagged as HIGH PRIORITY!. I know why. Moz classes a page as duplicate if >95% content/code similar. There's very little I can do about this as although our products are different, the content is very similar, albeit a few part numbers and vehicle make/model. Here's an example:
Intermediate & Advanced SEO | | seowoody
http://www.actronics.eu/en/shop/audi-a4-8d-b5-1994-2000-abs-ecu-en/bosch-5-3
http://www.actronics.eu/en/shop/bmw-3-series-e36-1990-1998-abs-ecu-en/ate-34-51 Now, multiply this by ~2,000 products X 7 different languages and you'll see we have a big dupe content issue (according to Moz's Crawl Diagnostics report). I say "according to Moz..." as I do not know if this is actually an issue for Google? 90% of our products pages rank, albeit some much better than others? So what is the solution? We're not trying to deceive Google in any way so it would seem unfair to be hit with a dupe content penalty, this is a legit dilemma where our product differ by as little as a part number. One ugly solution would be to remove header / sidebar / footer on our product pages as I've demonstrated here - http://woodberry.me.uk/test-page2-minimal-v2.html since this removes A LOT of page bloat (code) and would bring the page difference down to 80% duplicate.
(This is the tool I'm using for checking http://www.webconfs.com/similar-page-checker.php) Other "prettier" solutions would greatly appreciated. I look forward to hearing your thoughts. Thanks,
Woody 🙂1 -
Duplicate Content www vs. non-www and best practices
I have a customer who had prior help on his website and I noticed a 301 redirect in his .htaccess Rule for duplicate content removal : www.domain.com vs domain.com RewriteCond %{HTTP_HOST} ^MY-CUSTOMER-SITE.com [NC]
Intermediate & Advanced SEO | | EnvoyWeb
RewriteRule (.*) http://www.MY-CUSTOMER-SITE.com/$1 [R=301,L,NC] The result of this rule is that i type MY-CUSTOMER-SITE.com in the browser and it redirects to www.MY-CUSTOMER-SITE.com I wonder if this is causing issues in SERPS. If I have some inbound links pointing to www.MY-CUSTOMER-SITE.com and some pointing to MY-CUSTOMER-SITE.com, I would think that this rewrite isn't necessary as it would seem that Googlebot is smart enough to know that these aren't two sites. -----Can you comment on whether this is a best practice for all domains?
-----I've run a report for backlinks. If my thought is true that there are some pointing to www.www.MY-CUSTOMER-SITE.com and some to the www.MY-CUSTOMER-SITE.com, is there any value in addressing this?0 -
Duplicate Content For E-commerce
On our E-commerce site, we have multiple stores. Products are shown on our multiple stores which has created a duplicate content problem. Basically if we list a product say a shoe,that listing will show up on our multiple stores I assumed the solution would be to redirect the pages, use non follow tags or to use the rel=canonical tag. Are there any other options for me to use. I think my best bet is to use a mixture of 301 redirects and canonical tags. What do you recommend. I have 5000+ pages of duplicate content so the problem is big. Thanks in advance for your help!
Intermediate & Advanced SEO | | pinksgreens0 -
BEING PROACTIVE ABOUT CONTENT DUPLICATION...
So we all know that duplicate content is bad for SEO. I was just thinking... Whenever I post new content to a blog, website page etc...there should be something I should be able to do to tell Google (in fact all search engines) that I just created and posted this content to the web... that I am the original source .... so if anyone else copies it they get penalised and not me... Would appreciate your answers... 🙂 regards,
Intermediate & Advanced SEO | | TopGearMedia0 -
I need to add duplicate content, how to do this without penalty
On a site I am working on we provide a landing page summary (say top 10 information snippets) and provide a link 'see more' to take viewers to a page with all the snippets. Now those first 10 snippets will be repeated in the full list. Is this going to be a duplicate content problem? If so, any suggestions.
Intermediate & Advanced SEO | | oznappies0 -
Duplicate Content, Campaign Explorer & Rel Canonical
Google Advises to use Rel Canonical URL's to advise them which page with similiar information is more relevant. You are supposed to put a rel canonical on the non-preferred pages to point back to the desired page. How do you handle this with a product catalog using ajax, where the additional pages do not exist? An example would be: <colgroup><col width="470"></colgroup>
Intermediate & Advanced SEO | | eric_since1910.com
| .com/productcategory.aspx?page=1 /productcategory.aspx?page=2 /productcategory.aspx?page=3 /productcategory.aspx?page=4 The page=1,2,3 and 4 do not physically exist, they are simply referencing additional products I have rel canonical urls' on the main page www.examplesite.com/productcategory.aspx, but I am not 100% sure this is correct or how else it could be handled. Any Ideas Pro mozzers? |0