Internal Duplicate Content - Classifieds (Panda)
-
I've been wondering for a while now, how Google treats internal duplicate content within classified sites.
It's quite a big issue, with customers creating their ads twice.. I'd guess to avoid the price of renewing, or perhaps to put themselves back to the top of the results. Out of 10,000 pages crawled and tested, 250 (2.5%) were duplicate adverts.
Similarly, in terms of the search results pages, where the site structure allows the same advert(s) to appear under several unique URLs. A prime example would be in this example. Notice, on this page we have already filtered down to 1 result, but the left hand side filters all return that same 1 advert.
Using tools like Siteliner and Moz Analytics just highlights these as urgent high priority issues, but I've always been sceptical.
On a large scale, would this count as Panda food in your opinion, or does Google understand the nature of classifieds is different, and treat it as such?
Appreciate thoughts.
Thanks.
-
TL;DR: You're right to be skeptical that this is an urgent issue (in my opinion), but it is something worth fixing at some point for several reasons.
I was far more concerned by search results, but I see you've added those to noindex/disallow in robots.txt, which is great. Not many people know that works!
I think it's very possible that Google understands the difference between a classified ad and an editorial content piece. They definitely treat products and content differently. That said, it's generally a good idea to avoid relying on Google's intelligence, as many have been let down by Google's failure to understand.
Duplicate content is generally something SEOs are overly-concerned with. More often than not it triggers a filter - not a "penalty." I don't see it as the most dangerous thing you could be doing by any stretch of the imagination. That said, I've seen several classified sites do the following, which I'd recommend as a "best practice" approach. At one time Craigslist did this, and may still be doing it.
- Accept non-spam ads with a pending status
- Check against listings in a given period of time for duplicates. This happens even if the ad is changed slightly, so there's some kind of semantic+image analysis going on.
- If a duplicate is found under the same user name, inform them that they've already posted the ad. From here the rules are up to you. Many sites say the ad can't be posted again for 7 days (if the old ad is deleted) or 30 days (if not). They then encourage users to buy a featured listing that shows up higher than others.
- If duplicates are found under different user names, give a warning that it's against your terms of service (make sure it is) to post duplicate ads from multiple accounts, that accounts can be banned, and have them certify the post is not the same.
You don't need to follow this exactly, but it's here to give you some ideas on having your users prevent duplicate content for you. Given the general positive architecture I've seen on the site it looks like you know what to do with the site better than I would.
Now I don't think 250 out of 10k is bad. Having consulted with a few local classified sites that's actually quite low. But I do think there's something to be gained by detecting duplicates to prevent users from gaining an unfair advantage over those playing by the rules. And if you sell featured listings this is an excellent way to help those who are most desparate to sell while increasing revenue.
I hope that helps.
Obligatory disclaimer: This is merely free advice for your consideration, and not the Moz official stance. The consequences of any changes you do or don't make are ultimately your responsibility.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do you think this case would be of a duplicated content and what would be the consequences in such case?
At the webpage https://authland.com/ which is a food&wine tours and activities booking platform, primary content - services thumbnails containing information about the destination, title and prices of the particular services, can be found at several sub-pages/urls. For example, service https://authland.com/zadar/zadar-region-food-and-wine-tour/1/. Its thumbnail/card through which the service is available, can be found on multiple pages (Categories, Destinations, All services, Most recent services...) Is this considered a duplicated content? Since all of the thumbnails for services on the platform, are to be found on multiple pages. If it is, which would be the best way to avoid that content being perceived by Google bots as such? Thank you very much!
Intermediate & Advanced SEO | | ZD20200 -
Semi-duplicate content yet authoritative site
So I have 5 real estate sites. One of those sites is of course the original, and it has more/better content on most of the pages than the other sites. I used to be top ranked for all of the subdivsion names in my town. Then when I did the next 2-4 sites, I had some sites doing better than others for certain keywords, and then I have 3 of those sites that are basically the same URL structures (besides the actual domain) and they aren't getting fed very many visits. I have a couple of agents that work with me that I loaned my sites to to see if that would help since it would be a different name. My same youtube video is on each of the respective subdivision pages of my site and theirs. Also, their content is just rewritten content from mine about the same length of content. I have looked over and seen a few of my competitors who only have one site and their URL structures arent good at all, and their content isn't good at all and a good bit of their pages rank higher than my main site which is very frustrating to say the least since they are actually copy cats to my site. I sort of started the precedent of content, mapping the neighborhood, how far that subdivision is from certain landmarks, and then shot a video of each. They have pretty much done the same thing and are now ahead of me. What sort of advice could you give me? Right now, I have two sites that are almost duplicate in terms of a template and same subdivsions although I did change the content the best I could, and that site is still getting pretty good visits. I originally did it to try and dominate the first page of the SERPS and then Penguin and Panda came out and seemed to figure that game out. So now, I would still like to keep all the sites, but I'm assuming that would entail making them all unique, which seems to be tough seeing as though my town has the same subdivisions. Curious as to what the suggestions would be, as I have put a lot of time into these sites. If I post my site will it show up in the SERPS? Thanks in advance
Intermediate & Advanced SEO | | Veebs0 -
Duplicated Content with Index.php
Good Afternoon, My website uses Joomla CMS and has the htaccess rewrite code enabled to ensure the use of search engine friendly URLs (SEF's). While browsing the crawl diagnostics I have found that Moz considers the /index.php URL a duplicate to our root. I will always under the impression that the htaccess rewrite took care of that issue and obviously I would like to address it. I attempted to create a 301 redirect from the index.php URL to the root but ran into an issue when attempting to login to the admin portion of the website as the redirect sent me back to the homepage. I was curious if anyone had advice for handling the index.php duplication issue, specifically with Joomla. Additionally, I have confirmed that in Google Webmasters, under URL parameters, the index.php parameter is set as 'Representative URL'.
Intermediate & Advanced SEO | | BrandonEML0 -
How should I manage duplicate content caused by a guided navigation for my e-commerce site?
I am working with a company which uses Endeca to power the guided navigation for our e-commerce site. I am concerned that the duplicate content generated by having the same products served under numerous refinement levels is damaging the sites ability to rank well, and was hoping the Moz community could help me understand how much of an impact this type of duplicate content could be having. I also would love to know if there are any best practices for how to manage this type of navigation. Should I nofollow all of the URLs which have more than 1 refinement used on a category, or should I allow the search engines to go deeper than that to preserve the long tail? Any help would be appreciated. Thank you.
Intermediate & Advanced SEO | | FireMountainGems0 -
Moving some content to a new domain - best practices to avoid duplicate content?
Hi We are setting up a new domain to focus on a specific product and want to use some of the content from the original domain on the new site and remove it from the original. The content is appropriate for the new domain and will be irrelevant for the original domain and we want to avoid creating completely new content. There will be a link between the two domains. What is the best practice for this to avoid duplicate content and a potential Panda penalty?
Intermediate & Advanced SEO | | Citybase0 -
Penalised for duplicate content, time to fix?
Ok, I accept this one is my fault but wondering on time scales to fix... I have a website and I put an affiliate store on it, using merchant datafeeds in a bid to get revenue from the site. This was all good, however, I forgot to put noindex on the datafeed/duplicate content pages and over a period of a couple of weeks the traffic to the site died. I have since nofollowed or removed the products but some 3 months later my site still will not rank for the keywords it was ranking for previously. It will not even rank if I type in the sites' name (bright tights). I have searched for the name using bright tights, "bright tights" and brighttights but none of them return the site anywhere. I am guessing that I have been hit with a drop x place penalty by Google for the duplicate content. What is the easiest way around this? I have no warning about bad links or the such. Is it worth battling on trying to get the domain back or should I write off the domain, buy a new one and start again but minus the duplicate content? The goal of having the duplicate content store on the site was to be able to rank the category pages in the store which had unique content on so there were no problems with that which I could foresee. Like Amazon et al, the categories would have lists of products (amongst other content) and you would click through to the individual product description - the duplicate page. Thanks for reading
Intermediate & Advanced SEO | | Grumpy_Carl0 -
Duplicate content - canonical vs link to original and Flash duplication
Here's the situation for the website in question: The company produces printed publications which go online as a page turning Flash version, and as a separate HTML version. To complicate matters, some of the articles from the publications get added to a separate news section of the website. We want to promote the news section of the site over the publications section. If we were to forget the Flash version completely, would you: a) add a canonical in the publication version pointing to the version in the news section? b) add a link in the footer of the publication version pointing to the version in the news section? c) both of the above? d) something else? What if we add the Flash version into the mix? As Flash still isn't as crawlable as HTML should we noindex them? Is HTML content duplicated in Flash as big an issue as HTML to HTML duplication?
Intermediate & Advanced SEO | | Alex-Harford0 -
Load balancing - duplicate content?
Our site switches between www1 and www2 depending on the server load, so (the way I understand it at least) we have two versions of the site. My question is whether the search engines will consider this as duplicate content, and if so, what sort of impact can this have on our SEO efforts? I don't think we've been penalised, (we're still ranking) but our rankings probably aren't as strong as they should be. The SERPs show a mixture of www1 and www2 content when I do a branded search. Also, when I try to use any SEO tools that involve a site crawl I usually encounter problems. Any help is much appreciated!
Intermediate & Advanced SEO | | ChrisHillfd0