SEOMOZ and non-duplicate duplicate content
-
Hi all,
Looking through the lovely SEOMOZ report, by far its biggest complaint is perceived duplicate content. It's hard to avoid given the nature of eCommerce sites, which ostensibly list products in a consistent framework.
Most advice about duplicate content is about canonicalisation, but that's not really relevant when you have two different products being perceived as the same.
Thing is, I might have ignored it, but Google ignores about 40% of our sitemap, I suspect for the same reason, and I don't want us to appear "spammy". We do actually put a lot of time into photographing each product and writing a little flavour text for it (a work in progress).
I guess my question is: given over 700 products, why would 300-ish of them be considered duplicates and the remaining ones not?
Here is a URL and one of its "duplicates" according to the SEOMOZ report:
http://www.1010direct.com/DGV-DD1165-970-53/details.aspx
http://www.1010direct.com/TDV-019-GOLD-50/details.aspx
Thanks for any help, people.
-
The point I'm trying to get across is this:
"I asked the question of why these pages are considered duplicate, the answer appears to be : because textually they are even if visually they are not."
I don't think that's the complete answer, or even the most important part of the answer. Surely having mostly similar content across pages won't help, but as I've tried to point out, there are other factors that come into play here. It's not just about the content, but putting the content into context for the search engines. In order for them to understand what it is they're looking it, there's more that's important than just the content.
Michel
-
I think this highlights the fundamental problem with SEO and eCommerce sites. We are all aware that the ultimate aim for search engines, and therefore ultimately for SEO, is to add value to users. But is "value" the same for an eCommerce site as it is for a blog, a travel information site, or a site offering health information and advice?
In my opinion, it is not. If I am looking to make a purchase, I am looking for a site that is responsive, easy to navigate, has good imagery to help me visualise the product, is secure, doesn't clutter the page with in-your-face promotional info, and of course offers value for money. Unique content therefore doesn't really factor into it too much. It's hard enough for us, but I can only imagine how difficult it is for a company selling screws or rope: just how much creativity does it take to provide unique content for 3.5 inch brass screws versus 2.5 inch steel ones?
The current mantra is to stop worrying about SEO tricks and focus on building a site with value. But this particular issue is an indication that we are still not at that utopia yet. For example, as pointed out in the posts above, these pages are considered duplicate because, by percentage, the variable information is minimal. On our product page we put the functionality for filling in your prescription below the product to make it easier for the customer, but to solve the "percentage unique" issue we would need to move that onto another page. Basically, we would need to reduce value (convenience) in order to appear to add value (uniqueness).
Anyway, there is little point complaining. I asked why these pages are considered duplicate, and the answer appears to be: because textually they are, even if visually they are not.
I could be worrying about nothing. I believe all these pages are indexed (through crawling); it's just that a good proportion of our sitemap is being overlooked, and I am assuming that's down to the perceived duplication the SEOMOZ report suggests. That in turn makes me concerned Google is marking us down as spammy.
I appreciate all your comments.
Thanks
Paul
-
I do not agree. I see these kinds of pages on e-commerce websites on a daily basis. For webshops that sell only a certain kind of product, almost all product pages will look alike.
In this case, the H1 is different, the page title is different, and the description is different. Those elements are only a small portion of the page, but that's not uncommon, so I would argue it can't be just that.
I would look into the URLs and into marking up your data using http://schema.org/Product, possibly making small changes to accommodate the tags. For instance, split out brand, colour, etc. so that you can mark each one up accordingly.
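As a rough illustration of what that markup could look like, here is a small Python sketch that emits a schema.org/Product block as JSON-LD (one accepted way to express schema.org data). The brand, colour, price and image path are made-up placeholders, not values taken from the actual 1010direct page.

```python
import json

# Hypothetical product record -- in practice these values would come from the catalogue.
product = {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": "Dolce & Gabbana DD1165 970 53 glasses",  # placeholder name
    "brand": {"@type": "Brand", "name": "Dolce & Gabbana"},
    "color": "Gold",                                   # placeholder colour
    "image": "http://www.1010direct.com/images/DGV-DD1165-970-53.jpg",  # assumed path
    "offers": {
        "@type": "Offer",
        "price": "99.00",                              # placeholder price
        "priceCurrency": "GBP",
        "availability": "http://schema.org/InStock",
    },
}

# The resulting <script> block would be dropped into the product page template.
print('<script type="application/ld+json">')
print(json.dumps(product, indent=2))
print("</script>")
```

The point of splitting brand, colour and price into their own fields is that the search engine gets a machine-readable statement of what makes each page distinct, even when the surrounding template text barely changes.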
-
Tom has this spot on. Google doesn't only look for direct duplication but also for very similar content, and these pages really are very similar, I'm afraid.
You need to find ways to make each page unique in its own right - let Google see that no two pages are the same and there is a real reason to rank them.
-
I wonder if the details.aspx has something to do with it?
www.1010direct.com/TDV-019-GOLD-50/details.aspx
www.1010direct.com/DGV-DD1165-970-53/details.aspx
Basically, both pages are called details.aspx. Depending on how you look at it, you have two pages with the same name (and mostly similar content, though that's not unusual for e-commerce websites) sitting in different subfolders. I'm not sure how Moz treats that, and whether it's part of why Moz marks these as duplicate content?
Are you unable to create 'prettier' URLs? Such as:
www.1010direct.com/tim-dilsen-019-gold-50-glasses.aspx
www.1010direct.com/dolce-gabbana-dd1165-970-53-glasses.aspx
With or without the .aspx, of course.
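As a sketch of how those friendlier URLs could be generated, here is a small Python helper that builds a slug from brand, model and product-type fields. The values below are just the examples from the suggested URLs above; the helper is an illustration rather than a description of how 1010direct's catalogue is actually structured.

```python
import re

def make_slug(*parts: str) -> str:
    """Join the parts, lower-case them, and reduce everything else to hyphens."""
    text = "-".join(parts).lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)  # collapse spaces/punctuation into hyphens
    return text.strip("-")

# Hypothetical catalogue fields for the two pages discussed above.
print(make_slug("Tim Dilsen", "019", "Gold", "50", "glasses"))
# -> tim-dilsen-019-gold-50-glasses
print(make_slug("Dolce Gabbana", "DD1165", "970", "53", "glasses"))
# -> dolce-gabbana-dd1165-970-53-glasses
```

On an ASP.NET site the slugs would still need to be wired up with URL rewriting or routing so that the pretty URL resolves to the same details page, which is a separate piece of work from generating the slug itself.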
-
I'm not surprised Moz is flagging those pages as duplicate content, and I wouldn't be totally surprised if Google did too in the future.
Put it this way: the pages are identical bar a single-sentence title description, a price, and a roughly 20-word section describing the product. Everything else is the same. It's duplicate content.
Look at it another way, through Google's eyes. Here's how the two pages look when crawled by Google:
(If that doesn't work, try it yourself at http://www.seo-browser.com/)
Just look at how much text and HTML is shared between the two pages. Yes, there are key differences (namely the product), but neither Googlebot nor Mozbot is going to recognise those elements when it crawls the page.
Presuming Google ignores the site nav, there is still a bunch of shared text and crawlable elements - pretty much everything under the product description. It doesn't see the individual images, and the flavour text is frankly too short to make any sort of dent in the duplicate-content percentage.
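Neither Moz nor Google publishes the exact formula behind that percentage, but the idea can be sketched: strip both pages down to their visible text and measure how much of it is shared. The Python snippet below is only an illustration of that idea using the standard library, not either crawler's actual algorithm, and it assumes the two URLs are still reachable.

```python
import difflib
import re
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def visible_text(url: str) -> str:
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="ignore")
    parser = TextExtractor()
    parser.feed(html)
    # Collapse whitespace so layout differences don't affect the comparison.
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


a = visible_text("http://www.1010direct.com/DGV-DD1165-970-53/details.aspx")
b = visible_text("http://www.1010direct.com/TDV-019-GOLD-50/details.aspx")

# ratio() returns 0.0-1.0; a very high ratio means the pages share
# almost all of their visible text.
print(f"Shared text ratio: {difflib.SequenceMatcher(None, a, b).ratio():.0%}")
```

If the ratio comes out somewhere in the 90%+ range, it's easy to see why the pair gets flagged, whatever threshold the report actually applies.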
I'd seriously recommend revising how your product pages look - there's far too much repeated content per page (you can still promote these things on each page, but in a much, much smaller way) and the individual descriptions for the products, in my eyes, are not substantial enough.