SEOMOZ and non-duplicate duplicate content
-
Hi all,
Looking through the lovely SEOMOZ report, by far its biggest complaint is that of perceived duplicate content. It's hard to avoid given the nature of eCommerce sites, which ostensibly list products in a consistent framework.
Most advice about duplicate content is about canonicalisation, but that's not really relevant when you have two different products being perceived as the same.
Thing is, I might have ignored it, but Google ignores about 40% of our sitemap for, I suspect, the same reason. Basically, I don't want us to appear "spammy". We actually do go to a lot of trouble to photograph each product and write a little flavour text for it (in progress).
I guess my question is: given over 700 products, why would 300-ish of them be considered duplicates and the remaining ones not?
Here is a URL and one of its "duplicates" according to the SEOMOZ report:
http://www.1010direct.com/DGV-DD1165-970-53/details.aspx
http://www.1010direct.com/TDV-019-GOLD-50/details.aspx
Thanks for any help, people.
-
The point I'm trying to get across is this:
"I asked the question of why these pages are considered duplicate, the answer appears to be : because textually they are even if visually they are not."
I don't think that's the complete answer, or even the most important part of the answer. Having mostly similar content across pages certainly won't help, but as I've tried to point out, there are other factors that come into play here. It's not just about the content, but about putting the content into context for the search engines. For them to understand what it is they're looking at, more matters than just the content.
Michel
-
I think this highlights the fundamental problem with SEO and eCommerce sites. We are all aware that the ultimate aim for search engines, and therefore ultimately for SEO, is to add value to users. But is "value" the same for an eCommerce site as it is for a blog, a travel information site or a site offering health information and advice?
In my opinion, it is not. If I am looking to make a purchase, I am looking for a site that is responsive, easy to navigate, has good imagery to help me visualise, is secure, doesn't clutter the page with in-your-face promotional info, and of course offers value for money. Unique content therefore doesn't really factor into it too much. It's hard enough for us, but I can only imagine how difficult it is for a company selling screws or rope; just how much creativity does it take to write unique content for 3.5 inch brass screws versus 2.5 inch steel ones?
The current mantra is to stop worrying about SEO tricks and focus on building a site with value. But this particular issue is an indication that we are still not at that utopia yet. For example, as pointed out in the posts above, these pages are considered duplicate because, by percentage, the variable information is minimal. If you look at our product page, we put the functionality for filling in your prescription below the product to make it easier for the customer, but in order to solve the "percentage unique" issue we would need to move that onto another page. Basically, we need to reduce value (convenience) in order to appear to add value (uniqueness).
Anyway, there is little point complaining. I asked the question of why these pages are considered duplicate, and the answer appears to be: because textually they are, even if visually they are not.
I could be worrying about nothing. I believe all these pages are indexed (through crawling); it's just that a good proportion of our sitemap is being overlooked, and I am assuming it's the perceived duplication suggested in the SEOMOZ report. That in turn makes me concerned Google is marking us down as spammy.
I appreciate all your comments.
Thanks
Paul
-
I do not agree. I see these kinds of pages on e-commerce websites on a daily basis. For webshops that sell only a certain kind of product, almost all product pages will look alike.
In this case, the H1 is different, the page title is different, and the description is different. Those differences are only a small portion of the page, but that's not uncommon for e-commerce sites, so I would argue it can't be just that.
I would look into the URLs, and into marking up your data using http://schema.org/Product, possibly making small changes to accommodate the tags - for instance, splitting up brand, colour etc. so that you can mark them up accordingly.
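To make that concrete, here's a minimal sketch of what schema.org/Product microdata might look like on one of your product pages. The brand, name, colour and price below are placeholders, not your actual data:

```html
<!-- Hypothetical product block marked up with schema.org/Product microdata -->
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">DD1165 970 53 Glasses</h1>
  <span itemprop="brand">Dolce &amp; Gabbana</span>
  <span itemprop="color">Gold</span>
  <p itemprop="description">Short, unique flavour text written for this specific frame.</p>
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <span itemprop="price">99.00</span>
    <meta itemprop="priceCurrency" content="GBP" />
  </div>
</div>
```

Splitting brand, colour and price into separately marked-up elements like this also gives each page a bit more machine-readable structure to differentiate it from its neighbours.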
-
Tom has this spot on. Google doesn't only look for direct duplication but also for pages that are very similar, and these really are very similar, I'm afraid.
You need to find ways to make each page unique in its own right - let Google see that no two pages are the same and there is a real reason to rank them.
-
I wonder if the details.aspx has something to do with it?
www.1010direct.com/TDV-019-GOLD-50/details.aspx
www.1010direct.com/DGV-DD1165-970-53/details.aspx
Basically, both pages are called details.aspx. Depending on how you look at it, you have two pages with the same name (and mostly similar content, though that's not unusual for e-commerce websites) sitting in different subfolders. I'm not sure exactly how Moz handles this, but it could be part of why Moz marks these as duplicate content.
Are you unable to create 'prettier' URLs? Such as:
www.1010direct.com/tim-dilsen-019-gold-50-glasses.aspx
www.1010direct.com/dolce-gabbana-dd1165-970-53-glasses.aspx
With or without the .aspx, of course.
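If the site runs on IIS/ASP.NET (which the .aspx pages suggest), one possible way to serve friendlier URLs without rebuilding the product pages is the IIS URL Rewrite module. This is only a sketch under that assumption - the rule simply maps a descriptive URL onto the existing folder-style URL, and the specific paths are illustrative:

```xml
<!-- web.config sketch (IIS URL Rewrite module). Assumes the existing
     /PRODUCT-CODE/details.aspx URLs keep working internally; the friendly
     URL shown here is purely an example. -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Friendly product URL" stopProcessing="true">
        <!-- Incoming request: /dolce-gabbana-dd1165-970-53-glasses -->
        <match url="^dolce-gabbana-dd1165-970-53-glasses/?$" />
        <!-- Served internally by the existing product page -->
        <action type="Rewrite" url="DGV-DD1165-970-53/details.aspx" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

In practice you would generate a rule (or a rewrite map entry) per product rather than hand-writing them, and 301-redirect the old URLs to the new ones so the change doesn't create yet another set of duplicates.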
-
I'm not surprised Moz is flagging those pages as duplicate content and I wouldn't be totally surprised if Google did in the future.
Put it this way: the pages are identical bar a single-sentence title description, a price and a roughly 20-word section describing the product. Everything else is identical. It's duplicate.
Look at it another way, through Google's eyes. To see how the two pages look when they're crawled, run them both through a text-only viewer such as http://www.seo-browser.com/.
Just look at how much text and HTML is shared between the two pages. Yes, there are key differences on the pages (namely the product), but neither Googlebot nor Mozbot is going to recognise those elements when it crawls the page.
Presuming Google ignores the site nav, it still has a bunch of text and crawlable elements that are shared - pretty much everything under the product description. It doesn't see the individual images and the flavour text is frankly too small to make any sort of dent in the duplicate content %.
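If you want a rough feel for just how much crawlable text the two pages share, a quick back-of-the-envelope check like this can make the point. It's standard-library Python and nothing like the way Moz or Google actually score similarity - purely an illustration:

```python
import re
import urllib.request
from difflib import SequenceMatcher

def visible_text(url):
    """Fetch a page and crudely strip scripts, styles and tags to approximate the text a crawler sees."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    html = re.sub(r"(?is)<(script|style).*?</\1>", " ", html)  # drop script/style blocks
    text = re.sub(r"(?s)<[^>]+>", " ", html)                    # drop remaining tags
    return re.sub(r"\s+", " ", text).strip()                    # collapse whitespace

a = visible_text("http://www.1010direct.com/DGV-DD1165-970-53/details.aspx")
b = visible_text("http://www.1010direct.com/TDV-019-GOLD-50/details.aspx")

# 1.0 means the extracted text is identical; a value in the high nineties
# means the shared template dwarfs the product-specific copy.
print(f"Similarity: {SequenceMatcher(None, a, b).ratio():.2%}")
```

If that number comes back well over 90%, it's not hard to see why a crawler would treat the two pages as near-duplicates.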
I'd seriously recommend revising how your product pages look - there's far too much repeated content per page (you can still promote these things on each page, but in a much, much smaller way) and the individual descriptions for the products, in my eyes, are not substantial enough.