Avoiding Duplicate Content in E-Commerce Product Search/Sorting Results
-
How do you handle sorting on ecommerce sites? Does it look something like this?
For Example:
- example.com/inventory.php
- example.com/inventory.php?category=used
- example.com/inventory.php?category=used&price=high
- example.com/inventory.php?category=used&location=seattle
If not, how would you handle this? If so, would you just include a no-index tag on all sorted pages to avoid duplicate content issues?
Also, how does pagination play into this? Would it be something like this?
For Example:
- example.com/inventory.php?category=used&price=high
- example.com/inventory.php?category=used&price=high&page=2
- example.com/inventory.php?category=used&price=high&page=3
If not, how would you handle this? If so, would you still include a no-index tag?
Would you include a rel=next/prev tag on these pages in addition to or instead of the no-index tag?
I hope this makes sense. Let me know if you need me to clarify any of this. Thanks in advance for your help!
-
Thanks everyone, for the feedback!
Dr. Pete, as always, you are a tremendous help!! I look forward to reporting back any findings I come up with during implementation.
Thanks again!
-Alex
-
Unfortunately, it does get tricky in those multi-parameter situations. Google has suggested that you NOT use canonical to solve pagination issues, unless you canonical to a "View All", and that has some restrictions. So, don't use canonical if it covers "page=2", etc.
Adam Audette has a great post on the subject, but it is complex (he just did an updated talk at SMX, but I don't have that link offhand yet):
http://searchengineland.com/five-step-strategy-for-solving-seo-pagination-problems-95494
Basically, you can use canonical and rel=prev/next together:
(1) The canonical tag would point to "?category=used&page=2"
(2) Rel=prev/next should include the "price=high" parameter, and other parameters.
Unfortunately, this makes for tricky code. See the end of this post:
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
I'm not thrilled with Google's solution, but it does seem to be working. Bing only partially supports rel=prev/next, to complicate matters.
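To make points (1) and (2) concrete, here is a rough sketch of how the tags could be generated for one page of a sorted, paginated listing. It is Python purely for illustration; the helper name `head_links`, the `last_page` argument, and treating `price` as the only sort parameter are all assumptions for this example, not anything Google prescribes.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: "price" is the only parameter that merely re-sorts the results.
SORT_PARAMS = {"price"}

def link(rel, href):
    """Format one <link> tag, escaping "&" in the URL for use in HTML."""
    return '<link rel="%s" href="%s">' % (rel, escape(href))

def head_links(url, last_page):
    """Return the <link> tags for one page of a sorted, paginated listing."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query)
    page = int(dict(params).get("page", 1))

    def page_url(n, keep_sort=True):
        # Rebuild the query for page n, optionally dropping sort parameters.
        kept = [(k, v) for k, v in params
                if k != "page" and (keep_sort or k not in SORT_PARAMS)]
        if n > 1:
            kept.append(("page", str(n)))
        return urlunsplit((parts.scheme, parts.netloc, parts.path,
                           urlencode(kept), ""))

    # (1) Canonical drops the sort parameter but keeps the page number.
    tags = [link("canonical", page_url(page, keep_sort=False))]
    # (2) rel=prev/next keep every parameter, including the sort.
    if page > 1:
        tags.append(link("prev", page_url(page - 1)))
    if page < last_page:
        tags.append(link("next", page_url(page + 1)))
    return tags

for tag in head_links(
        "http://example.com/inventory.php?category=used&price=high&page=2", 3):
    print(tag)
```

For the "page=2" URL above, the canonical points at "?category=used&page=2", while prev and next still carry "price=high". Page 1 gets no rel=prev and the last page gets no rel=next.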
The other option is to use parameter handling in Google Webmaster Tools (and Bing Webmaster Tools) to tell them what "price=" and "page=" do. That's viable if you're just trying to prevent problems (i.e. you don't have any current issues).
You can also NOINDEX the variants - Google says they don't recommend it anymore, but I still find it does work in some cases. I just wouldn't combine NOINDEX with rel=prev/next/canonical - you can end up with a mess.
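If you do go the NOINDEX route, the idea is that only the sorted or filtered variants carry the tag and the base listing stays indexable. A minimal sketch, assuming `price` and `location` (taken from the example URLs in the question) are the variant parameters; the helper name `robots_meta` is hypothetical:

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: these parameters only sort or narrow an existing listing.
VARIANT_PARAMS = {"price", "location"}

def robots_meta(url):
    """Return a noindex meta tag for variant URLs, or "" for indexable pages."""
    query = parse_qs(urlsplit(url).query)
    if any(name in VARIANT_PARAMS for name in query):
        return '<meta name="robots" content="noindex">'
    return ""

print(robots_meta("http://example.com/inventory.php?category=used&price=high"))
```

Pages that get this tag would then be left out of any rel=prev/next or canonical scheme, per the warning above about mixing them.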
-
On my ecommerce site I've just handled it in robots.txt.
You should be able to do something similar. Below is what I have, though I use X-Cart as my ecommerce platform.
User-agent: *
Disallow: /printable=Y
Disallow: /js=
Disallow: /sort=
Disallow: /sort_direction=
Disallow: /product.php
Disallow: /home.php?cat=*
Disallow: /catalog/
Disallow: /search.php
Disallow: /cart.php
Disallow: /help.php
Disallow: /giftcert.php
Disallow: /product.php
Disallow: /orders.php
Disallow: /register.php
Disallow: /icon.php
Disallow: /image.php
Disallow: /error_message.php
Disallow: /offers.php
Disallow: /product_image.php
Sitemap: http://www.domainurlhere.co.uk/sitemap.xml
-
I would use rel="canonical" pointing to example.com/inventory.php on:
- example.com/inventory.php?category=used
- example.com/inventory.php?category=used&price=high
- example.com/inventory.php?category=used&location=seattle
This should cover you for pagination: http://googlewebmastercentral.blogspot.com.au/2011/09/pagination-with-relnext-and-relprev.html
-
Regarding pagination - the URLs look fine, and you should use rel=prev/rel=next instead of the no-index tag.
Regarding sorting - Google has a handy little sheet, which you may or may not have seen, that covers this kind of issue.