Filtered Navigation, Duplicate content issue on an Ecommerce Website
-
I have navigation that allows for multiple levels of filtering. What is the best way to prevent the search engine from seeing this duplicate content? Is it a big deal nowadays? I've read many articles and I'm not entirely clear on the solution.
For example:
You have a page that lists 12 products out of 100:
companyname.com/productcategory/page1.htm
And then you filter these products:
companyname.com/productcategory/filters/page1.htm
The filtered page may or may not contain items from the original page, but it does contain items that appear somewhere in the unfiltered navigation pages. How do you help the search engine determine which of the pages containing these products it should crawl and index?
I can't use rel=canonical, because the exact set of products on the filtered page may not be on any other unfiltered page. What about using robots.txt to block all the filtered pages? Will that also stop PageRank from flowing? What about the meta noindex tag on the filtered pages?
I have also considered removing filters entirely, but I'm not sure if sacrificing usability is worth it in order to remove duplicate content. I've read a bunch of blogs and articles, seen the whiteboard special on faceted navigation, but I'm still not clear on how to deal with this issue.
-
Hi Dstrunin,
I would still use the rel canonical tag, with or without the filter in place. So if you have a list of products displayed unfiltered at companyname.com/productcategory/page1.htm, I would add a rel canonical tag to that page pointing at itself, companyname.com/productcategory/page1.htm. For the filtered results at companyname.com/productcategory/filters/page1.htm, the canonical tag would still point to companyname.com/productcategory/page1.htm.
It doesn't hurt to have a canonical tag point to the same page it's on.
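As a rough sketch of that mapping, the Python below derives one canonical URL for both the filtered and the unfiltered listing. It assumes the filter shows up as a `filters` path segment, as in the example URLs, and that the site is served over https; the function names are made up for illustration and are not part of any platform.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: filtered listings differ only by this path segment,
# as in companyname.com/productcategory/filters/page1.htm.
FILTER_SEGMENT = "filters"


def canonical_url(url: str) -> str:
    """Return the unfiltered URL that a listing page should declare as canonical."""
    parts = urlsplit(url)
    # Drop the filter segment; every other path segment is kept in order.
    segments = [s for s in parts.path.split("/") if s != FILTER_SEGMENT]
    # Query string and fragment are dropped so variants collapse to one URL.
    return urlunsplit((parts.scheme, parts.netloc, "/".join(segments), "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://companyname.com/productcategory/filters/page1.htm"))
    print(canonical_link_tag("https://companyname.com/productcategory/page1.htm"))
```

Both calls produce the same tag, so the unfiltered page ends up with a canonical pointing at itself.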
If you can't do that, I would add a meta noindex tag to those filtered pages and remove the robots.txt rules. Robots.txt doesn't tell Google it can't index a page; it only says Google can't crawl it. A blocked page is never fetched, so Google never sees the noindex tag on it. That means Google could still index old content it crawled before you added the robots.txt rules, or index the title tags.
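One way to check for that conflict is to test whether the filtered URLs are still disallowed in robots.txt, since a disallowed page is not fetched and its noindex tag goes unread. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt contents, the https scheme, and the helper name are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The tag that would go in the <head> of each filtered page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def noindex_can_be_seen(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """True if robots.txt lets the crawler fetch the page and so read its noindex tag."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    filtered = "https://companyname.com/productcategory/filters/page1.htm"

    # Hypothetical rule that blocks the filtered pages: the tag is never read.
    blocking = "User-agent: *\nDisallow: /productcategory/filters/\n"
    print(noindex_can_be_seen(blocking, filtered))

    # Rule removed: the page can be crawled and the noindex tag can take effect.
    open_rules = "User-agent: *\nDisallow:\n"
    print(noindex_can_be_seen(open_rules, filtered))
```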
Casey
-
I have been doing that, but robots.txt only does so much. I've implemented the meta noindex tag as well and it doesn't seem to be taking all the pages out of the index.
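When pages linger in the index, one thing worth ruling out is that the tag is missing from some templates. The following is only a sketch of such a check, using Python's standard `html.parser` on HTML you have already downloaded; the class and function names are hypothetical, and it does not cover directives sent as HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        # content="noindex, follow" becomes {"noindex", "follow"}
        for directive in (attributes.get("content") or "").split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_noindex(html_text: str) -> bool:
    """True if the page's HTML carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```

Running this over the HTML of each filtered URL would show which pages are actually serving the tag.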
-
My unprofessional opinion would be to use robots.txt on some areas. I'll also be interested to see what the pros here say.