Duplicate content via dynamic URLs where difference is only parameter order?
-
I have a question about the order of parameters in a URL and duplicate content. The two URLs below would be identical if their parameters were in the same order.
E.g.
www.example.com/page.php?color=red&size=large&gender=male versus
www.example.com/page.php?gender=male&size=large&color=red
How smart is Google at consolidating these, and do these consolidated pages incur any penalty (is their combined “weight” equal to their individual selves)?
Does Google really see these two pages as DISTINCT, or does it recognize that they are the same because they have the exact same parameters?
Is this worth fixing, or does it have a trivial impact?
If we have to fix it and can't change our CMS, should we set a preferred, canonical order for these URLs or 301 redirect from one version to the other?
Thanks a million!
-
To be fair to Highland, I do think canonical is a good bet here, but I just have to comment that I don't think Google handles these kinds of URLs very well. They should, in theory, but in my experience they rarely do. The problem with order variants is that you can easily spin 100s or 1000s of them and create serious indexation and ranking problems.
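To put a rough number on how quickly order variants multiply, here is a minimal Python sketch. The helper name `order_variants` is made up for illustration, and the parameters are just the ones from the question. With n parameters there are n! possible orderings of the same page.

```python
from itertools import permutations
from math import factorial


def order_variants(base, params):
    """Yield every URL that differs from the others only in parameter order."""
    for ordering in permutations(params):
        query = "&".join(f"{key}={value}" for key, value in ordering)
        yield f"{base}?{query}"


# The three parameters from the question.
params = [("color", "red"), ("size", "large"), ("gender", "male")]
variants = list(order_variants("www.example.com/page.php", params))

print(len(variants))  # 3! = 6 URLs for one page
print(factorial(6))   # six parameters would allow 720 orderings
```

Whether a crawler ever sees all of these depends on how the site generates its links, so treat the counts as an upper bound per page rather than a prediction.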
For this particular example, the canonical tag is probably best, but there may be cases where certain parameters have no particular value (like a "sort by" parameter). Those are sometimes better off blocked.
I cover a bunch of examples in my mega-post on duplicate content:
http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
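As a rough illustration of the canonical approach, the sketch below picks one preferred parameter order and builds the tag from it. Everything named here is an assumption for the example, not something from the original site: the function names, the `http://` scheme, alphabetical order as the preferred order, and the `IGNORED_PARAMS` set standing in for parameters like "sort by" that don't change the content.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: parameters assumed not to change what the page shows.
IGNORED_PARAMS = {"sort"}


def canonical_url(url):
    """Return the URL with ignored parameters dropped and the rest sorted."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((key, value) for key, value in pairs if key not in IGNORED_PARAMS)
    # The fragment is dropped as well, since it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


a = "http://www.example.com/page.php?color=red&size=large&gender=male"
b = "http://www.example.com/page.php?gender=male&size=large&color=red"
print(canonical_url(a) == canonical_url(b))  # True
print(canonical_link_tag(b))
```

Both order variants end up pointing at the same canonical URL, which is the whole point. The same function could also be used to decide a redirect target, but that is a separate choice from emitting the tag.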
-
Agreed with Highland: this seems exactly the kind of problem canonical can fix. I wouldn't go down the road of 301ing, because with parameters that simple you likely aren't going to run into problems. The rule of thumb is that you should act if you have more than two parameters in the URL (not sure where I read that), but I've seen Google 'figure out' up to 4 for some of my sites.
Another thing to check out is Google Webmaster Tools, where you can set certain keywords and URL parameters to help Google 'learn' how to crawl your site. This Google blog post might help too:
http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html
-
Google should recognize the difference but, just to be safe, I would add a canonical to your page so you don't have anything to worry about.