Disallowing URL Parameters vs. Canonicalizing
-
Hi all,
I have a client that has a unique search setup. So they have Region pages (/state/city). We want these indexed and are using self-referential canonicals.
They also have a search function that emulates the look of the Region pages. When you search for, say, Los Angeles, the URL changes to /search/los+angeles and looks exactly like /ca/los-angeles.
These search URLs can also have parameters (/search/los+angeles?age=over-2&time[]=part-time), which we obviously don't want indexed.
Right now my concern is how best to ensure the /search pages don't get indexed and we don't get hit with duplicate content penalties. The options are:
- Self-referential canonicals for the Region pages, and disallow everything after the second slash in /search/ (so the main search page is indexed)
- Self-referential canonicals for the Region pages, and write a rule that automatically canonicalizes all other search pages to /search.
Potential Concern: /search/ URLs are created even with misspellings.
Thanks!
-
-
Just so you know, Meta no-index can be applied through the HTML but also through the HTTP header, which might make it easier to implement on such a heavily generated website.
-
Yeah, I know the difference between the two, I've just been in a situation where canonicals were recommended as a means of controlling the preferred page within an indexation context. If that makes sense.
My biggest concern is with the creation of URLs from misspellings, which still return search results if it's close enough. The redirects could work. Honestly that wasn't something we considered.
I'm liking the noindex approach. They'd have to write a rule that applies it to every page created with a search parameter, which I think they should be able to do.
If it helps, almost the entire site is run by JavaScript. Like...everything.
Thanks for the advice. Much appreciated.
-Brad
-
Robots.txt controls crawling, not indexation. Google will still sometimes index pages it cannot crawl. Canonical tags are for consolidating duplicate content, but they are not a hard signal and Google frequently ignores them. Meta no-index tags (or a noindex sent through the X-Robots-Tag HTTP header, if you cannot apply Meta no-index in the HTML) are a harder signal and are meant to help you control indexation.
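To make the header route concrete, here is a minimal Python sketch of the kind of rule the developers would need. The function name `robots_headers` and the choice to treat everything below /search/ as a generated results page are assumptions for illustration, not anything taken from the client's site; the real rule would live wherever the site sets its response headers.

```python
from urllib.parse import urlsplit

SEARCH_PREFIX = "/search/"  # assumption: all generated results live under this path


def robots_headers(url):
    """Return extra response headers for a requested URL (hypothetical helper)."""
    path = urlsplit(url).path
    # The main /search page stays indexable; anything below it is a generated
    # results page (including misspellings and filtered variants) and gets noindex.
    if path.startswith(SEARCH_PREFIX) and len(path) > len(SEARCH_PREFIX):
        return {"X-Robots-Tag": "noindex"}
    return {}
```

Because the rule keys off the path rather than a list of known cities, it would also cover the URLs created by misspelled searches.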
To be honest if the pages are identical why not just 301 redirect the relevant searches (the top-line ones, which result in pages exactly the same as your regional ones) to the regional URLs? If the pages really are the same, it won't be any different for users except for a small delay during the redirect (which won't really be felt, especially if you are using Nginx redirects)
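As a rough illustration of that redirect idea, the Python sketch below maps a top-line search URL to its Region page. The `REGION_PAGES` lookup and the `redirect_target` helper are hypothetical; a real site would build the lookup from its own list of Region pages, and the server would issue the 301 itself.

```python
from urllib.parse import urlsplit, unquote_plus

# Hypothetical lookup from a normalized search term to its Region page.
REGION_PAGES = {"los angeles": "/ca/los-angeles"}


def redirect_target(url):
    """Return the Region URL to 301 to, or None if no redirect applies."""
    parts = urlsplit(url)
    prefix = "/search/"
    # Only top-line searches qualify: filtered searches (with a query string)
    # are not identical to the Region page, so they are left alone.
    if not parts.path.startswith(prefix) or parts.query:
        return None
    term = unquote_plus(parts.path[len(prefix):]).strip().lower()
    # Misspellings and unknown terms have no entry and fall through to None.
    return REGION_PAGES.get(term)
```

Anything that returns None here (parameterized searches, misspellings) would still need the no-index treatment.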
If you can't do that, you're really left with the Meta no-index tag and the canonical tag. Canonical tags avoid content duplication penalties but are a softer signal, and they don't consolidate link equity as reliably as 301 redirects do (so in many ways, there's not actually that much difference between Meta no-index and canonicals, except canonical tags are more complex to set up in the first place as they require a destination field)
I'd probably just Meta no-index all the search URLs. Once Google has swallowed that, I would then (after 2-3 weeks) apply the relevant robots.txt rules
If you apply them both at the same time, Google won't be able to crawl the search URLs (since your robots.txt rule will block them) and therefore they will be blind to your canonical / Meta no index directive(s). So you have to handle de-indexation first, and THEN after that block the crawling to save your crawl allowance a bit
But don't do it all at once or you'll get in an unholy mess!
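For the second phase, a small Python sketch can check what a given robots.txt rule would block before it goes live. The `Disallow: /search/` rule is one way to express "everything after the second slash" from the original question, and `example.com` plus the `is_crawlable` helper are placeholders. It uses the standard library's `urllib.robotparser`, which follows the basic prefix-matching rules and may not mirror every detail of how Google interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed phase-two rule: block everything below /search/ but not /search itself.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT):
    """Report whether a generic crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    for url in (
        "https://example.com/search",
        "https://example.com/search/los+angeles",
        "https://example.com/search/los+angeles?age=over-2",
        "https://example.com/ca/los-angeles",
    ):
        print(url, is_crawlable(url))
```

Any URL reported as not crawlable is one where Google would no longer see a no-index tag or header, which is why this step comes last.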
-
Hi there
Canonical tags prevent problems caused by identical or "duplicate" content across multiple URLs. So in this instance, implement the disallow rule on all of the URLs containing /search/