Disallowing URL Parameters vs. Canonicalizing
-
Hi all,
I have a client that has a unique search setup. So they have Region pages (/state/city). We want these indexed and are using self-referential canonicals.
They also have a search function that emulates the look of the Region pages. When you search for, say, Los Angeles, the URL changes to _/search/los+angeles _and looks exactly like /ca/los-angeles.
These search URLs can also have parameters (/search/los+angeles?age=over-2&time[]=part-time), which we obviously don't want indexed.
Right now my concern is how best to ensure the /search pages don't get indexed and we don't get hit with duplicate content penalties. The options are this:
-
Self-referential canonicals for the Region pages, and disallow everything after the second slash in /search/ (so the main search page is indexed)
-
Self-referential canonicals for the Region pages, and write a rule that automatically canonicalizes all other search pages to /search.
Potential Concern: /search/ URLs are created even with misspellings.
Thanks!
-
-
Just so you know Meta no-index can be applied through the HTML but also through the HTTP header which might make it easier to implement on such a highly generated website
-
Yeah, I know the difference between the two, I've just been in a situation where canonicals were recommended as a means of controlling the preferred page _within an indexation context. _If that makes sense.
My biggest concern is with the creation of URLs from misspellings, which still return search results if it's close enough. The redirects could work. Honestly that wasn't something we considered.
I'm liking the noindex approach. They'd have to write a rule that applies it to every page created with a search parameter, which I think they should be able to do.
If it helps, almost the entire site is run by Javascript. Like...everything.
Thanks for the advice. Much appreciated.
-Brad
-
Robots.txt controls crawling, not indexation. Google will still sometimes index pages they cannot crawl. Canonical tags are for duplicate content consolidation, but are not a hard signal and Google frequently ignores them. Meta no-index tags (or X-robots no-index through the HTTP header, if you cannot apply Meta no-index in the HTML) is a harder signal and is meant to help you control indexation
To be honest if the pages are identical why not just 301 redirect the relevant searches (the top-line ones, which result in pages exactly the same as your regional ones) to the regional URLs? If the pages really are the same, it won't be any different for users except for a small delay during the redirect (which won't really be felt, especially if you are using Nginx redirects)
If you can't do that, you're really left with the Meta no-index tag and the canonical tag. Canonical tags avoid content duplication penalties but are a softer signal and they don't consolidate link equity like 301 redirects do (so in many way, there's not actually that much different between Meta no-index and canonicals, except canonical tags are more complex to set up in the first place as they require a destination field)
I'd probably just Meta no-index all the search URLs. Once Google had swallowed that, I would then (after 2-3 weeks) apply the relevant robots.txt rules
If you apply them both at the same time, Google won't be able to crawl the search URLs (since your robots.txt rule will block them) and therefore they will be blind to your canonical / Meta no index directive(s). So you have to handle de-indexation first, and THEN after that block the crawling to save your crawl allowance a bit
But don't do it all at once or you'll get in an unholy mess!
-
Hi there
Canonical tags prevent problems caused by identical or "duplicate" content across multiple URLs. So in this instance implement the disallow rule on al of the URLs containing /search/
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Personalized Content Vs. Cloaking
Hi Moz Community, I have a question about personalization of content, can we serve personalized content without being penalized for serving different content to robots vs. users? If content starts in the same initial state for all users, including crawlers, is it safe to assume there should be no impact on SEO because personalization will not happen for anyone until there is some interaction? Thanks,
Technical SEO | | znotes0 -
Canonicalize or Block?
Hi Mozers, We have staff profile pages w/ one main URL and then URLs with query parameters and jump links to take you to different parts of the page. The longer URLs with parameters canonicalize to the main pages but should they also be nonidexed? Thanks, Yael
Technical SEO | | yaelslater0 -
Removed URLs
Hi all, We have recently removed 200+ articles from our blog. However, those links are still being shown on Google weeks after their removal. In there a way to speed up the process? What effect will this have on our SEO ranking?
Technical SEO | | businessowner0 -
Changing URL - Ranking Disappeared?
Hi All, I named a page URL /plectrums/ within the back end framework. But then decided to change it to /personalised-plectrums/ I resubmitted a GWT sitemap and 301 redirected plectrums -> personalised-plectrums My ranking for personalised plectrums has disappeared and has not come back does anyone know why this is? Or is there something I have missed? Lewis
Technical SEO | | SO_UK0 -
Removing a URL from Search Results
I recently renamed a small photography company, and so I transferred the content to the new website, put a 301-redirect on the old website URL, and turned off hosting for that website. But when I search for certain terms that the old URL used to rank highly for (branded terms) the old URL still shows up. The old URL is "www.willmarlowphotography.com" and when you type in "Will Marlow" it often appears in 8th and 9th place on a SERP. So, I have two questions: First, since the URL no longer has a hosting account associated with it, shouldn't it just disappear from SERPs? Second, is there anything else I should have done to make the transition smoother to the new URL? Thanks for any insights you can share.
Technical SEO | | williammarlow0 -
Basic URL Structure Question
Hi, Putting together a URL for a product we are selling. We sell IT Training courses and the structure is normally Top Folder=Main Courses section Sub Folder=Vendor Page Specific=Course Name + Term An example is courses/microsoft/mcse-training However I have a product where the vendor and course name are the same. How should I best organise the URL - double mention or single mention So a) courses/togaf/togaf-foundation-training or b) courses/togaf/foundation-training
Technical SEO | | RobertChapman0 -
Singular vs plural in urls
In keyword research for an ecommerce site, I've found that widget, singular gets a lot more searches than widgets, plural AND is much less competitive. Is it better for SEO purposes to have the URLs (and matching title tags) in the catalog as /brass-widget.html, /steel-widget.html, etc., or /brass-widgets.html, etc.? I'm worried that a) searches for widgets will pass by the singular urls but not vice versa, and b) the singular form will strike visitors as bad grammar. Any advice?
Technical SEO | | AmericanOutlets0