Is a Rel Canonical Sufficient, or Should I 'NoIndex'?
-
Hey everyone,
I know there is literature about this, but I'm always frustrated by technical questions and prefer a direct answer or opinion. Right now, we've got rel canonicals set up to deal with the parameters created by filters on our ticketing site. For example, this URL:
http://www.charged.fm/billy-joel-tickets?location=il&time=day rel canonicals to...
http://www.charged.fm/billy-joel-tickets
My question is whether this is good enough to deal with the duplicate content, or whether these pages should be de-indexed. If they should, is robots.txt the best way to do it, or do you have to 'noindex' each page individually?
This site has 650k indexed pages, and I suspect the majority are caused by URL parameters. They're all canonicaled to the proper place, but I'm thinking it would be best to have them de-indexed to clean things up a bit.
Thanks for any input.
-
I totally agree with EGOL on this. I would like to add my two cents, since I think I am one of the few SEO people who is also a developer.
This is what I would do (in pseudo code): put <link rel="canonical" href="<?php echo strtok($_SERVER['REQUEST_URI'], '?'); ?>"> in the head of the page.
This is in PHP. I don't know what platform you are on, but in PHP it takes the current request URI and drops the ? and everything after it. So basically it returns the URL minus the query string. I use this technique a lot with my clients for canonical URLs on CMSs that use query strings, and it works great.
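For anyone not on PHP, the same idea can be sketched in Python. This is only an illustration, not the poster's code: the function names `canonical_url` and `canonical_link_tag` are made up here, and it assumes you have the full request URL available, so the canonical comes out as an absolute URL.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(request_url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(request_url)
    # Keep scheme, host and path; blank out the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(request_url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(request_url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.charged.fm/billy-joel-tickets?location=il&time=day"
    ))
```

Every filtered variation of a page ends up pointing at the same parameter-free URL, which is the setup the original question describes.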
-
Hi - just to throw in my two cents: the canonicals should do it, as Moosa says, but if you really want to de-index, then in my experience a dynamic meta robots tag is the best way to get those pages out of the index.
That being said, from a quick look at your site, it doesn't look like those URL parameters are the issue. A search like site:charged.fm inurl:date= only shows a few thousand results, and location= and time= show even fewer, so it looks like the rel canonicals are doing the job and will continue to with a bit of patience. If you look at URLs with /event/ in them, however, you see a lot (300,000+), and I am guessing many of those are for past events. Google Webmaster Tools should help you identify what the bulk of those 600 thousand URLs are, so it is worth verifying exactly where the issue is before attempting to fix something that isn't a problem...
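A "dynamic" meta robots tag just means the page template decides per request whether to emit noindex. A minimal sketch in Python, assuming the filter parameters are the ones mentioned in this thread (location, time, date); the `FILTER_PARAMS` set and the function name are hypothetical:

```python
from urllib.parse import parse_qs, urlsplit

# Assumed list of filter parameters; adjust to whatever the site really uses.
FILTER_PARAMS = {"location", "time", "date"}


def robots_meta_tag(request_url: str) -> str:
    """Return a noindex tag for filtered URLs, or an empty string otherwise."""
    query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
    if FILTER_PARAMS.intersection(query):
        # "follow" lets crawlers keep following links on the filtered page.
        return '<meta name="robots" content="noindex, follow">'
    return ""


if __name__ == "__main__":
    print(robots_meta_tag(
        "http://www.charged.fm/billy-joel-tickets?location=il&time=day"
    ))
```

A tag like this only takes effect if crawlers are able to fetch the page and read it.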
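One way to do that verification, if you can export a list of indexed or crawled URLs, is to bucket them by parameter names and by first path segment and count each bucket. The sketch below is an assumption-laden illustration: the `bucket` and `summarize` helpers are made up, and the /event/ URLs in the example are invented placeholders, not real pages.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def bucket(url: str) -> str:
    """Label a URL by its query parameters, else by its first path segment."""
    parts = urlsplit(url)
    params = sorted(parse_qs(parts.query, keep_blank_values=True))
    if params:
        return "param:" + ",".join(params)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) > 1:
        return "/" + segments[0] + "/"
    return "(top-level page)"


def summarize(urls):
    """Count URLs per bucket, largest bucket first."""
    return Counter(bucket(u) for u in urls).most_common()


if __name__ == "__main__":
    sample = [
        "http://www.charged.fm/billy-joel-tickets?location=il&time=day",
        "http://www.charged.fm/event/123",  # invented example path
        "http://www.charged.fm/event/456",  # invented example path
        "http://www.charged.fm/billy-joel-tickets",
    ]
    for label, count in summarize(sample):
        print(count, label)
```

If the parameter buckets turn out to be small and a path like /event/ dominates, that points at old event pages rather than filters.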
-
There are a few choices for managing parameters. I have used....
A) The URL parameter manager in the "crawl" options of Google Webmaster Tools. I have found it to be totally unreliable.
B) Rel=canonical. It is much more reliable than WMT but you still must rely on search engines to discover it and obey - which can be slow to take effect and is less than 100% effective.
I have not used robots.txt because I think that it would have similar performance to rel=canonical.
I believe that you should not trust search engines to do things for you that you can do for yourself with 100% reliability. So, I am doing ......
C) Managing parameters on my server with .htaccess so I have 100% control.
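The post doesn't say what those .htaccess rules actually do. One possible reading, sketched here in Python rather than Apache rewrite rules, is that the server keeps only an allow-list of parameters and 301-redirects everything else to the cleaned URL. The `ALLOWED_PARAMS` set, the `redirect_target` name and the utm_source example are all assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list: parameters the site genuinely needs.
ALLOWED_PARAMS = {"location", "time"}


def redirect_target(request_url: str):
    """Return the URL to 301-redirect to, or None if the URL is already clean."""
    parts = urlsplit(request_url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Drop unknown parameters and sort the rest so each page has one spelling.
    kept = sorted((k, v) for k, v in pairs if k in ALLOWED_PARAMS)
    cleaned = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    return None if cleaned == request_url else cleaned


if __name__ == "__main__":
    print(redirect_target(
        "http://www.charged.fm/billy-joel-tickets?utm_source=x&time=day&location=il"
    ))
```

The point of doing it on the server is that the duplicate URL never serves a page at all, so nothing depends on a search engine noticing and obeying a hint.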
-
I believe that if you have set up the rel canonical correctly, there should ideally be no issue, but if you really do see some of your non-preferred versions indexed in Google, then you can go with the noindex idea.
When no-indexing pages you can go with any approach, but in my experience it is better to do it using robots.txt.
I hope this is a direct and to-the-point opinion :)