Is a Rel Canonical Sufficient or Should I 'NoIndex'?
-
Hey everyone,
I know there is literature about this, but I'm always frustrated by technical questions and prefer a direct answer or opinion. Right now, we've got rel canonicals set up to deal with parameters caused by filters on our ticketing site. For example, this URL:
http://www.charged.fm/billy-joel-tickets?location=il&time=day has a rel canonical pointing to...
http://www.charged.fm/billy-joel-tickets
My question is whether this is good enough to deal with the duplicate content, or whether the parameterized pages should be de-indexed. If they should, is the best way to do this with robots.txt? Or do you have to 'noindex' these pages individually?
This site has 650k indexed pages and I suspect that the majority of these are caused by URL parameters. They all have canonicals pointing to the proper place, but I am thinking that it would be best to have them de-indexed to clean things up a bit.
Thanks for any input.
-
I totally agree with EGOL on this. I would like to add my two cents, since I think I am one of the few SEO people who is a developer too.
This is what I would do (roughly): put a <link rel="canonical" href="<?php echo htmlspecialchars(strtok($_SERVER['REQUEST_URI'], '?')); ?>"> in the <head> of the page.
This is in PHP; I don't know what platform you are on. What it does is take the current request URI and cut off the ? and everything after it, so the canonical comes out as the URL minus the query string. Note that REQUEST_URI holds only the path, so prepend your scheme and host if you want an absolute canonical URL. I use this technique a lot with my clients for canonical URLs on CMSs that use query strings, and it works great.
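For anyone not on PHP, the same idea can be sketched in a platform-neutral way. The snippet below is only an illustration in Python using the standard library; the function names `canonical_url` and `canonical_link_tag` are made up for this example, and it assumes the canonical version of a page is always the URL without its query string, as in the Billy Joel example from the question.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop everything after '?' or '#'.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build a <link rel="canonical"> element for the page at `url`."""
    # Escape the href so special characters cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    page = "http://www.charged.fm/billy-joel-tickets?location=il&time=day"
    print(canonical_link_tag(page))
```

Unlike the REQUEST_URI version, this works on the full URL, so the resulting canonical is absolute.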
-
Hi - Just to throw in my two cents - the canonicals should do it, as Moosa says, but if you really want to de-index, then in my experience a dynamic meta robots tag is the best way to get those pages out of the index.
That being said, from a quick look at your site, those URL parameters don't appear to be the issue. A search like site:charged.fm inurl:date= only shows a few thousand results, and location= and time= show even fewer, so it looks like the rel canonicals are doing the job and will continue to with a bit of patience. If you look at URLs with /event/ in them, however, you see a lot (300,000+), and I am guessing many of those are for past events. Google Webmaster Tools should help you identify what the bulk of those 600 thousand URLs are, so it is worth verifying exactly where the issue is before attempting to fix something that isn't a problem...
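To illustrate what a "dynamic meta robots tag" could look like, here is a minimal Python sketch. It is an assumption about one way to do it, not a description of how charged.fm works: the function name `robots_meta_tag` is hypothetical, and the parameter list is simply the three names that came up in this thread (location, time, date).

```python
from urllib.parse import parse_qs, urlsplit

# Assumed filter parameters, taken from the URLs mentioned in this thread.
FILTER_PARAMS = {"location", "time", "date"}


def robots_meta_tag(url: str) -> str:
    """Return a meta robots tag: noindex for filtered URLs, index otherwise."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Any known filter parameter in the query marks the page as a filtered view.
    if FILTER_PARAMS.intersection(query):
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'


if __name__ == "__main__":
    print(robots_meta_tag("http://www.charged.fm/billy-joel-tickets?location=il&time=day"))
    print(robots_meta_tag("http://www.charged.fm/billy-joel-tickets"))
```

Keep in mind that a crawler has to be able to fetch the page to see the tag, so pages handled this way should not also be blocked in robots.txt.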
-
There are a few choices for managing parameters. I have used....
A) The URL parameter manager in the "crawl" options of Google Webmaster Tools. I have found it to be totally unreliable.
B) Rel=canonical. It is much more reliable than WMT but you still must rely on search engines to discover it and obey - which can be slow to take effect and is less than 100% effective.
I have not used robots.txt because I think that it would have similar performance to rel=canonical.
I believe that you should not trust search engines to do things for you that you can do for yourself with 100% reliability. So, I am doing ......
C) Managing parameters on my server with .htaccess so I have 100% control.
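The post doesn't say which .htaccess rules are used, so the following is only a guess at the general idea: decide on the server which parameters are allowed and 301 everything else to the clean URL. The sketch shows that decision logic in Python rather than as .htaccess rules; `redirect_target` and its `allowed_params` argument are hypothetical names for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redirect_target(url, allowed_params=frozenset()):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep only the parameters the server has explicitly allowed.
    kept = [(key, value) for key, value in pairs if key in allowed_params]
    if kept == pairs:
        return None  # Nothing was stripped, so serve the page as requested.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    url = "http://www.charged.fm/billy-joel-tickets?location=il&time=day"
    print(redirect_target(url))
    print(redirect_target(url, allowed_params={"location"}))
```

This only suits parameters that don't change what the visitor sees, such as tracking codes. If a parameter drives a filter, redirecting it away would switch the filter off, so those would need to stay on the allow-list.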
-
I believe that if you have set up the rel canonical correctly, there should ideally be no issue. But if you really do see some of your non-preferred versions indexed in Google, then you can go with the noindex idea.
When no-indexing pages you can go with any approach, but in my experience it is better to do it using robots.txt.
I hope this is a direct and to-the-point opinion :)