Can I leave off HTTP/HTTPS in a canonical tag?
-
We are working on moving our site to HTTPS, and my dev team asked whether it is required to declare HTTP or HTTPS in the canonical tag. I know that relative URLs are acceptable, but I cannot find anything about leaving the protocol off entirely.
Example of what they would like to do
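Something along these lines, using one of our pages (the same URL that comes up later in the thread), with the protocol left off the href:

```html
<link rel="canonical" href="//www.alaskaair.com/content/deals/flights/cheapest-flights-to-hawaii.aspx" />
```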
Has anyone done this?
Any reason to not leave off the protocol?
-
Very good to hear, thanks Shawn! The goal is to use absolute canonicals, but for a period of time we may have to use protocol-relative URLs. The redirects in place should avoid any duplicate content issues, which seems to be the big landmine.
-
That's good to know. Thanks for the update Shawn.
Since the initial discussion took place, several Google reps have publicly stated that there is no PageRank loss between redirects and rel="canonical" tags. This seems to substantiate their claim.
The biggest issue with these arises when you give conflicting instructions to user agents, such as a redirect to a page whose rel="canonical" points back to the URL it was redirected from, creating an infinite loop. For example, if you redirected from HTTP to HTTPS, but the HTTPS version had a rel="canonical" tag that was hard-coded to the HTTP version.
The above issue doesn't apply here because you're redirecting from HTTP to HTTPS, and the HTTPS version serves a protocol-relative canonical tag that resolves to the HTTPS URL.
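A hypothetical sketch of both cases (example.com as a stand-in):

```html
<!-- Conflicting signals: http://www.example.com/page.html 301-redirects to HTTPS,
     but the HTTPS page hard-codes its canonical back to HTTP, creating a loop: -->
<link rel="canonical" href="http://www.example.com/page.html" />

<!-- A protocol-relative canonical on the HTTPS page resolves to HTTPS, so no loop: -->
<link rel="canonical" href="//www.example.com/page.html" />
```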
-
Now that our entire site is HTTPS, there does not seem to be any negative impact on our URLs from leaving the protocol off. If any traffic was lost, it wasn't significant; our reports did not indicate a decline. One year later, organic search traffic is higher than before we implemented the change.
I personally agree with Everett: don't leave things to chance. I did require that the homepage canonical specify HTTPS, though, and I had massive panic attacks while we were going through the transition. However, if you are unable to convince your developers of the importance of using an absolute path for the canonical, this did not seem to have a negative impact on our site.
I am glad that we didn't have any noticeable impact, but I am also glad that I didn't turn it into a bigger issue within our leadership team. Had I escalated it and then seen nothing negative, it could've reduced my credibility within the business, which would've made it more difficult to get buy-in on larger SEO problems.
BTW, we are still using protocol-relative canonical tags today (except the homepage, which still specifies HTTPS).
-
Hey Shawn, did using a protocol-relative URL (no HTTP/HTTPS specified) work for you in the canonical and/or hreflang tags? We are going through a transition to HTTPS as well and have multiple systems with some hard-coded URLs. I'm hoping this solution would work as a short-term fix while we update these pages to use a new, more dynamic system.
-
Shawn,
My advice would be to canonical everything to the HTTPS version using an absolute path. That would be the best practice. I understand that is not what you're doing and you aren't getting any errors, but site-wide use of rel="canonical" tags is something that can do more harm than good if a search engine misinterprets what you're trying to accomplish.
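In other words, every page would carry an absolute HTTPS canonical; for the page you linked earlier, something like:

```html
<link rel="canonical" href="https://www.alaskaair.com/content/deals/flights/cheapest-flights-to-hawaii.aspx" />
```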
Either way, good luck and keep us posted.
-
No worries Shawn. I also hope it doesn't cause issues down the line. Everything in me is screaming "Don't do it!"
Best of luck.
-Andy
-
I know, and that's what sucks. It appears to work, but it goes against what seems to be best practice, and since I cannot find any documentation one way or the other, it's hard not to follow their logic.
I just hope it doesn't screw up everything in the end. Thanks for the discussion.
-
Well, if it works (which I didn't think it would!), then I guess that answers one question. I ran that page through Screaming Frog just to confirm there are no issues, and it does indeed canonical back to the HTTPS version of the page.
I just can't get out of the mindset that the format looks wrong. I haven't seen it done that way elsewhere and, like you, have no documentation to suggest what issues it might cause.
Sorry I can't be of more help.
-Andy
-
Thanks Andy, I posted a reply to the other response that ties into your comment here. On the page I listed above, there are no errors if I use HTTPS and the canonical doesn't declare a protocol. We have SSL certs, we just haven't made the big switch yet.
-
Thanks for the answers, all of which I've passed on to them.
They have attempted this on a page and have not seen any errors or issues as of yet, which is problematic for me: if I cannot show an issue resulting from their shortcuts, they will not necessarily listen to my feedback.
Here is the URL where they have left the protocol off the canonical:
http://www.alaskaair.com/content/deals/flights/cheapest-flights-to-hawaii.aspx
I use the Chrome extension Canonical, which doesn't give me the icon indicating that I am not viewing the preferred URL. When I use HTTPS and view source, it looks the same as it does with HTTP. Sometimes there are parameters in the URL, like ?INT=AS_HomePage_-prodID:SEO, and even with the protocol missing from the canonical it still seems to work.
Since I cannot find any documentation against doing it this way, I am getting strong resistance to declaring HTTP now and going back to update it when the site moves to HTTPS. As I've stated above, they are already using this approach for links and assets, since our site moves back and forth between HTTPS and HTTP depending on what the customer is doing, and they have found that leaving off the protocol makes their lives easier and limits the errors that Andy mentions below.
https://www.alaskaair.com/content/deals/flights/cheapest-flights-to-hawaii.aspx
-
Hi again
To be clear, I think this would populate http://www.domain.com//www.domain.com as where the canonical would be attributed.
Hope this makes a bit more sense. Good luck!
-
Example of what they would like to do
That would be a no-no, Shawn. If you are running over SSL, then you need to canonical back to the HTTPS version of the page. If you don't, you will end up with errors on the page (yellow warning triangle) and trust issues with Google. What they would like to do is canonical to a malformed URL, which could be interpreted as a file path.
Try going to any URL and just entering it as //www.domain.com
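To sketch the concern (domain.com as a stand-in):

```html
<!-- What the devs propose: -->
<link rel="canonical" href="//www.domain.com/page.html" />
<!-- The worry: a parser that doesn't honor protocol-relative hrefs could resolve
     this as a path, e.g. http://www.domain.com//www.domain.com/page.html -->

<!-- The unambiguous form when serving over SSL: -->
<link rel="canonical" href="https://www.domain.com/page.html" />
```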
-Andy
-
Hi there
According to Google...
Avoid errors: use absolute paths rather than relative paths with the rel="canonical" link element.
However, they then say (under "Prefer HTTPS over HTTP for canonical URLs")...
Google prefers HTTPS pages over equivalent HTTP pages as canonical, except when there are conflicting signals such as the following:
- The HTTPS page has an invalid SSL certificate.
- The HTTPS page contains insecure dependencies.
- The HTTPS page is roboted (and the HTTP page is not).
- The HTTPS page redirects users to or through an HTTP page.
- The HTTPS page has a rel="canonical" link to the HTTP page.
- The HTTPS page contains a noindex robots meta tag.
Although our systems prefer HTTPS pages over HTTP pages by default, you can ensure this behavior by taking any of the following actions:
- Add 301 or 302 redirects from the HTTP page to the HTTPS page.
- Add a rel="canonical" link from the HTTP page to the HTTPS page.
- Implement HSTS.
To prevent Google from incorrectly making the HTTP page canonical, you should avoid the following practices:
- Bad SSL certificates and HTTPS-to-HTTP redirects cause us to prefer HTTP very strongly. Implementing HSTS cannot override this strong preference.
- Including the HTTP page in your sitemap or hreflang entries rather than the HTTPS version.
- Implementing your SSL/TLS certificate for the wrong host-variant: for example, example.com serving the certificate for www.example.com. The certificate must match your complete site URL, or be a wildcard certificate that can be used for multiple subdomains on a domain.
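As a concrete sketch of the second recommended action (example.com as a placeholder), the HTTP version of a page would carry:

```html
<!-- On http://www.example.com/page.aspx, point the canonical at the HTTPS version: -->
<link rel="canonical" href="https://www.example.com/page.aspx" />
```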
Since I don't know how your SSL is configured, I can't tell you one way or the other, but if you have an HTTPS version of your pages, then head in that direction. A protocol-relative URL doesn't seem like it will work here for what you're asking.
Read the above and let me know if that helps! Good luck!
-
I did read that before I asked; it didn't really answer my question. I understand that relative URLs work, but leaving off the protocol declaration isn't a relative path; it just inherits whichever protocol, secure or not, the page is served over.
Since we use multiple systems across our site, there isn't an easy way to implement relative or absolute canonical tags, which is why the devs want to know if they can implement them without HTTP/HTTPS. They already do this with assets on the site and have started to code links in a similar manner. What I can't determine is whether this will cause issues.
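For illustration (the file and page names here are hypothetical), this is the protocol-relative pattern they already use for assets and links:

```html
<!-- These inherit the protocol of whatever page the customer is currently on: -->
<script src="//www.alaskaair.com/scripts/site.js"></script>
<a href="//www.alaskaair.com/content/deals/">Deals</a>
```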
-
Hi there
According to Google, they want you to use either relative or absolute URLs. You can read more here.
I recommend reading this so you can see the types of common mistakes they find and how to resolve those.
Good luck!