Should we use the rel=canonical tag?
-
We have a secure version of our site, as we often gather sensitive business information from our clients.
Our https pages have been indexed as well as our http version.
-
Could it still be a problem to have an http and an https version of our site indexed by Google? Is this seen as being a duplicate site?
-
If so, can this be resolved with a rel=canonical tag pointing to the http version?
Thanks
-
-
Agreed - this is generally an issue with relative paths, and job one is to fix it. In most cases, you really don't want these crawled at all. I do think rel=canonical is a good bet here - 301 redirects can get really tricky with http/https, and you can end up creating loops. It can be done right, but it's also easy to screw up, in my experience.
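To illustrate how an http/https redirect setup can end up looping, here is a minimal sketch. The `redirect_map` dictionary and the `follow_redirects` helper are hypothetical names made up for this example; a real server expresses these rules in its own config, not in Python.

```python
def follow_redirects(start_url, redirect_map, max_hops=10):
    """Follow redirects in redirect_map and report whether they loop.

    redirect_map is a hypothetical stand-in for server redirect rules:
    it maps a requested URL to the URL it 301-redirects to.
    Returns (chain_of_urls, loop_detected).
    """
    chain = [start_url]
    url = start_url
    while url in redirect_map:
        url = redirect_map[url]
        if url in chain:
            # We have been sent back to a URL we already visited.
            return chain + [url], True
        chain.append(url)
        if len(chain) > max_hops:
            break
    return chain, False


# Two rules that contradict each other: one forces https, the other forces http.
conflicting_rules = {
    "http://www.example.com/contact/": "https://www.example.com/contact/",
    "https://www.example.com/contact/": "http://www.example.com/contact/",
}
chain, looped = follow_redirects("http://www.example.com/contact/", conflicting_rules)
print(looped, chain)
```

Running a list of known URLs through a check like this before deploying redirect rules is one way to catch the "easy to screw up" cases.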
-
-
Yes, having two versions of the same content can be seen as duplicate content and could cause issues.
-
Yes, include a canonical tag in the <head> of each page (assuming the http and https pages are close to identical). This will help Google's crawler figure out which version of the page to show in the search results.
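As a rough sketch of what that tag could look like, the snippet below builds a canonical link element for a page URL, forcing the http scheme as the question proposes. The `canonical_tag` function and the example.com URL are assumptions for illustration only; how the tag actually gets into your templates depends on your CMS.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(page_url: str, preferred_scheme: str = "http") -> str:
    """Return a <link rel="canonical"> tag that points at the preferred scheme.

    Hypothetical helper: keeps the host, path and query string, swaps only
    the scheme, and drops any #fragment.
    """
    parts = urlsplit(page_url)
    canonical = urlunsplit(
        (preferred_scheme, parts.netloc, parts.path or "/", parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# The same tag would be emitted on both the http and the https copy of the page.
print(canonical_tag("https://www.example.com/services/"))
```

Note that the href is an absolute URL, so the tag means the same thing no matter which version of the page it is served on.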
-
-
Yes, I would suggest canonical as the easiest resolution.
And Irving is right: PDFs are most definitely indexed. I am not sure how they are interpreted or whether they would specifically count as duplicate content, but I would never suggest this idea, as it seems to have lots of negative repercussions.
I would most definitely agree that relative links are probably your issue. If you add the canonical tag, remove the inline relative links and make them absolute http links, this should resolve itself in a month or so.
-
I disagree
a) PDFs are both indexed AND read by crawlers.
b) Even if you don't have navigation to the file, Google can sometimes find it if it's in a folder that you are not blocking in robots.txt.
c) If someone links to it even once on the web, it's getting crawled and indexed.
If you have an https section, that content should be behind a login and not accessible to the engines. Your problem sounds like this: your https pages have relative links on them, so Google crawls an https page, follows the relative links and stays on https. You need to fix that, and doing so will stop your http pages from being indexed as duplicate https pages.
Absolute http canonical tags will help, but they are not the solution. You need to fix the https leaking from your secure pages.
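A minimal sketch of the relative-link fix described above, assuming the links sit in plain double-quoted `href` attributes. The `absolutize_links` helper is a hypothetical name, and the regex is a simplification; a real template fix or a proper HTML parser would be more robust.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

# Simplifying assumption: every link is written as href="...".
HREF_RE = re.compile(r'href="([^"]*)"')


def absolutize_links(html: str, page_url: str, scheme: str = "http") -> str:
    """Rewrite same-site links on a page as absolute URLs with the given scheme."""
    site_host = urlsplit(page_url).netloc

    def fix(match):
        # Resolve relative paths against the URL of the page they appear on.
        target = urljoin(page_url, match.group(1))
        parts = urlsplit(target)
        # Only touch web links on our own host; leave external links and mailto: alone.
        if parts.netloc == site_host and parts.scheme in ("http", "https"):
            target = urlunsplit(
                (scheme, parts.netloc, parts.path, parts.query, parts.fragment)
            )
        return f'href="{target}"'

    return HREF_RE.sub(fix, html)


secure_page = '<a href="/about/">About</a> <a href="https://other.example.org/">Partner</a>'
print(absolutize_links(secure_page, "https://www.example.com/secure/form"))
```

Links that genuinely need to stay on https (for example, the next step of the secure form) would have to be excluded from a rewrite like this.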
-
You can "noindex" them within the HTML. But if you really want a fun trick for when you can't get around a mass of duplicated content and it isn't there for the sake of rankings (MLS listings, for example):
Change the content into a PDF or another file format, so that it can't be crawled.
Once again - it will NOT be crawled - so don't go doing this to an entire site.
But maybe your clients' confidential data can be submitted this way, and it will not get indexed - except for the subpage, but then you can noindex that subpage.
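For the noindex option mentioned above, here is a small sketch that checks whether a page's HTML carries a robots meta tag with a noindex directive. The `RobotsMetaParser` class and `is_noindexed` function are hypothetical names for this example, and the check only looks at the meta tag, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindexed(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(page))
```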
Hope this helps.
Your pal
Chenzo