Should we use the rel-canonical tag?
-
We have a secure version of our site, as we often gather sensitive business information from our clients.
Our https pages have been indexed as well as our http version.
-
Could it still be a problem to have an http and an https version of our site indexed by Google? Is this seen as being a duplicate site?
-
If so, can this be resolved with a rel=canonical tag pointing to the http version?
Thanks
-
-
Agreed - this is generally an issue with relative paths, and job one is to fix it. In most cases, you really don't want these crawled at all. I do think rel=canonical is a good bet here - 301 redirects can get really tricky with http/https, and you can end up creating loops. It can be done right, but it's also easy to screw up, in my experience.
-
-
Yes, having 2 versions of the same content can be seen as duplicate content and could cause issues.
-
Yes, include a canonical tag in the header (assuming both http & https pages are close to identical). This will help Google's crawler figure out which version of the page to show in the search results.
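For illustration, here is a minimal Python sketch of what that tag might look like when generated for a page. The function name `canonical_link_tag` and the `www.example.com` URLs are made up for this example; it assumes the http version is the one you want indexed and that the http and https URLs differ only in scheme.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(page_url: str, preferred_scheme: str = "http") -> str:
    """Build a <link rel="canonical"> tag for the preferred-scheme version of page_url.

    Hypothetical helper: assumes both versions share host, path and query string.
    """
    parts = urlsplit(page_url)
    # Swap only the scheme; keep host, path and query, and drop any #fragment.
    canonical = urlunsplit(
        (preferred_scheme, parts.netloc, parts.path or "/", parts.query, "")
    )
    # Escape the URL so characters such as & are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# The same tag goes in the <head> of BOTH the http and the https page.
print(canonical_link_tag("https://www.example.com/contact?ref=1"))
```

The important detail is that the href is absolute and includes the scheme, so the https copy explicitly points at the http copy rather than at itself.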
-
-
Yes, I would suggest a canonical tag as the easiest resolution.
And Irving is right: PDFs are most definitely indexed. I am not sure how they are interpreted or whether they would specifically count as duplicate content, but I would not EVER suggest this idea, as it seems to have lots of negative repercussions.
I would most definitely agree that relative links are probably your issue. If you add canonical tags, remove the inline relative links, and make them absolute http links, this should resolve itself in a month or so.
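As a rough sketch of the "make them http absolute" step, something like the following Python could rewrite the links in a template. Everything here is an assumption for illustration: the helper name `absolutize_links`, the example URLs, and the simple regex, which only handles double-quoted `href` attributes (a real site would more likely fix this in the templates or with a proper HTML parser).

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href attributes only; a deliberate simplification.
HREF_RE = re.compile(r'href="([^"]*)"')


def absolutize_links(html: str, http_base: str) -> str:
    """Rewrite relative hrefs as absolute URLs resolved against http_base.

    http_base is the http URL of the page the markup belongs to (hypothetical).
    """
    def repl(match: re.Match) -> str:
        href = match.group(1)
        # Leave in-page anchors and non-web schemes untouched.
        if href.startswith(("#", "mailto:", "javascript:")):
            return match.group(0)
        # urljoin returns already-absolute URLs unchanged.
        return f'href="{urljoin(http_base, href)}"'

    return HREF_RE.sub(repl, html)


page = '<a href="/about">About</a> <a href="pricing.html">Pricing</a>'
print(absolutize_links(page, "http://www.example.com/secure/form"))
```

Links that are already absolute, including ones that deliberately point to https, are left alone, so only the relative ones change.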
-
I disagree.
a) PDFs are both indexed AND read by crawlers.
b) even if you don't have navigation to the file sometimes Google can find it if it's in a folder that you are not blocking in robots.txt.
c) if someone links to it once on the web it's getting crawled and indexed.
If you have an https section, that content should be behind a login and not accessible to the engines. It sounds like your https pages have relative links on them: Google crawls an https page, follows the relative links, and stays on https, so your http pages get indexed again as duplicate https pages. Fix the relative links and that indexing problem goes away.
Absolute http canonical tags will help, but they are not the solution. You need to fix the https leaking from your secure pages.
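To see why relative links "leak" https, here is a small Python sketch of how a crawler resolves each link against the page it found it on. The function name `crawl_targets` and the example URLs are hypothetical; the resolution itself is standard URL behaviour, shown here with `urllib.parse.urljoin`.

```python
from urllib.parse import urljoin


def crawl_targets(page_url: str, hrefs: list[str]) -> list[str]:
    """Return the URLs a crawler would request for each href found on page_url."""
    return [urljoin(page_url, href) for href in hrefs]


secure_page = "https://www.example.com/secure/client-form"
links = ["/about", "services.html", "http://www.example.com/about"]

for href, target in zip(links, crawl_targets(secure_page, links)):
    # Relative hrefs inherit the https scheme of the page they sit on.
    print(f"{href} -> {target}")
```

The first two links resolve to https URLs because they carry no scheme of their own; only the absolute http link sends the crawler back to the http site.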
-
You can "noindex" them within the HTML. But if you really want a fun trick for when you cannot get around a mass of duplicated content and it isn't there for the sake of rankings (for example, MLS listings):
Change the content into a PDF or another file format, so that it cannot be crawled.
Once again, it will NOT be crawled, so don't go doing this to an entire site.
But maybe your clients' confidential data can be submitted this way, and it will not get indexed, except for the subpage, which you can then noindex.
Hope this helps.
Your pal
Chenzo