How bad is it to have duplicate content across http:// and https:// versions of the site?
-
A lot of pages on our website are currently indexed on both their http:// and https:// URLs. I realise that this is a duplicate content problem, but how major an issue is this in practice?
Also, am I right in saying that the best solution would be to use rel="canonical" tags to mark the https:// pages as the canonical versions?
-
Thank you both, and sorry for not replying earlier. It sounds like we have some work to do.
-
There isn't a single best solution in this case.
All you need to do is pick a preferred protocol, http or https, and migrate everything to it. This means you should:
- Set up 301 redirects from the non-preferred protocol to the preferred one
- Fix all canonicals so they point to the preferred protocol
- Do a site move in Search Console. This is tricky because you need to verify both properties there and move one of them to the other.
- Fix all internal links in pages so they don't go through 301 redirects
- Fix all images, scripts and other content references so they don't go through 301 redirects
In theory, that's all. As you can see, "fix canonicals" is just one small step in the process.
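To find what still needs fixing for the canonical, internal link and image/script steps above, a page can be scanned for its canonical URL and for same-site references that still use http://. The following is only a rough sketch using Python's standard library; the host name `www.example.com` and the names `InsecureRefFinder` and `audit` are made up for illustration, and a real audit would run over crawled pages rather than a single string.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

SITE_HOST = "www.example.com"  # hypothetical host, replace with your own


class InsecureRefFinder(HTMLParser):
    """Collect canonical URLs and same-site references that still use http://."""

    def __init__(self, host):
        super().__init__()
        self.host = host
        self.canonicals = []      # href values of <link rel="canonical">
        self.insecure_refs = []   # (tag, url) pairs for same-site http:// URLs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Record the canonical separately so it can be checked on its own.
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonicals.append(attrs.get("href"))
            return
        # Links, images, scripts, stylesheets etc. use href or src.
        for name in ("href", "src"):
            url = attrs.get(name)
            if not url:
                continue
            parts = urlsplit(url)
            if parts.scheme == "http" and parts.hostname == self.host:
                self.insecure_refs.append((tag, url))


def audit(html, host=SITE_HOST):
    """Return (canonical URLs, same-site http:// references) found in html."""
    finder = InsecureRefFinder(host)
    finder.feed(html)
    finder.close()
    return finder.canonicals, finder.insecure_refs
```

Any canonical that does not start with https://, and every entry in the second list, is a URL that would either conflict with the redirect or go through it unnecessarily.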
-
The biggest problem with duplicate content (in most cases) is that you leave it up to Google (or search engines in general) to decide which version gets the traffic.
I assume that if you have a site on https, you want all visits to go to the https version. Rather than relying on a canonical URL (which is merely a friendly request), I would 301 redirect the http version to the https version.
Dirk
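For illustration, the 301 approach described above might look like this at the application level. This is a minimal sketch written as a Python WSGI wrapper; the name `force_https` is hypothetical, and in practice the redirect is often configured in the web server or load balancer instead. It assumes the server sets `wsgi.url_scheme` correctly, which may not hold behind a proxy that terminates TLS.

```python
from urllib.parse import quote


def force_https(app):
    """Wrap a WSGI app so plain-http requests get a 301 to the https:// URL."""

    def wrapper(environ, start_response):
        # Requests that already arrived over https pass straight through.
        if environ.get("wsgi.url_scheme") == "https":
            return app(environ, start_response)

        host = environ.get("HTTP_HOST") or environ["SERVER_NAME"]
        # Drop an explicit port such as :80 (assumes a plain host name,
        # not an IPv6 literal).
        host = host.split(":")[0]

        # Rebuild the same path and query string on the https:// origin.
        path = quote(environ.get("SCRIPT_NAME", "")) + quote(environ.get("PATH_INFO", ""))
        location = "https://" + host + path
        if environ.get("QUERY_STRING"):
            location += "?" + environ["QUERY_STRING"]

        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return wrapper
```

Because every http URL maps to the same path on https, visitors and crawlers land on the equivalent page rather than on the home page.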