Google Indexing Of Pages As HTTPS vs HTTP
-
We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL to the site. However, many of our pages use an iframe from a third-party vendor for real estate listings. That iframe was not SSL friendly, and the vendor does not have a solution yet, so those iframes weren't displaying their content.
As a result, we had to shift gears and go back to just being http and not the new https that we were hoping for.
However, Google seems to have indexed a lot of our pages as https and gives a security error to any visitors. The new site was launched about a week ago, and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file so it no longer redirects to https.
My question is: will Google "reindex" the site once it recognizes the new htaccess rules in the next couple of weeks?
-
That's not going to solve your problem, vikasnwu. Your immediate issue is that you have HTTPS URLs in the index, and searchers who click on them will not reach your site because of the security error warnings. The only way to fix that quickly is to get the SSL certificate and the redirect to HTTP in place.
You've sent the search engines a number of very conflicting signals. Waiting while they try to work out what URLs they're supposed to use and then waiting while they reindex them is likely to cause significant traffic issues and ongoing ranking harm before the SEs figure it out for themselves. The whole point of what I recommended is it doesn't depend on the SEs figuring anything out - you will have provided directives that force them to do what you need.
Paul
-
Remember that you can request indexing using Google Search Console.
-
Nice answer!
But you forgot to mention:
- Update the sitemap files with the correct URLs
- Upload them to Google Search Console
- You can even request indexing in Google Search Console
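For the sitemap step, here is a minimal sketch of how a small http-only sitemap could be generated, assuming Python 3 and only its standard library. The function name `build_sitemap` and the `www.example.com` URLs are placeholders for illustration, not anything taken from the site in question.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) that lists the given http URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Refuse anything that is not plain http, so an https URL
        # cannot slip back into the file by accident.
        if not url.startswith("http://"):
            raise ValueError(f"not an http URL: {url!r}")
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page list; replace with the site's real URLs.
    pages = ["http://www.example.com/", "http://www.example.com/listings"]
    print(build_sitemap(pages))
```

The output can be saved as sitemap.xml and submitted in Search Console. The scheme check is only a safety net for a hand-maintained list of pages.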
Thanks,
Roberto
-
Paul,
I just provided the solution to de-index the https version. I understood that to be what's wanted, as they need their client to fix their end. And of course there is no way to noindex by protocol. I do agree with what you are saying.
Thanks a lot for explaining further and providing other ways to help solve the issue. I'm inspired by users like you to help others and make a great community.
GR.
-
I'm first going to see what happens if I just upload a sitemap with http URLs, since there wasn't a sitemap in webmaster tools before. I will give you an update then.
-
Great! I'd really like to hear how it goes when you get the switch back in.
P.
-
Paul, that does make sense - I'll add the SSL certificate back, and then redirect from https to http via the htaccess file.
-
You can't noindex a URL by protocol, Gaston - adding noindex would eliminate the page from being returned as a search result regardless of whether it is HTTP or HTTPS, essentially making those important pages invisible and wasting whatever link equity they may have. (You can't block by protocol in robots.txt either, in my experience.)
-
There's a very simple solution to this issue - and no, you absolutely do NOT want to artificially force removal of those HTTPS pages from the index.
You need to make sure the SSL certificate is still in place, then re-add the 301-redirect in the site's htaccess file, but this time redirecting all HTTPS URLs back to their HTTP equivalents.
You don't want to forcibly "remove" those URLs from the SERPs, because they are what Google now understands to be the correct pages. If you remove them, you'll have to wait however long it takes for Google and other search engines to completely re-understand the conflicting signals you've sent them about your site. And traffic will inevitably suffer in that process. Instead, you need to provide standard directives that the search engines don't have to interpret and can't ignore. Once the search engines have seen the new redirects for long enough, they'll start reverting the SERP listings back to the HTTP URLs naturally.
The key here is the SSL cert must stay in place. As it stands now, a visitor clicking a page in the search engine is trying to make an HTTPS connection to your site. If there is no certificate in place, they will get the harmful security warning. BUT! You can't just put in a 301-redirect in that case. The reason for this is that the initial connection from the SERP is coming in over the "secure channel". That connection must be negotiated securely first, before the redirect can even be read. If that first connection isn't secure, the browser will return the security warning without ever trying to read the redirect.
Having the SSL cert in place even though you're not running all pages under HTTPS means that first connection can still be made securely, then the redirect can be read back to the HTTP URL, and the visitor will get to the page they expect in a seamless manner. And search engines will be able to understand and apply authority without misunderstandings/confusion.
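One way to confirm that this behaves as described is to request an HTTPS URL and look at the raw response without following the redirect. The following is a minimal sketch, assuming Python 3 and its standard library only. The function names and `www.example.com` are hypothetical, and the live check needs network access. With Python's default settings, `http.client.HTTPSConnection` verifies the certificate, so a missing or invalid certificate should raise an SSL error before any redirect is seen, which mirrors what the browser does.

```python
import http.client
from urllib.parse import urlsplit


def http_equivalent(https_url: str) -> str:
    """Return the same URL with the scheme switched from https to http."""
    parts = urlsplit(https_url)
    if parts.scheme != "https":
        raise ValueError(f"expected an https URL, got {https_url!r}")
    return parts._replace(scheme="http").geturl()


def is_expected_redirect(https_url: str, status: int, location) -> bool:
    """True when the response is a 301 pointing at the HTTP equivalent."""
    return status == 301 and location == http_equivalent(https_url)


def fetch_status_and_location(https_url: str):
    """Send one HEAD request over TLS and return (status, Location header).

    Redirects are not followed, so the 301 itself can be inspected.
    """
    parts = urlsplit(https_url)
    target = (parts.path or "/") + (f"?{parts.query}" if parts.query else "")
    conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    url = "https://www.example.com/listings"  # placeholder URL
    status, location = fetch_status_and_location(url)
    print(status, location, is_expected_redirect(url, status, location))
```

Running this against a handful of indexed HTTPS URLs gives a quick check that each one answers with a 301 to its HTTP twin rather than a certificate error.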
Hope that all makes sense?
Paul
-
Nope, robots.txt works at the website level. This means that there has to be one file for the http website and another for the https website.
And there is no need to wait until the whole site is indexed. Just to clarify, robots.txt itself does not remove pages that are already indexed. It just blocks bots from crawling a website and/or specific pages within it.
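To illustrate both points, here is a small sketch, assuming Python 3 and its standard `urllib.robotparser` module. The helper names, the example rules, and `www.example.com` are made up for illustration. It shows that each protocol and host pair has its own robots.txt location, and that a Disallow rule only answers the question "may this URL be crawled?"; nothing in it removes a URL from the index.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that governs the given page URL."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def can_crawl(robots_txt: str, page_url: str, agent: str = "*") -> bool:
    """Check whether the given robots.txt text allows crawling page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, page_url)


# Hypothetical rules: the https site blocks everything, the http site nothing.
HTTPS_ROBOTS = "User-agent: *\nDisallow: /\n"
HTTP_ROBOTS = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    print(robots_url("https://www.example.com/listings"))
    print(robots_url("http://www.example.com/listings"))
    print(can_crawl(HTTPS_ROBOTS, "https://www.example.com/listings"))
    print(can_crawl(HTTP_ROBOTS, "http://www.example.com/listings"))
```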
-
GR - thanks for the response.
Given our site is just 65 pages, would it make sense to just put all of the site's "https" URLs in the robots.txt file as "noindex" now rather than waiting for all the pages to get indexed as "https" and then remove them?
And then upload a sitemap to webmaster tools with the URLs as "http://"?
VW
-
Hello vikasnwu,
Since what you are looking for is to remove the pages from the index, follow these steps:
- Allow the whole website to be crawlable in robots.txt
- Add the robots meta tag with the "noindex,follow" parameters
- Wait several weeks; 6 to 8 weeks is a fairly good time. Or just follow up on those pages
- When you get the results (all your desired pages de-indexed), re-block those pages with robots.txt
- DO NOT erase the meta robots tag.
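To follow up on those pages, one option is to check which robots meta directives a page actually serves. This is a minimal sketch, assuming Python 3 and its standard `html.parser` module. The class and function names are hypothetical, and the HTML is a made-up sample rather than a real page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(sorted(robots_directives(sample)))
```

Note that a crawler can only see this tag if robots.txt lets it fetch the page, which is why the first step in the list is to allow crawling.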
Remember that http://site.com and https://site.com are different websites to Google.
When your client's website is fixed with https, follow these steps:
- Allow the whole website (or the parts you want indexed) to be crawlable in robots.txt
- Remove the robots meta tag
- 301-redirect http to https
- Sit and wait.
Information about the redirection to HTTPS and a cool checklist:
The Big List of SEO Tips and Tricks for Using HTTPS on Your Website - Moz Blog
The HTTP to HTTPs Migration Checklist in Google Docs to Share, Copy & Download - AleydaSolis
Google SEO HTTPS Migration Checklist - SERoundtable
Hope I'm helpful.
Best of luck.
GR.