Removing secure subdomain from the Google index
-
We've noticed over the last few months that Google is not honoring our main website's robots.txt file. We have added rules to disallow secure pages, such as:
Disallow: /login.cgis
Disallow: /logout.cgis
Disallow: /password.cgis
Disallow: /customer/*

We have noticed that Google is crawling these secure pages and then duplicating our complete ecommerce website across our secure subdomain in the Google index (duplicate content): https://secure.domain.com/etc. Our webmaster recently implemented a robots.txt file specifically for the secure subdomain that disallows everything, but the duplicated secure pages remain in the index:
User-agent: *
Disallow: /

My question is: should I request that Google remove these secure URLs through Google Webmaster Tools? If so, is there any potential risk to my main ecommerce website? We have 8,700 pages currently indexed in Google and would not want to risk any ill effects on our website. How would I submit this request in the URL Removal tool specifically? Would entering https://secure.domain.com/ cover all of the URLs? We do not want any secure pages in the index, and all secure pages are served on the secure.domain subdomain. Please private message me for specific details if you'd like to see an example. Thank you.
-
I think you're saying you have:

mainwebsitethatsellsstuff.com
securesubdomainof.mainwebsitethatsellsstuff.com

and that you want to keep the main domain, remove the subdomain, and that it's not a case of http vs. https with the URL otherwise being the same, right?
You can verify a subdomain in Google Webmaster Tools as its own site and remove the entire subdomain. I've had to do this for a dev subdomain that accidentally got indexed: I was able to keep the main domain and remove the subdomain. The key is to verify that subdomain and leave the main domain alone, provided I'm understanding your question correctly.
-
Do you need all 8,700 pages served over https? The protocol should transition when a page is OK to serve unsecured. Generally, you would only serve pages over https that contain confidential information, and serve general content over http. Look at the site and ask: how many of those pages can a non-logged-in user see? If they are not protected by authorization, they do not need https, since the content is publicly viewable.
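As an aside on this answer, the split it describes (confidential pages on https, general content on http) is often enforced at the server level. A minimal sketch with Apache mod_rewrite, assuming an Apache server and that only the .cgis scripts and /customer/ paths from the question need to stay secure (both are assumptions, not stated in this answer):

```apache
# Sketch: bounce non-confidential pages from https back to http.
RewriteEngine On
RewriteCond %{HTTPS} on
# Leave the genuinely secure URLs on https
RewriteCond %{REQUEST_URI} !^/(login|logout|password)\.cgis$
RewriteCond %{REQUEST_URI} !^/customer/
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
```

This would also help the duplicate-content problem above, since the https copies of general pages would stop resolving with content of their own.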
-
URL Removal would not be a good action in this case. According to Google, when they remove the https version of a URL, they will also remove the http version along with it.
How long ago did you implement the robots.txt exclusion for the https pages? It will take Google some time to drop them from its index. To help, you can add the following to your https pages, which will keep them from continuing to be cached:
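The snippet this answer refers to did not survive in this copy of the thread. The tag commonly recommended for this purpose (my reconstruction, not necessarily what the author posted) is a robots meta tag in the head of each https page:

```html
<!-- Guess at the missing snippet: block indexing and caching of this page -->
<meta name="robots" content="noindex, noarchive">
```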
Related Questions
-
Why has Google not visited any website for the past 10 days?
Hi, I observed that Google has not visited www.SubhaVastu.com for the past 10 days. When I checked thoroughly, it was not only my site; Google seems to have stopped visiting all websites for the past 10-12 days. Is Google releasing any new updates to the crawler? Is a new system releasing soon? I expect Google to update their crawler by this Sunday night, and it may visit all sites as usual from 12 midnight Pacific time. Has anyone observed this, or have any information regarding this step by Google? Thanks.
Algorithm Updates | SubhaVaastu
-
Google creating its own content
I am based in Australia, but a US-based search on 'sciatica' shows an awesome answer on the right-hand side of the SERP: https://www.google.com/search?q=sciatica&oq=sciatica&aqs=chrome.0.69i59.3631j0j7&sourceid=chrome&ie=UTF-8 The download on sciatica is a PDF created by Google. Firstly, is this common in the US? Secondly, any input on where this is heading for rollout would be appreciated. Is Google now creating its own content to publish?
Algorithm Updates | ClaytonJ
-
Is Moz Domain Authority still relevant when it comes to Google ranking?
My understanding of Moz DA is that it is predominantly based on external links. Since Penguin, I am noticing more and more websites ranking high in Google with a "low" number of links and certainly a low DA, but with high quality and relevance of content and offering. I understand that there was always more to ranking than DA, but is it even relevant anymore to how a site will rank in Google?
Algorithm Updates | halloranc
-
Google Unable to Access Robots.txt
We haven't made any changes to the robots.txt file, and suddenly Google claims it can no longer access the file. The site has been up and active for well over a year now. What are my next steps? I have included a screenshot of the top half of the file. See anything wrong?
Algorithm Updates | rhoadesjohn
-
Google.ca English and French returning different rankings
French Keyword : "Chauffage électrique" Currently Ranking 4th on Google.ca (French) It is not even top 50 on Google.ca (English) Why so much gap between them? Both are on Google.ca, just different language. Also, when searching the keyword on Google.ca (English), all the results shown are in french anyway ! Why is mine way off ? How can I help the ranking on the EN version? Why does Google.ca FR and EN have different rankings?
Algorithm Updates | Kezber
-
Effect of new Google SSL policy on our Analytics - AACK!
So I went to look at our keyword reports in GA today, and our most popular keyword was "(not provided)". It now accounts for 10% of our referred visits. Unfortunately, it also has a 125% average order value compared to the rest of our site. This is a really annoying policy that Google has implemented, and it will clearly have an effect on our ability to effectively market our site.
Algorithm Updates | IanTheScot
-
Can AJAX implementation affect the rankings in Google Panda?
Hi there, I have the following situation with one of our job sites. We migrated the site to a new application, which is better from a design point of view and also for usability. For this we use a lot of AJAX, especially in searches: every time a user filters down their search, new results are shown on the page, at the same URL and with no page load. But this implementation affected bounce rate, which increased from 38% to nearly 60%; PI/visit, which is now half, at 3; and average time on site, which is half what it used to be, coming to 2.5 min from nearly 6 min. From Rand's post, it is clear that content is very important in Google Panda, and we should consider all of these parameters, as they indicate the quality of the content. So, my question is: can this site be hit by Panda updates (maybe later on) because bounce rate, PI/visits, and average time on site decreased in such a way? At the moment we don't measure the AJAX impressions, but as I understand it, we can do that through virtual pages in GA. Does anyone have experience with how to handle this? Won't this be an artificial increase? Thanks, Irina
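On the virtual-pages idea at the end of this question: a hedged sketch using the classic ga.js `_gaq` queue (the GA API of that era); the function name and the '/virtual/...' path convention are invented for illustration, not taken from the thread:

```javascript
// Sketch: count each AJAX search refinement as a virtual pageview,
// so filtered results show up as page impressions in Google Analytics.
var _gaq = _gaq || [];

function trackAjaxSearch(filterQuery) {
  // The '/virtual/search/' prefix is a made-up convention; choose any
  // path scheme that cannot collide with real URLs on the site.
  _gaq.push(['_trackPageview', '/virtual/search/' + encodeURIComponent(filterQuery)]);
}

trackAjaxSearch('location=london');
```

Whether such virtual pageviews "artificially" improve the engagement metrics is exactly the concern raised above; they do change PI/visit, so segmenting them out in reports would be prudent.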
Algorithm Updates | InformMedia
-
Why is a website with less relevant content ranking higher in Google?
There is a website that I am competing with, www.gastricbandhypnotherapy.net, for the term "gastric band hypnotherapy", and for some reason it is now ranking higher than me. I had been number one in Google with http://www.clairehegarty.co.uk/virtual-gastric-band-with-hypnotherapy for the term "gastric band hypnotherapy", but in the past few days that site has taken number one and pushed me down to number three. I do not understand it, as it does not have much content relevant to gastric band hypnotherapy, and it does not have many links pointing to it. Can you please help with this question?
Algorithm Updates | ClaireH-184886