Using the Google URL Removal Tool to remove HTTPS pages
-
I have found a way to get a list of 'some' of my 180,000+ garbage URLs, and I'm now going through the tedious task of entering them into the URL removal tool one at a time. Between that, my robots.txt file and the URL Parameters tool, I'm hoping to see some change each week.
I have noticed that when I put URLs starting with https:// into the removal tool, it prepends my main http:// URL to them.
For example, I add this to the removal tool:
https://www.mydomain.com/blah.html?search_garbage_url_addition
On the confirmation page, the URL actually shows as:
http://www.mydomain.com/https://www.mydomain.com/blah.html?search_garbage_url_addition
I don't want to accidentally remove my main URL or cause problems. Is this the right way this should look?
And part 2 of my question:
If a page I want removed already shows the following description in Google's SERP results, should I still go to the trouble of submitting a removal request?
www.domain.com/url.html?xsearch_...
A description for this result is not available because of this site's robots.txt – learn more.
-
Thanks so much for taking the time to respond.
I think I will add the https property to WMT and remove the URLs that way.
I will also take a look at the .htaccess file and create the SSL robots file. A while back, it seemed that Google was indexing a lot of my site as https and then dropped it and went mainly back to http. I will get that sorted out to make things clear.
-
Hi there
I'll start with question 2 first, as it's a bit easier to answer. Robots.txt blocks the crawling of a page, but not necessarily its indexing. Of course, if a page cannot be crawled it will usually drop out of the index eventually anyway, and if you're seeing that description for one of your URLs, Google has not been able to access the page and will stop trying to. So that is usually enough, although if you want to submit a removal request as well, by all means do.
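To make the crawl-versus-index distinction concrete, here is a minimal sketch (the /search/ path is purely hypothetical): robots.txt only stops crawling, while a noindex signal is what actually removes a page from the index, and Google must be able to crawl the page to see that signal.

```text
# robots.txt - blocks crawling only; an already-indexed URL can still
# appear in results with the "description is not available" snippet
User-agent: *
Disallow: /search/

# To actively de-index a page instead, leave it crawlable and add a
# noindex signal, either in the page's <head>:
#   <meta name="robots" content="noindex">
# or as an HTTP response header:
#   X-Robots-Tag: noindex
```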
For question 1 - GWT is a bit awkward in that it treats the http and https versions of your site as separate webmaster properties. Furthermore, when you ask to remove a URL, it always prefixes whatever you enter with the property's own http:// or https:// domain, no matter how you type it. That's why your https URL is shown appended to the http version of your domain on the confirmation page.
If you add another WMT property for https://www.yourdomain.com, you will be able to manage that version as well, and thus remove any URLs under that prefix.
Incidentally, if you want to block all of your HTTPS pages from being crawled, you can do that with a special instruction in your .htaccess file and a separate robots.txt. You can instruct Googlebot and other bots to read a different robots.txt file whenever they visit an HTTPS URL. To do that, you would first add this to your .htaccess file:
RewriteCond %{HTTPS} ^on$
RewriteCond %{REQUEST_URI} ^/robots\.txt$
RewriteRule ^(.*)$ /robots_ssl.txt [L]
These rules basically say: "if the request came in over HTTPS and it's for /robots.txt, serve the robots_ssl.txt file instead". You then upload a file called robots_ssl.txt to your document root. In that file you just add:
User-agent: *
Disallow: /
So now, when a bot requests robots.txt on an HTTPS URL, it is served robots_ssl.txt instead, and that file tells it not to crawl anything. That would prevent all of your HTTPS URLs from being crawled and, in time, indexed.
That might be useful to you, but if you do go ahead and use it, please take care to back up all your files first in case anything goes wrong - your .htaccess file is very important!
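As an aside: since it sounds like Google picked up the https version of the site by accident, another option (sketched here on the assumption that the site runs on Apache with mod_rewrite enabled and is meant to be served over plain http) is to 301-redirect every HTTPS URL to its http equivalent, which consolidates the duplicates instead of merely blocking them:

```apacheconf
# Hypothetical example - substitute your own domain for www.mydomain.com.
# If the request came in over HTTPS...
RewriteCond %{HTTPS} ^on$
# ...permanently redirect it to the same path on plain http
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
```

Note that you would use either this redirect or the robots_ssl.txt approach, not both, since a redirected request never gets the chance to serve the SSL robots file.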