Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Best way to permanently remove URLs from the Google index?
-
We have several subdomains we use for testing applications. Even if we block them with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt).
I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (6 months?) Google will re-index them (and mark them as blocked by robots.txt).
What is the best way to permanently remove these from the index? We can't use a login to block them because our clients want to be able to view these applications without needing to log in.
What is the next best solution?
-
I agree with Paul. Google is re-indexing the pages because you have a few links pointing back to these subdomains. The best idea is to add a noindex, nofollow meta robots tag to the pages and remove the disallow instruction from robots.txt, so that Google can actually crawl the pages and see the tag.
This way Google will neither index the pages nor follow the links on them, and they will be permanently removed from the Google index.
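As a rough illustration of the tag being suggested, here is a small Python sketch that inserts a meta robots tag into a page's head. The helper name `add_noindex` and the sample markup are made up for the example; on a real site the tag would more likely be added once in the page template.

```python
import re

# The meta robots tag suggested in this thread.
NOINDEX_TAG = '<meta name="robots" content="noindex, nofollow">'

def add_noindex(html: str) -> str:
    """Insert the noindex tag right after the opening <head> tag.

    Hypothetical helper for illustration only. The page is returned
    unchanged if it already contains the tag or has no <head>.
    """
    if NOINDEX_TAG in html:
        return html
    # Match <head> or <head ...attributes...>, but not <header>.
    return re.sub(
        r"(<head\b[^>]*>)",
        r"\1" + NOINDEX_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )

# Sample markup (an assumption, not taken from the poster's site).
page = "<html><head><title>Test app</title></head><body>Demo</body></html>"
print(add_noindex(page))
```

Clients can still open the page without logging in; the tag only affects how search engines treat it.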
-
Yup - Chris has the solution. The robots.txt disallow directive simply instructs the crawler not to crawl; it doesn't carry any instruction to remove URLs from the index. I'm betting there are other pages linking in to the subdomains that the bots are following to find and index them as the URL Removal requests expire.
Do note though that when you add the no-index meta-robots tag, you're going to need to remove the robots.txt disallow directive. Otherwise the crawlers won't make any attempt to crawl all the pages and so won't even discover most of the no-index requests.
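To sanity-check that the disallow is really gone, something like the following Python sketch can help. It uses the standard library's `urllib.robotparser` to evaluate robots.txt rules against a URL. The function name `can_crawl`, the example rules, and the `test.example.com` URL are assumptions for illustration, and the standard-library parser is only an approximation of how Googlebot itself interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Hypothetical robots.txt contents for a test subdomain.
blocking = "User-agent: *\nDisallow: /"   # crawler never sees the noindex tag
open_rules = "User-agent: *\nDisallow:"   # crawler can fetch pages and read the tag

url = "https://test.example.com/app/"
print(can_crawl(blocking, url))    # False
print(can_crawl(open_rules, url))  # True
```

If the first case applies to your subdomain, the crawler is never allowed to fetch the page, so it cannot discover the noindex request on it.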
Paul
[Edited to add - there's no reason you can't implement the no-index meta-tags and then also again request removal via the Webmaster Tools removal tool. Kind of a "belt & suspenders" approach. The removal request will get it out quicker, and the meta-no-index will do the job of keeping it out. Remember to do this in Bing Webmaster Tools as well.]
-
Wouldn't a noindex meta tag on each page take care of it?
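One way to confirm that each page really carries the tag is to parse the page's HTML and look for it. Below is a minimal Python sketch using the standard library's `html.parser`; the class and function names (`RobotsMetaFinder`, `has_noindex`) are invented for this example, and fetching the pages is left out.

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None, so normalise them to strings.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())

def has_noindex(html: str) -> bool:
    """Return True if the page has a meta robots tag containing noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives

# Sample markup (an assumption for the example).
sample = '<head><meta name="robots" content="noindex, nofollow"></head>'
print(has_noindex(sample))
```

Running a check like this over every URL on the test subdomains would show which pages are still missing the tag.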