Google is indexing blocked content in robots.txt

elisainteractive

Hi, Google is indexing some URLs that I don't want indexed, and it is also indexing the same URLs over https. These URLs are blocked in robots.txt. I tried to block them through Google Webmaster Tools, but Google won't let me because the URLs are https. The robots.txt file is correct, so what can I do to keep this content out of the index?

bjs2010

I think you will find that the URLs in Google's index are either:

Indexed before the robots.txt disallow was put in place. Check in the Google SERP and click on "in cache" to see the date.
Heavily linked to by other external domains.
Both of the above.

@cleverphd has a great solution. Follow that.

CleverPhD

This will sound backwards but it works.

Add the meta noindex tag to all pages you want out of the index.
Take those same pages out of the robots.txt and allow them to be crawled.

The meta noindex tag tells Google to remove the page from the index. It is preferred over using robots.txt for this purpose.
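To confirm that a page really carries the tag before waiting on Google, something like the following Python sketch could be used. It only inspects an HTML string for a robots meta tag containing noindex; the class name, function name, and sample page are made up for illustration, and it does not look at the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page used only for illustration.
sample_page = """
<html>
  <head>
    <title>Private page</title>
    <meta name="robots" content="noindex">
  </head>
  <body>...</body>
</html>
"""

print(has_noindex(sample_page))
```

The tag itself is the single `<meta name="robots" content="noindex">` line inside the page's head.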

http://moz.com/learn/seo/robotstxt

The robots.txt file blocks Google from crawling the page, but the URL can still show up in the index if other pages link to the page you are trying to remove.

http://www.youtube.com/watch?v=KBdEwpRQRD0

You have to allow Google to crawl the pages (by taking them out of the robots.txt) so it can read the noindex meta tags that then tell Google to take them out of the index.
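The dependency described above can be checked with a rough sketch using Python's standard urllib.robotparser module. The rules, the example.com URL, and the function names are assumptions for illustration, and the standard-library parser is only an approximation of how Googlebot reads robots.txt (for example, it does not handle wildcards the way Google does).

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def noindex_will_be_seen(robots_txt: str, url: str, page_has_noindex: bool) -> bool:
    """A noindex tag only helps if the crawler is allowed to fetch the page."""
    return page_has_noindex and is_crawlable(robots_txt, url)


# Hypothetical rules: the first set blocks the page, the second allows everything.
blocked_rules = """
User-agent: *
Disallow: /private/
"""

open_rules = """
User-agent: *
Disallow:
"""

url = "http://www.example.com/private/page.html"

# While the page is disallowed, the crawler never reads the noindex tag.
print(noindex_will_be_seen(blocked_rules, url, page_has_noindex=True))
# Once the disallow is removed, the tag can be read and acted on.
print(noindex_will_be_seen(open_rules, url, page_has_noindex=True))
```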

elisainteractive

Thank you, but that is not the problem. The robots.txt file has been in place for a long time.

EastEssence22

It seems you added or modified the robots.txt file later. Wait for some time, say 15 days.
Also check the syntax of your robots.txt file.

Regards,
