Indexed, though blocked by robots.txt: Need to bother?

vtmoz

Hi,

We have intentionally blocked some of the website files which were indexed for years. Now we receive a message "Indexed, though blocked by robots.txt" in GSC. We can ignore as per my knowledge? Are any actions required about this? We thought of blocking them with meta tags but these are PDF files.

Thanks

Gaston Riera

Hi there!

What Google is telling you is that you are indexing URLs that you probably are not wanting to be indexed, or the other way around, that important pages are being blocked but indexed for other reasons.

If I might ask, why did you blocked through robots.txt those files?
There most 2 answers are:
1- Wanted to remove those from search results. If this is your case, you've solved only a part of the problem. What you should have done is (previously allowing robots to crawl those urls) apply noindex rules (keep in mind that can be set up in the HTTP header, as long as not html files cant have meta robots tag), then after a sufficient time block them in robots.txt.
_2- Optimize how GoogleBot (crawiling) time. _Being this case, then you've done it correctly and there is nothing to worry.

Hope this help.
Best luck.
GR

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Indexed, though blocked by robots.txt: Need to bother?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Do we need to Disallow profiles from discussions or forums?

Why google stop index my home page

Google indexing https sites by default now, where's the Moz blog about it!

Adding the link masking directory to robots.txt?

Homepage Index vs Home vs Default?

Panda, Negative SEO and now Penguin - help needed

Urgent input needed on huge drop in Google

Any ideas why our category pages got de-indexed?