Block all but one URL in a directory using robots.txt?

nicole.healthline

Is it possible to block all but one URL with robots.txt?

for example domain.com/subfolder/example.html, if we block the /subfolder/ directory we want all URLs except for the exact match url domain.com/subfolder to be blocked.

Cyrus-Shepard

Robots.txt files are sequential, which means they follow directives in the order they appear. So if two directives conflict, they will follow the last one.

So the simple way to do this is to disallow all files first, then allow the directory you want next. It would look something like this:

User-agent: *
Disallow: /

User-agent: *
Allow: /test

Caveat: This is NOT the way robots.txt is supposed to work. By design, robots.txt is designed for disallowing, and technically you shouldn't ever have to use it for allowing. That said, this should work pretty well.

You can check your work in Google Webmaster, which has a robots.txt checker. Site Configuration > Crawler Access. Just type in your proposed robots.txt, then a test URL and you should be good to go.

Hope this helps!

sesertin

According to my knowledge this possibility does not exist. One fast method to get over this is to get a crawler program to crawl your urls, so that you can quickly copy out all url in the folder paste in in the robots.txt and leave aout the one that you want in the index.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Block all but one URL in a directory using robots.txt?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Is it best to 301 redirect or use canonical Url when consolidating two pages?

¿Disallow duplicate URL?

SEO Dilution: Key Words in Sub Directories v Using a Hyphen in a Single Directory

Robots.txt & Duplicate Content

Is there any SEO advantage to sharing links on twitter using google's url shortener goo.gl/

Google showing high volume of URLs blocked by robots.txt in in index-should we be concerned?

Will disallowing in robots.txt noindex a page?

Sudden increase in number of indexed URLs. How ca I know what URLs these are?