How to Disallow Specific Folders and Sub Folders for Crawling?

CommercePundit

Today I checked how my website is indexed in Google and found a very interesting result. You can see it in the following Google search result.

Google Search Result

I am aware of the robots.txt file and could disallow the images folder to solve this issue.

But that may stop my images from appearing in Google image search.

So, how can I fix this issue?
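To illustrate the concern, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content and the www.example.com URLs are hypothetical, and the standard-library parser only approximates how Google's crawlers read the file, so treat this as a rough check rather than a statement of Google's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the whole images folder for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The image crawler falls under the "*" group, so image files are blocked too.
    print(is_allowed(ROBOTS_TXT, "Googlebot-Image", "http://www.example.com/images/lamp.jpg"))
    # Pages outside the folder are unaffected.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/lamps.html"))
```

With a blanket rule like this, the image files themselves are disallowed along with the folder listing, which is why the robots.txt route may cost the image search traffic.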

RyanKent

You can, but then the content will be removed from Google's index for 90 days. I am not sure what effect this would have on pages with the images. It shouldn't have any effect, but I would hate for you to have rankings in any way affected for 90 days.

I have no experience in having images indexed in this manner. Perhaps someone else has more knowledge to share on this topic.

CommercePundit

Can I use the Remove URL facility in Google Webmaster Tools?

RyanKent

I checked your URL: http://www.lampslightingandmore.com/images/. The folder is now properly restricted and the images can no longer be seen using this method. Going forward, Google will not be able to index new images in the same manner your other images were indexed.

With respect to the images which have been indexed, I am not certain how Google will respond. The image links are still valid so they may keep them. On the other hand, the links are gone so they may remove them. If it were my site, I would wait 30 days to see if Google removed the results.

Another way you can resolve the issue is to change the file path to your images from /images to /image. This will immediately break all the links. You would need to ensure all the links on your site are updated properly. It still may take Google a month to de-index those results but it would certainly happen in that case.
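If you did go the rename route, updating the links could be sketched roughly as follows. This is only an illustration in Python: the function name rewrite_image_links is hypothetical, it only touches src and href attributes, and a real site may reference images from CSS, scripts, or a database that this does not cover.

```python
import re

# Matches a src/href attribute up to the first "/images/" inside its quoted value.
_LINK_ATTR = re.compile(r"""((?:src|href)\s*=\s*["'][^"']*?)/images/""", re.IGNORECASE)


def rewrite_image_links(html: str, new_folder: str = "image") -> str:
    """Point src/href attributes that use /images/ at /<new_folder>/ instead."""
    return _LINK_ATTR.sub(lambda m: f"{m.group(1)}/{new_folder}/", html)


if __name__ == "__main__":
    print(rewrite_image_links('<img src="/images/lamp.jpg">'))
```

Text outside src and href attributes is left alone, so a sentence that merely mentions /images/ is not changed.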

CommercePundit

I have added Options -Indexes for the images folder in my htaccess file.

But I am still able to find the images folder in Google's index.

How can I check whether it is working properly? I don't want the images folder to be indexed or displayed in web search any more.
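One rough way to check the server side is to request the folder URL and see whether a listing page still comes back. The sketch below is in Python and makes some assumptions: the function names are hypothetical, and it relies on the default Apache listing page having a title that starts with "Index of /", which a customized server may not use. It says nothing about how quickly Google drops results it has already indexed.

```python
import urllib.error
import urllib.request


def looks_like_directory_listing(status: int, body: str) -> bool:
    """Heuristic: a 200 response whose title starts with 'Index of /'."""
    return status == 200 and "<title>index of /" in body.lower()


def check_folder(url: str) -> str:
    """Fetch `url` and describe whether the folder listing appears to be exposed."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            status = response.status
            body = response.read(65536).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 403 or 404 here is what a restricted folder is expected to return.
        return f"blocked with HTTP {err.code}"
    except urllib.error.URLError as err:
        return f"could not reach the server: {err.reason}"
    if looks_like_directory_listing(status, body):
        return "directory listing is still exposed"
    return f"HTTP {status}, no directory listing detected"
```

Calling check_folder with the images folder URL should report a blocked response once Options -Indexes is active.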

CommercePundit

I am going to add the following code to my htaccess file.

Options -Indexes

Will it work for me or not?

RyanKent

If you have a development team, they should instantly understand the problem.

A simple e-mail to any developer

E-mail title: Please fix

http://www.lampslightingandmore.com/images/

That's it. No other text should be needed. A developer should be able to look at the page and understand the index was left open and how to fix it. If you wish to be nicer then a simple "my index is open for the world to see, please don't allow public access to my server folders" should suffice.

CommercePundit

Yes, I have a similar problem with my code structure. Yesterday I set relative paths for all URLs, but I am not sure how to replace the image paths in the code after renaming the folder.

So I don't want to go that way. I also discussed it with my development team, and they recommend going with the htaccess method.

But they cautioned me to follow the specific method carefully, otherwise it may create big issues for crawling or indexing. Right?

RyanKent

The link you shared is perfect. Near the top there is a link for OPTIONS. Click on it and you will be on this page: http://httpd.apache.org/docs/1.3/mod/core.html#options

I want to very clearly state you should not make changes to your .htaccess file unless you are comfortable working with code. The slightest mistake and your entire site becomes unavailable. You can also damage the security of your site.

With that said, if you decide to proceed anyway you can add the text I shared to the top of your .htaccess file. You definitely should BACK UP the file before making any changes.
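The back-up-then-edit step could be sketched like this in Python. The function name prepend_directive and the .bak suffix are hypothetical choices for illustration, and the sketch assumes you can run a script against the file directly; editing by hand after making a copy achieves the same thing.

```python
import shutil
from pathlib import Path


def prepend_directive(htaccess_path: str, directive: str = "Options -Indexes") -> bool:
    """Back up the file, then put `directive` on its first line.

    Returns False and leaves the file untouched if the directive is already present.
    """
    path = Path(htaccess_path)
    original = path.read_text(encoding="utf-8")
    if any(line.strip() == directive for line in original.splitlines()):
        return False
    # Keep a copy next to the original, e.g. ".htaccess.bak", before changing anything.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(directive + "\n" + original, encoding="utf-8")
    return True
```

If the site misbehaves after the change, copying the .bak file back over .htaccess restores the previous state.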

The suggestion vishalkhialani made was to rename your /images folder to something else, perhaps /image. The problem is that if your site is not dynamically coded, you would break your image links.

vishalkhialani

In addition to what Ryan mentioned, I would rename that folder on your server. That will make Google's index outdated, and you won't get any visitors to that folder on the server.

CommercePundit

I don't follow you.

vishalkhialani

Also, you can rename it so that when Google's index shows the old results, you won't get any hits.

If that's what you want.

CommercePundit

Yes, I checked this article to learn more about it.

http://httpd.apache.org/docs/1.3/howto/htaccess.html

But I was not able to find my solution there. Can you suggest a specific article that would help me further in the same direction?

RyanKent

Hello.

You have left your site open in a manner which is not recommended. Please take a look at the following URL: http://www.lampslightingandmore.com/images/. On a properly secured server, you should receive a 404 Page Not Found or Access Denied type of error. Since the folder is left open, a Google crawler found it and you are seeing the results.

The means to secure your site varies based on your software configuration. If you are on an Apache web server (the most common setup) then these settings are controlled by your htaccess file. I am not an htaccess expert but I believe adding the following code to your .htaccess file at the top will fix the issue:

Options -Indexes
