How to Disallow Specific Folders and Sub Folders for Crawling?
-
Today I checked how my website is indexed in Google and found a very interesting result. You can see it in the following Google search result.
I am aware of the robots.txt file and could disallow the images folder to solve this issue.
But that may prevent my images from appearing in Google image search.
So, how can I fix this issue?
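The concern above can be checked offline. This is a minimal sketch, assuming a hypothetical robots.txt that disallows /images/ for all crawlers; the file name lamp.jpg and the helper is_allowed are made up for illustration. It uses Python's standard urllib.robotparser, which may not match Google's own parser in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

BASE = "http://www.lampslightingandmore.com"

# The folder listing itself is blocked from crawling...
listing_allowed = is_allowed(ROBOTS_TXT, "Googlebot", BASE + "/images/")
# ...but so is every image file inside it (lamp.jpg is a made-up name).
image_allowed = is_allowed(ROBOTS_TXT, "Googlebot-Image", BASE + "/images/lamp.jpg")
# Pages outside the folder are unaffected.
page_allowed = is_allowed(ROBOTS_TXT, "Googlebot", BASE + "/index.html")

print(listing_allowed, image_allowed, page_allowed)
```

Under these assumed rules the image file is blocked along with the folder listing, which is why a blanket Disallow on the folder is a blunt tool when the goal is only to hide the listing.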
-
You can, but then the content will be removed from Google's index for 90 days. I am not sure what effect this would have on pages with the images. It shouldn't have any effect, but I would hate for you to have rankings in any way affected for 90 days.
I have no experience in having images indexed in this manner. Perhaps someone else has more knowledge to share on this topic.
-
Can I use the Remove URLs feature in Google Webmaster Tools?
-
I checked your URL: http://www.lampslightingandmore.com/images/. The folder is now properly restricted and the images can no longer be seen using this method. Going forward, Google will not be able to index new images in the same manner your other images were indexed.
With respect to the images which have been indexed, I am not certain how Google will respond. The image links are still valid so they may keep them. On the other hand, the links are gone so they may remove them. If it were my site, I would wait 30 days to see if Google removed the results.
Another way you can resolve the issue is to change the file path to your images from /images to /image. This will immediately break all the links. You would need to ensure all the links on your site are updated properly. It still may take Google a month to de-index those results but it would certainly happen in that case.
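Before renaming the folder, it may help to list every link that still points at the old path. This is a rough sketch using Python's standard html.parser; the class OldPathFinder, the function find_old_refs, and the sample markup are hypothetical. It only catches root-relative src and href values, so relative paths and references inside CSS or JavaScript would need a separate check.

```python
from html.parser import HTMLParser

class OldPathFinder(HTMLParser):
    """Collect src/href attribute values that start with the old folder prefix."""

    def __init__(self, old_prefix: str = "/images/"):
        super().__init__()
        self.old_prefix = old_prefix
        self.hits: list[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith(self.old_prefix):
                self.hits.append(value)

def find_old_refs(html: str, old_prefix: str = "/images/") -> list[str]:
    """Return all src/href values in html that still use old_prefix."""
    finder = OldPathFinder(old_prefix)
    finder.feed(html)
    return finder.hits

# Made-up page fragment: two links use the old folder, one uses the new one.
SAMPLE = (
    '<a href="/images/big.jpg"><img src="/images/lamp.jpg"></a>'
    '<img src="/image/new.jpg">'
)
refs = find_old_refs(SAMPLE)
print(refs)
```

Running this over each template or exported page gives a list of references to update before the old folder disappears.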
-
I have added Options -Indexes for the images folder in my .htaccess file.
But I can still find the images folder in Google's index.
How can I check whether it is working properly? I don't want the images folder indexed or displayed in web search any more.
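One way to check the server side is to request the folder URL and see whether an auto-generated listing still comes back. This is a minimal sketch, not a definitive test: the helper names are hypothetical, and matching on the "Index of /" page title is a heuristic that assumes Apache's default listing page. It tells you nothing about Google's index, which can lag behind the server.

```python
import urllib.error
import urllib.request

def listing_is_exposed(status: int, body: str) -> bool:
    """Heuristic: a 200 response whose title starts with 'Index of /'
    looks like Apache's default auto-generated directory listing."""
    return status == 200 and "<title>Index of /" in body

def fetch(url: str) -> tuple[int, str]:
    """Fetch url and return (status, body); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""

# Synthetic responses, so the check can be tried without touching the network.
open_listing = listing_is_exposed(200, "<html><head><title>Index of /images</title></head></html>")
forbidden = listing_is_exposed(403, "")
print(open_listing, forbidden)
```

To run it against the live site, pass the result of fetch("http://www.lampslightingandmore.com/images/") to listing_is_exposed. A 403 or 404 status means the listing is no longer served, even if Google still shows old results for a while.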
-
I am going to add the following code to my .htaccess file:
Options -Indexes
Will it work for me or not?
-
If you have a development team, they should instantly understand the problem.
A simple e-mail to any developer will do:
E-mail title: Please fix
http://www.lampslightingandmore.com/images/
That's it. No other text should be needed. A developer should be able to look at the page and understand the index was left open and how to fix it. If you wish to be nicer then a simple "my index is open for the world to see, please don't allow public access to my server folders" should suffice.
-
Yes, I have a similar problem with my code structure. Yesterday I set relative paths for all URLs, but I am not sure the image paths in the code would be updated correctly after changing the folder name.
So I don't want to go that way. I also discussed it with my development team, and they recommended the .htaccess method.
But they cautioned me to follow the exact method, otherwise it may create a big issue for crawling or indexing. Right?
-
The link you shared is perfect. Near the top there is a link for OPTIONS. Click on it and you will be on this page: http://httpd.apache.org/docs/1.3/mod/core.html#options
I want to very clearly state you should not make changes to your .htaccess file unless you are comfortable working with code. The slightest mistake and your entire site becomes unavailable. You can also damage the security of your site.
With that said, if you decide to proceed anyway you can add the text I shared to the top of your .htaccess file. You definitely should BACK UP the file before making any changes.
The suggestion vishalkialani made was to rename your /images folder to something else, perhaps /image. The problem is that if your site was not dynamically coded, you would break your image links.
-
In addition to what Ryan mentioned, I would rename that folder on your server. That will make Google's index outdated, and you won't get any visitors to that folder.
-
I don't follow you.
-
Also, you can rename it so that when Google's index shows those results you won't get any hits,
if that's what you want.
-
Yes, I read this article to learn more about it:
http://httpd.apache.org/docs/1.3/howto/htaccess.html
But I could not find my solution there. Can you suggest a specific article that would help me further in the same direction?
-
Hello.
You have left your site open in a manner which is not recommended. Please take a look at the following URL: http://www.lampslightingandmore.com/images/. On a properly secured server, you should receive a 404 Page Not Found or Access Denied type of error. Since the folder is left open, a Google crawler found it and you are seeing the results.
The means to secure your site varies based on your software configuration. If you are on an Apache web server (the most common setup), these settings are controlled by your .htaccess file. I am not an .htaccess expert, but I believe adding the following code at the top of your .htaccess file will fix the issue:
Options -Indexes