How to Disallow Specific Folders and Sub Folders for Crawling?
-
Today, I have checked indexing for my website in Google. I found very interesting result over there. You can check that result by following result of Google.
I aware about use of robots.txt file and can disallow images folder to solve this issue.
But, It may block my images to get appear in Google image search.
So, How can I fix this issue?
-
You can, but then the content will be removed from Google's index for 90 days. I am not sure what effect this would have on pages with the images. It shouldn't have any effect, but I would hate for you to have rankings in any way affected for 90 days.
I have no experience in having images indexed in this manner. Perhaps someone else has more knowledge to share on this topic.
-
Can I use Remove URL facility from Google webmaster tools?
-
I checked your URL: http://www.lampslightingandmore.com/images/. The folder is now properly restricted and the images can no longer be seen using this method. Going forward, Google will not be able to index new images in the same manner your other images were indexed.
With respect to the images which have been indexed, I am not certain how Google will respond. The image links are still valid so they may keep them. On the other hand, the links are gone so they may remove them. If it were my site, I would wait 30 days to see if Google removed the results.
Another way you can resolve the issue is to change the file path to your images from /images to /image. This will immediately break all the links. You would need to ensure all the links on your site are updated properly. It still may take Google a month to de-index those results but it would certainly happen in that case.
-
I have added Options -Indexes for images folder in htaccess file.
But, I still able to find out images folder in Google indexing.
Can I check? Is it working properly or not? I don't want to index or display images folder in web search any more.
-
I am going to add following code to my htaccess page.
Options -Indexes
Will it work for me or not?
-
If you have a development team, they should instantly understand the problem.
A simple e-mail to any developer
E-mail title: Please fix
http://www.lampslightingandmore.com/images/
That's it. No other text should be needed. A developer should be able to look at the page and understand the index was left open and how to fix it. If you wish to be nicer then a simple "my index is open for the world to see, please don't allow public access to my server folders" should suffice.
-
Yes, I have similar problem with my code structure. Yesterday, I have set Relative path for all URLs. But, I am not sure about replacing of image name in code after make change in folder.
So, I don't want to go with that manner. I also discussed with my development team and recommend to go with htaccess method.
But, give me caution to follow specific method otherwise it may create big issue for crawling or indexing. Right??
-
The link you shared is perfect. Near the top there is a link for OPTIONS. Click on it and you will be on this page: http://httpd.apache.org/docs/1.3/mod/core.html#options
I want to very clearly state you should not make changes to your .htaccess file unless you are comfortable working with code. The slightest mistake and your entire site becomes unavailable. You can also damage the security of your site.
With that said, if you decide to proceed anyway you can add the text I shared to the top of your .htaccess file. You definitely should BACK UP the file before making any changes.
The suggestion vishalkialani made was to rename your /images folder to something else, perhaps /image. The problem is that if your site was not dynamically coded, you would break your image links.
-
In addition to what Ryan mentioned I would rename that folder on your server. That will make google's index outdated and you won't get any visitors on the server
-
I can't getting you.
-
also you can rename it so when google 's index shows up the results you won't get any hits.
if thats what you want.
-
Yes, I checked article to know more about it.
http://httpd.apache.org/docs/1.3/howto/htaccess.html
But, I am not able to find my solution. Can you suggest me specific article which suppose to help me more in same direction?
-
Hello.
You have left your site open in a manner which is not recommended. Please take a look at the following URL: http://www.lampslightingandmore.com/images/. On a properly secured server, you should receive a 404 Page Not Found or Access Denied type of error. Since the folder is left open, a Google crawler found it and you are seeing the results.
The means to secure your site varies based on your software configuration. If you are on an Apache web server (the most common setup) then these settings are controlled by your htaccess file. I am not an htaccess expert but I believe adding the following code to your .htaccess file at the top will fix the issue:
Options -Indexes
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Client has an inexplicable jump in crawled pages being reported in Google Search Console
Recently a client of mine noticed an inexplicable jump in crawled pages being reported in Google Search Console. We researched the following culprits and found nothing: Rel=canonicals are put in place No SSL/non SSL duplication We used a tool to extrapolate search query page data from Google Search Insights; nothing unusual No dynamic pages being made on the website All necessary landing pages are in the XML sitemap Could this be a glitch in GSC? We are wondering what the heck is going on. 7eaeS
Intermediate & Advanced SEO | | BigChad20 -
Crawl Test Question
Good Morning, I am just looking for a little bit of advice, I ran a crawl report on our website www.swiftcomm.co.uk. I have resolved most of the issues myself, however I have two questions;- Screenshot image http://imgur.com/VlFEiZ2 Highlighted blue, we have two homepages www.swiftcomm.co.uk and www.swiftcomm.co.uk/ both are set with a Rel-Canonical Target of www.swiftcomm.co.uk/. Will this cause me any SEO issues and or other potential issue? If this may cause an issue how would I go about resolving? Highlighted yellow, Our contact and referral-form are showing as duplicate title and meta description. Both of these pages have separate title and meta desc which it does seem to be detecting. If I search the page in google it returns the correct title and meta desc. The only common denominator behind these pages is that both have php pages behind them for the contact form. Do you think that the moz crawl may be detecting the php page over the html? Could this be cause any issues when search engines crawl the site? Kind Regards Jonathan Mack VlFEiZ2
Intermediate & Advanced SEO | | JMack9860 -
Sub-domain or not???
Hi, We're setting up a forum for our users (our target audience responds extremely well to forums). I was wondering if it should be set up on a sub-domain or not. I'm leaning towards sub-domain, but our devs say this will impact how they approach it so I'd like to give them an answer asap so we can proceed with planning it! Thanks, Amelia
Intermediate & Advanced SEO | | CommT0 -
When you add 10.000 pages that have no real intention to rank in the SERP, should you: "follow,noindex" or disallow the whole directory through robots? What is your opinion?
I just want a second opinion 🙂 The customer don't want to loose any internal linkvalue by vaporizing link value though a big amount of internal links. What would you do?
Intermediate & Advanced SEO | | Zanox0 -
Why is SEO Moz only crawling my homepage?
When I send the SEO Bot to crawl my website, it only crawls the homepage? Does anyone know why this is happening? Thanks, Andy
Intermediate & Advanced SEO | | ChrisCurdDesign0 -
Why do old URL format are still being crawled by Rogerbot?
Hi, In the early days of my blog, I used permalinks with the following format: http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/ I then decided to change this format using .htaccess to this format: http://www.mysitesamp.com//heidi-cortez-photo-shoot/ My question is, why do rogerbot still crawls my old URL format since these urls' no longer exists in my website or blog.
Intermediate & Advanced SEO | | Trigun0 -
Best way to block a search engine from crawling a link?
If we have one page on our site that is is only linked to by one other page, what is the best way to block crawler access to that page? I know we could set the link to "nofollow" and that would prevent the crawler from passing any authority, and we can set the page to "noindex" to prevent it from appearing in search results, but what is the best way to prevent the crawler from accessing that one link?
Intermediate & Advanced SEO | | nicole.healthline0 -
Could Sub domains damage our SEO?
Hi there, We're currently looking into integrating a new internal search function to our site which will involve housing the search results on a sub domain of our site. We have no intention of these search result pages becoming landing pages for organic traffic but would the inclusion of a sub domain affect the optimization of the main domain? i.e. could it effect our authority? Nige
Intermediate & Advanced SEO | | NigelJ0