How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools, and they were pointing to directories that do not have an actual page but do hold files.
Ex: there are links pointing to our PDF folder, which holds all of our PDF documents. If I type in example.com/pdf/, it brings up an unformatted webpage that displays all of our PDF links.
How do I prevent this from happening? Right now I am blocking these directories in my robots.txt file, but if I type the URLs in, the listings still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return a 404 error (if you haven't done any redirecting/canonicalizing). This will increase your 404 count in Webmaster Tools, but it's far preferable to the alternative. If you're not redirecting, the robots.txt will eventually work and hopefully the links will just fall out of WMT.
-
My hosting company turned off directory browsing and now everything is how it should be. So, to my understanding, if the server sees a directory that does not have an index file, it should not be viewable and should be forbidden. This should not affect us from an SEO standpoint, should it? My hosting company said they disabled browsing for all directories in our site; everything still works, except that the file directories are now forbidden.
-
Basically it shouldn't really have an effect; those unformatted file listings are literally the web server automatically saying 'here are the files that are in this folder'. There are no meta tags, description, on-page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
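If you want to confirm for yourself whether a folder is still exposing a listing, here is a rough sketch of a check. It assumes the server is Apache with its default auto-generated listing page, whose title starts with "Index of /"; other servers (IIS, nginx) title their listings differently, so treat the pattern as a heuristic. The function names are made up for illustration.

```python
import re
import urllib.error
import urllib.request
from typing import Tuple

# Assumption: Apache's default listing page has a title like "Index of /pdf".
LISTING_TITLE = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def looks_like_directory_listing(status: int, body: str) -> bool:
    """Return True if a response looks like an auto-generated file listing."""
    return status == 200 and bool(LISTING_TITLE.search(body))


def fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a URL and return (status code, body), including for 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urlopen raises on 403/404; the status code is what we care about here.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Something like `looks_like_directory_listing(*fetch("http://example.com/pdf/"))` should come back `False` once directory browsing is off, since the server then answers with an error status instead of the listing.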
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, Robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
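To see what a robots.txt rule actually covers before relying on it, a small sketch with Python's standard `urllib.robotparser` can help. The robots.txt contents and the file name `guide.pdf` below are assumptions for illustration, not the poster's real file. Note that this parser does simple prefix matching and only tells you what a well-behaved crawler is asked to skip; it says nothing about what is already indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /pdf/ folder comes from the question above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def crawl_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/pdf/",
        "http://example.com/pdf/guide.pdf",  # hypothetical file name
        "http://example.com/index.html",
    ):
        print(url, crawl_allowed(ROBOTS_TXT, url))
```

The thing to notice is that a `Disallow: /pdf/` rule blocks the PDFs inside the folder as well as the listing page, which is probably not what you want if the PDFs themselves should be indexed.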
-
Will turning off directory browsing affect search for all directories?
-
I really don't want to 301 redirect them, as they are just holding files. This is happening with my includes folder too, which holds our header, footer, navigation, etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and modifying your htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
-
Blocking it in robots.txt will work to hide it from search engines.
If you want to hide it from users or people who type in the URL, you can simply drop a blank "index.html" in the /pdf folder.
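If there are several folders to cover, the blank-index trick can be scripted. This is only a sketch: the list of index file names is an assumption (your server's configured index names may differ), and `ensure_blank_index` is a made-up helper name.

```python
from pathlib import Path

# Assumption: the server treats any of these as the directory's index page.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def ensure_blank_index(directory: Path) -> bool:
    """Create an empty index.html in `directory` unless an index file exists.

    Returns True if a blank file was created, False if one was already there.
    """
    if any((directory / name).exists() for name in INDEX_NAMES):
        return False
    (directory / "index.html").touch()
    return True
```

Calling `ensure_blank_index(Path("/path/to/site/pdf"))` (path is a placeholder) leaves any existing index file alone, so it is safe to run more than once.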
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
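To make the 301 idea concrete, here is a toy sketch using Python's built-in `http.server`. It is not how you would do it on a production Apache or IIS box (there you would use the server's own redirect configuration); it only shows what the response looks like. The `/pdf/` to `/pdf.htm` mapping and the names `REDIRECTS`, `redirect_target`, and `RedirectHandler` are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical mapping from bare directory URLs to real pages.
REDIRECTS = {"/pdf/": "/pdf.htm"}


def redirect_target(path: str) -> Optional[str]:
    """Return the page a directory path should redirect to, if any."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        # 301 tells browsers and crawlers the move is permanent.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


# To try it locally (blocks until interrupted):
# HTTPServer(("127.0.0.1", 8000), RedirectHandler).serve_forever()
```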
What might be more of a concern is that it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to, because it gives a potential attacker insight into your system setup. Here's an example of how to turn it off in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like I can confirm if you have open directories if you give me the link, either here or through private message.