How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools, and they were pointing to directories that do not have an actual page but hold information.
Ex: there are links pointing to our PDF folder, which holds all of our PDF documents. If I type in example.com/pdf/, it brings up an unformatted webpage that displays all of our PDF links.
How do I prevent this from happening? Right now I am blocking these in my robots.txt file, but if I type them in, they still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return an error, typically a 404 or a 403 Forbidden depending on the server (if you haven't done any redirecting/canonicalizing). This will increase your error count in Webmaster Tools, but it's far preferable to the alternative. If you're not redirecting, the robots.txt block will eventually work and hopefully the links will just fall out of WMT.
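To confirm what the server actually returns for a directory URL, you can request it and look at the status code. Below is a minimal sketch using only Python's standard library; the helper names `fetch_status` and `describe_status` are made up for illustration, and the descriptions assume a typical server setup rather than anything specific to your host.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def describe_status(code):
    """Map a status code to what it means for a directory URL."""
    if code == 200:
        return "served (an index page, or a directory listing if browsing is on)"
    if code == 403:
        return "forbidden (typical when listing is off and there is no index file)"
    if code == 404:
        return "not found"
    return f"other status {code}"
```

For example, `describe_status(fetch_status("http://example.com/pdf/"))` would report on the placeholder directory from the question; swap in your own URL.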
-
My hosting company turned off directory browsing and now everything is how it should be. So to my understanding, if the server sees a directory that does not have an index file, it should not be viewable and should be forbidden. This should not affect us from an SEO standpoint, should it? My hosting company said they disabled browsing for all directories on our site; however, everything still works, except for the now-forbidden file directories.
-
Basically it shouldn't really have an effect; those unformatted file listings are literally the web server automatically saying 'here are the files that are in this folder'. There are no meta tags, descriptions, on-page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, Robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
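If you want to see which URLs a robots.txt rule would block from crawling before you rely on it, Python's standard `urllib.robotparser` can evaluate rules locally. This is only a sketch: the rules and URLs below are hypothetical examples based on the /pdf/ and /includes/ folders mentioned in this thread, and the parser does plain prefix matching, which may differ from how a particular search engine interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, not taken from anyone's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
Disallow: /includes/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for url in (
    "https://example.com/pdf/",
    "https://example.com/pdf/brochure.pdf",
    "https://example.com/about.html",
):
    allowed = parser.can_fetch("Googlebot", url)
    print(url, "crawlable" if allowed else "blocked")
```

Note that disallowing the folder also blocks every PDF inside it, which matters if you want the actual PDFs indexed.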
-
Will turning off directory browsing affect search for all directories?
-
I really don't want to 301 redirect them, as they are just holding files. This is happening with my includes folder too, which holds our header, footer, navigation, etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and modifying your htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
-
Blocking it in robots.txt will work to keep search engines from crawling it.
If you want to hide it from users or people who type in the URL, you can simply drop a blank "index.html" in the /pdf folder.
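If there are several folders like this, the blank-index trick can be scripted. A rough sketch, assuming you have file access to the site root and that your server treats `index.html` as an index file; the function name `add_blank_index` is made up for illustration.

```python
from pathlib import Path


def add_blank_index(root, filename="index.html"):
    """Create an empty index file in root and in each subdirectory lacking one.

    Existing index files are left untouched. Returns the paths created.
    """
    root = Path(root)
    directories = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    created = []
    for directory in directories:
        index = directory / filename
        if not index.exists():
            index.touch()  # zero-byte file, so the URL shows a blank page
            created.append(index)
    return created
```

Calling `add_blank_index("pdf")` from the site root would cover the /pdf folder and anything nested under it. Turning off directory browsing on the server is still the more complete fix, since this only covers folders the script has touched.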
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
What might be more of a concern is that it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to, because it gives a potential attacker insight into your system setup. Here's an example of how to turn it off in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like I can confirm if you have open directories if you give me the link, either here or through private message.