How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools and they were pointing to directories that do not have an actual page, but hold information.
Ex: there are links pointing to our PDF folder which holds all of our pdf documents. If i type in , example.com/pdf/ it brings up a unformated webpage that displays all of our PDF links.
How do I prevent this from happening. Right now I am blocking these in my robots.txt file, but if i type them in, they still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return a 404 error (if you haven't done any redirecting/canonicalizing). This will increase your 404 count in Web Master tools but it's far preferable to the alternative. If you're not redirecting the robots.txt will eventually work and hopefully the links will just fall out of WMT.
-
My hosting company turned off directory browsing and now everything is how it should be. So to my understanding, if the server sees a file that does not have a index file, it should not be view able and should be forbidden. This shoujld not affect us from an SEO standpoint should it? My hosting company said they disabled all directories in our site, however everything still works, except for the forbidden file directories.
-
Basically it shouldn't really have an affect; those unformatted file listings are literally the web server automatically saying 'here's the files that are in this folder', there's no meta tags, description, on page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, Robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
-
Will turning of directory browsing affect Search for all directories?
-
I really don't want to 301 redirect them as they are just holding files. This is happening with my includes file too. that holds our header, footer, navigation etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and modifying your htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
-
Blocking it in robots.txt will work to hide it from search engines.
If you want to hide it from users or people to who type in the url, you can simply drop a blank "index.html" in the /pdf folder.
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
What might be more of a concern, is it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to because it gives a potential attacker insight into your system setup. Here's an example how to do it in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like I can confirm if you have open directories if you give me the link, either here or through private message.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Image Search - Is there a way to influence the related icons at the top of the image search results?
Google recently added related icons at the top of the image search results page. Some of the icons may be unrelated to the search. Are there any best practices to influence what is positioned in the related image icons section? Thank you.
Intermediate & Advanced SEO | | JaredBroussard1 -
Mapping ALL search data for a broad topic
Hi All As our company becomes a bigger and bigger entity I'm trying to figure out how I can create more autonomy. One of the key areas that needs fixing is briefing the writers on articles based on keywords. We're not just trying to go after the low hanging fruit or the big money keywords but actually comprehensively cover every topic and provide actual good quality up to date info (surprisingly rare in a competitive niche) and eventually cover pretty much every topic there is. We generally work on a 3 tier system on a folder level, topics and then sub-topics. The challenge is getting an agency to: a) be able to pull all of the data without being knowledgeable in our specific industry. We're specialists and, thus, target people that need specialist expertise as well as more mainstream stuff (the stuff that run of the mill people wouldn't know about). b) know where it all fits topically as we kind of organise the content on a heirarchy basis. And we generally cover multiple smaller topics within articles. Am I asking for the impossible here? It's the one area of the business I feel the most nervous about creating autonomy with. Can we become be as extensive and comprehensive as a wiki-type website without having somebody within the business that knows it providing the keyword research. I did a searh for all data using the main two seed keywords for this subject on ahrefs and it came up with 168000 lines of spreadsheet data. Obviously this went way beyond the maximum I was allowed to export. Interested in feedback and, if any agencies are up for the challenge, do let me know! I've been using moz pro for a long time but have never posted and apologise if what I'm describing is being explained badly here. Requirements Keywords to cover all (broad niche) related queries in the UK, no relevant uk (broad niche) keywords will be missed Organised in a way that can be interpreted as article brief and folder structure instructions. Questions How would you ensure you cover every single keyword? Assuming no specialist X knowledge, how will you be able to map content and know which search queries belong in which topics and in what order. Also (where there is keyword leakage from other regions) how will you know which are UK terms and which aren’t? With minimal X knowledge – how will you know whether you’ve missed an opportunity or not (what you don’t know you don’t know) What specific resources will you require from us in order for this to work? What format will the data be provided to us in - how will you present the finished work so that it can be turned into article briefs?
Intermediate & Advanced SEO | | d.bird0 -
Localized Domain Issue - Can I use Search Console to solve this?
Struggling through trying to resolve a complicated search issue - would appreciate any community input or suggestions. The Background Info We have several brand sites and each one has both a .ca and .com domain. For some reason, our website platform was created in a way that hundreds of pages on the .com domain have an equivalent page on the .ca domain, which are all 301'ed to the appropriate .com pages. Example below for clarity: www.domain.ca/gadget/brand - 301 Redirected to: www.domain.com/gadget/brand www.domain.ca/gadget/en/brandcanada = Proper .ca Canadian URL (where en is the language - fr exists as well) The Problem Because these .com pages exist under the .ca domain as well, they have started to outrank the correct .ca pages on Google. This has led to Canadian customers finding incorrect information, pricing, and reviews for these products - causing all sorts of customer service issues and therefore affecting our sales. I am being told that to properly fix the issue, and remove the incorrect URLs under the .ca domain would be prohibitively expensive in terms of resources, so I'm left trying to fix this via means available to me (i.e. anything but a change to how the platform is currently setup). The Attempted Fix I've submitted proper sitemaps for the .ca brand sites, and we have also created a robots.txt file to be accessed only when the site is crawled through the .ca domain. In that robots.txt, we have Disallowed crawling of any /gadget/brand/ URLs for the .ca domain. This was done a week ago and I am still seeing the .com URL show up in search results. The Question Should I be submitting any www.brand.ca/gadget/brand/ URLs to be temporarily removed from Google? Because of the 301 redirect in place from www.brand.ca/gadget/brand to www.brand.com/gadget/brand, I am hesitant to do so, as I do not want the .com URL removed. Will Google simply remove the .ca URL and not follow the 301 redirect to remove that URL as well? Any additional insight or feedback would be awesome as well.
Intermediate & Advanced SEO | | Trevor-O0 -
Search Causing Duplicate Content
I use Opencart and have found that a lot of my duplicate content (mainly from Products) which is caused by the Search function. Is there a simple way to tell Google to ignore the Search function pathway? Or is this particular action not recommended? Here are two examples: http://thespacecollective.com/index.php?route=product/search&tag=cloth http://thespacecollective.com/index.php?route=product/search
Intermediate & Advanced SEO | | moon-boots0 -
Targeting two search terms with same intent - one or more pages for SEO benefits?
I'd like some professional opinions on this topic. I'm looking after the SEO for my friends site, and there are two main search terms we are looking to boost in search engines. The company sells Billboard advertising space to businesses in the UK. Here are the two search terms we're looking to target: Billboard Advertising - 880 searches P/M Outdoor Advertising - 720 searches P/M It would usually make sense to make a separate page to target the keyword "billboard advertising" on its own fully optimised landing page with more information on the topic and with a targeted URL: www.website.com/billboard-advertising/ and the homepage to target "outdoor advertising" as it's an outdoor advertising agency. But there's a problem, as both search terms are highly related and have the same intent, I'm worried that if we create a separate page to target the billboard advertising, it will conflict with the homepage targeting outdoor advertising. Also, the main competitors who are currently ranked position 1-3, are ranking with their home pages and not optimised landing pages to target the exact search term "billboard advertising". Any advice on this?
Intermediate & Advanced SEO | | Jseddon920 -
Finding Ranking for search term and increasing ranking
Hi. The company that I'm working with would like to rank highly in google for certain generic search terms (dentist, dentists, etc.). Certain websites the company has used to rank highly in google for generic keywords, but has not for years now since google has revised their algorithm so many times. Moz lists that the company websites are not found in the top 51+ results in google. My first question is: **Is there a way, apart from manually searching the results, to find the ranking position of the website in google? **Ideally, I would like to find a program that will do this. Second, I've been reading a lot of the great articles and comments on Moz, and I've been learning a lot more about SEO. My focus has shifted to spending more attention on User Experience and Social Media instead of placing the exact keywords in the pages / tags of the website. What area(s) should I be focusing on to best increase the ranking of the company website for certain generic terms? Ideally, I'd like to create good quality content, so that users will not instantly click away. I appreciate any thoughts or comments. Thank you in advance!
Intermediate & Advanced SEO | | americasmiles0 -
Search traffic decline after redesign and new URL
Howdy Mozzers I’ve been a Moz fan since 2005, and been doing SEO since. This is my first major question to the community! I just started working for a new company in-house, and we’ve uncovered a serious problem. This is a bit of a long one, so I’m hoping you’ll stick it out with me! ***Since the images aren't working, here's a link to the google doc with images. https://docs.google.com/document/d/1I-iLDjBXI4d59Kl3uRMwLvpihWWKF3bQFTTNRb1R3ZM/edit?usp=sharing Background The site has gone through a few changes in the past few years. Drupal 5 and 6 hosted at bcbusinessonline.ca and now on Drupal 7 hosted at bcbusiness.ca. The redesigned responsive design site launched on January 9th, 2013. This includes changing the structure of the URL’s, such as categories, tags, and articles. We submitted a change of address through GWT shortly after the change. Problem Organic site traffic is down 50% over the last three months. Below, Google analytics, and Google Webmaster Tools shows the decline. *They used the same UA number for Google analytics, so that’s why the data is continuous
Intermediate & Advanced SEO | | Canada_wide_mediaOrganic traffic to the site. January 2011 - Dips in January are because of the business crowd on holidays.
Google Webmaster Tools data exported for bcbusiness.ca starting as far back as I could get. Redirects During the switch, the site went from bcbusinessonline.ca to bcbusiness.ca. They were implemented as 302’s on January 9th, 2013 to test, then on January 15th, they were all made 301’s. Here is how they were set up: Original: http://www.bcbusinessonline.ca/bcb/bc-blogs/conference/2010/10/07/11-phrases-never-use-your-resume --301-- http://www.bcbusiness.ca/bcb/bc-blogs/conference/2010/10/07/11-phrases-never-use-your-resume --301-- http://www.bcbusiness.ca/careers/11-phrases-never-to-use-on-your-resume Canonical issue On bcbusiness.ca, there are article pages (example) that are paginated. All of the page 2 to page N were set to the first page of the article. We addressed this issue by removing the canonical tag completely from the site on April 16th, 2013. Then, by walking through the Ayima Pagination Guide we decided for immediate and least work choice was to noindex, follow all the pages that simply list articles (example). Google Algorithm Changes (Penguin or Panda) According to SEOmoz Google Algorithm Changes there is no releases that could have impacted our site at the February 20th ballpark. However - Sitemap We have a sitemap submitted to Google Webmaster Tools, and currently have 4,229 pages indexed of 4,312 submitted. But there are a few pages we looked at that there is an inconsistency between what GWT is reporting and what a “site:” search reports. Why would the submit to index button be showing, if it’s in the index?
That page is in the sitemap. Updated: 2012-11-28T22:08Z Change Frequency: Yearly Priority: 0.5
*GWT Index Stats from bcbusiness.ca What we looked at so far The redirects are all currently 301’s GWT is reporting good DNS, Server Connectivity, and Robots.txt Fetch We don’t have noindex or nofollow on pages where we haven’t intended them to be. Robots.txt isn’t blocking GoogleBot, or any pages we want to rank. We have added nofollow to all ‘Promoted Content’ or paid advertising / advertorials We had TextLinkAds on our site at one point but I removed them once I satarted working here (April 1). Sitemaps were linking to the old URL, but now updated (April)
1 -
SEO Correlation Between Code and Search Engine Rankings
I posted this on my blog and wanted to get everyones opinion on this (http://palatnikfactor.com/2011/06/07/seo-correlation-between-code-and-search-engine-rankings/) I’m always looking to see what top ranking websites may be doing to get the rankings they do. One of the tasks of any SEO I guess is to really analyze competitors, right? I want to really stress that what I am writing here is completely opinion based and have not (due to time) validated this correlation enough but would like to get the discussion started. Nevertheless, I did enough research to see that there may be a correlation between code validation and top ranking websites, at least for certain queries where the number of real big players/brands is limited or non-existent. So, what do I mean? http://validator.w3.org/ validates code on websites. This tool shows you errors and warnings that may be making it harder for search engines to crawl your website. Looking at top competitors for certain niches, I was surprised to find that top sites had very few errors compared to 2+ page rankings. That’s not to say that all the sites on the first page had fewer errors (cleaner code) than websites in the 2<sup>nd</sup> page plus. However, again, top ranking websites for keywords that I was looking at had cleaner code which may have a correlation in regards to organic rankings. What’s your take? Does this have any effect in regards to SEO?
Intermediate & Advanced SEO | | PaulDylan0