Robot.txt : How to block a specific file type in several subdirectories ?
-
Hello everyone !
I need help setting up a robot.txt.
I'm trying to block all pdf files in particular directories so I'm using this command. In the example below the line is blocking all .gif in the entire site.
Block files of a specific file type (for example,
.gif
) | Disallow: /*.gif$2 questions :
- Can I use this command to specify one particular directory in which I want to block pdf files ? Will this line be recognized by googlebots ?
Disallow: /fileadmin/xxxxxxx/xxx/xxxxxxx/*.pdf$
- Then I realized that I would have to write as many lines as many directories there are in which I want to block pdf files.
Let's say I want to block pdf files in all these 3 directories
/fileadmin/directory1
/fileadmin/directory1/sub1
/fileadmin/directory1/sub1/pdf
Is there a pattern-matching rule I could use to blocks access to pdf files in all subdirectories instead of writing 3x the above line for each subdirectory ? For exemple :
Disallow: /fileadmin/directory1*/
Many thanks in advance for any insight you may have.
-
Hey thank you for your answer, really appreciate it.
-
Use this code -
Disallow: /*.f$
If you want to block only one folder then use this -
Disallow: /folder1/.*f$
This rule will help to block both files only .pdf and .gif
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I am looking for best way to block a domain from getting indexed ?
We have a website http://www.example.co.uk/ which leads to another domain (https://online.example.co.uk/) when a user clicks,in this case let us assume it to be Apply now button on my website page. We are getting meta data issues in crawler errors from this (https://online.example.co.uk/) domain as we are not targeting any meta content on this particular domain. So we are looking to block this domain from getting indexed to clear this errors & does this effect SERP's of this domain (**https://online.example.co.uk/) **if we use no index tag on this domain.
Technical SEO | | Prasadgotteti0 -
Specific vs. Home Page Backlinks
So, I'm getting ready to start a campaign to get some backlinks. Pretty sure this is a silly NOOB question, but what is better: To get backlinks directed to my home page To get backlinks directed to she specific product/topic being discussed in the backlink. Thanks in advance for any help. My GUESS on the whole topic is that linking to a specific product page from a backlink with anchor text is best practice. It will boost the Page Authority of that page while boosting the overall Domain Authority . . . a win win.
Technical SEO | | damon12121 -
What's wrong with this robots.txt
Hi. really struggling with the robots.txt file
Technical SEO | | Leonie-Kramer
this is it: User-agent: *
Disallow: /product/ #old sitemap
Disallow: /media/name.xml When testing in w3c.org everything looks good, testing is okay, but when uploading it to the server, Google webmaster tools gives 3 errors. Checked it with my collegue we both don't know what's wrong. Can someone take a look at this and give me the solution.
Thanx in advance! Leonie1 -
Difficulties in positiong a specific keyword
Hi, I am responsible for the blinds and curtains brand Kaaten and the its online store that operates in the Spanish market, during the last 6 months we have increasing substantially the traffic and improving the positioning of our main keywords (in most of the cases we postion right under Ikea and Leroy Merlin). However, there is a keyword that I cannot get to position, not even in the top 50 positions - that is the word "cortinas". I do not understand why is that happening as in a word like "estores" with similar monthly searches and competition we rank 3rd. Looking at the Home site and analysing both keywords the density is similar, so I do not see many differences there. I also cleaned up a bit the content to remove some "cortinas" keywords and thus avoid the keyword stuffing, but no results so far. Does any of you have any suggestion that what I can do to position better in that particular keyword? Look forward for your response and thanks for your help.
Technical SEO | | oriolbandalux0 -
How should I properly setup my .htaccess file?
I have searched google for 'how to setup .htaccess file' and it seems that every website has some variation. For example: RewriteCond %{HTTP_HOST} ^yoursite.com RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=permanent,L] On SEOMOZ someone posted this: RewriteCond %{HTTP_HOST} !^www.yoursite.com [NC] RewriteRule (.*) http://www.yoursite.com/$1 [L,R=301] On yet another website, I found this: RewriteEngine On RewriteCond %{HTTP_HOST} !^your-site.com$ [NC] RewriteRule ^(.*)$ http://your-site.com/$1 [L,R=301] As you can see there are slight differences. Which one do I use? I'm on Apache CentOS and I have HTML5 websites and several Joomla! wesites. Would the HTACCESS File be different for both?
Technical SEO | | maxduveen0 -
Allow or Disallow First in Robots.txt
If I want to override a Disallow directive in robots.txt with an Allow command, do I have the Allow command before or after the Disallow command? example: Allow: /models/ford///page* Disallow: /models////page
Technical SEO | | irvingw0 -
Search engines have been blocked by robots.txt., how do I find and fix it?
My client site royaloakshomesfl.com is coming up in my dashboard as having Search engines have been blocked by robots.txt, only I have no idea where to find it and fix the problem. Please help! I do have access to webmaster tools and this site is a WP site, if that helps.
Technical SEO | | LeslieVS0 -
Robots.txt usage
Hey Guys, I am about make an important improvement to our site's robots.txt we have large number of properties on our site and we have different views for them. List, gallery and map view. By default list view shows up and user can navigate through gallery view. We donot want gallery pages to get indexed and want to save our crawl budget for more important pages. this is one example of our site: http://www.holiday-rentals.co.uk/France/r31.htm When you click on "gallery view" URL of this site will remain same in your address bar: but when you mouse over the "gallery view" tab it will show you URL with parameter "view=g". there are number of parameters: "view=g, view=l and view=m". http://www.holiday-rentals.co.uk/France/r31.htm?view=l http://www.holiday-rentals.co.uk/France/r31.htm?view=g http://www.holiday-rentals.co.uk/France/r31.htm?view=m Now my question is: I If restrict bots by adding "Disallow: ?view=" in our robots.txt will it effect the list view too? Will be very thankful if yo look into this for us. Many thanks Hassan I will test this on some other site within our network too before putting it to important one's. to measure the impact but will be waiting for your recommendations. Thanks
Technical SEO | | holidayseo0