Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. We are not completely removing the content, and many posts can still be viewed, but we have locked both new posts and new replies.
Robots.txt: How to block a specific file type in several subdirectories?
-
Hello everyone!
I need help setting up a robots.txt file.
I'm trying to block all PDF files in particular directories, so I'm using the rule below. In this example, the line blocks all .gif files on the entire site.
Block files of a specific file type (for example, .gif) | Disallow: /*.gif$
2 questions:
- Can I use this rule to target one particular directory in which I want to block PDF files? Will this line be recognized by Googlebot?
Disallow: /fileadmin/xxxxxxx/xxx/xxxxxxx/*.pdf$
- Then I realized that I would have to write as many lines as there are directories in which I want to block PDF files.
Let's say I want to block PDF files in all 3 of these directories:
/fileadmin/directory1
/fileadmin/directory1/sub1
/fileadmin/directory1/sub1/pdf
Is there a pattern-matching rule I could use to block access to PDF files in all subdirectories, instead of writing the above line once for each of the 3 subdirectories? For example:
Disallow: /fileadmin/directory1*/
Many thanks in advance for any insight you may have.
-
Hey, thank you for your answer, I really appreciate it.
-
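Use this rule to block every PDF on the site:
Disallow: /*.pdf$
If you want to block PDFs in only one folder, use this:
Disallow: /folder1/*.pdf$
robots.txt patterns are not regular expressions: * matches any sequence of characters, including /, and $ marks the end of the URL. The second rule therefore also covers PDFs in every subfolder of /folder1/. To block .gif files as well, add a separate line such as Disallow: /folder1/*.gif$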
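If you want to check which URLs a wildcard rule would catch before deploying it, here is a minimal Python sketch. It assumes the matching behaviour Google documents for robots.txt (* matches any run of characters including /, a trailing $ anchors the end of the URL, everything else is a literal prefix match). The helper names rule_to_regex and is_blocked, and the sample file names, are made up for illustration; this is not how any crawler is actually implemented.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters (including '/'),
    a trailing '$' anchors the end of the URL, and the rest is literal.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile("^" + body + ("\\Z" if anchored else ""))


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if url_path falls under the given Disallow pattern."""
    return rule_to_regex(rule_path).match(url_path) is not None


if __name__ == "__main__":
    rule = "/fileadmin/directory1/*.pdf$"
    # Hypothetical file names in the three directories from the question.
    samples = [
        "/fileadmin/directory1/a.pdf",
        "/fileadmin/directory1/sub1/b.pdf",
        "/fileadmin/directory1/sub1/pdf/c.pdf",
        "/fileadmin/directory1/page.html",
        "/fileadmin/directory2/d.pdf",
    ]
    for path in samples:
        print(path, is_blocked(rule, path))
```

Under those assumptions, the single rule /fileadmin/directory1/*.pdf$ covers PDFs in all three directories from the question, while /fileadmin/directory1*/ has no file-type restriction and would block everything under that folder, not just PDFs.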