Robots.txt question
-
I want to block spiders from specific specific part of website (say abc folder).
In robots.txt, i have to write -
User-agent: *
Disallow: /abc/
Shall i have to insert the last slash. or will this do
User-agent: *
Disallow: /abc
-
I will do so. And hope to get that back.
-
If you contact the help desk, they can probably help you get your old account back.
-
I am the same person with the username seoug, but lost that account. So, had to start afresh ! I was a PR0 member, but accidently deleted that account ( it was not intentional ). And now , when i tried login in, i get a message that seoug name is already taken.
-
Thanks for clearing my doubts.
-
at least our answers agree, so no Atul is doubley sure of how to do it...
-
EGOL does it to me all the time!
-
Hi Atul,
Add the trailing slash.
/abc could be a page url. Where as /abc/ is definitely a folder.
http://www.robotstxt.org/robotstxt.html <-- Everything you ever wanted to know about robots.txt
Regards
Aran
[EDIT: Damn it, Ryan submitted whilst I was answering! Must type faster ]
-
Use the trailing slash.
More about robots.txt can be learned at this site: http://www.robotstxt.org/
The trailing slash indicates you are blocking a folder. Without the slash the object would be considered a file (i.e. page). I am not sure what the result would be if you tried to block a folder without the trailing slash. Even if it worked it would not be the correct code and may lead to various bots treating it differently.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No index tag robots.txt
Hi Mozzers, A client's website has a lot of internal directories defined as /node/*. I already added the rule 'Disallow: /node/*' to the robots.txt file to prevents bots from crawling these pages. However, the pages are already indexed and appear in the search results. In an article of Deepcrawl, they say you can simply add the rule 'Noindex: /node/*' to the robots.txt file, but other sources claim the only way is to add a noindex directive in the meta robots tag of every page. Can someone tell me which is the best way to prevent these pages from getting indexed? Small note: there are more than 100 pages. Thanks!
Technical SEO | | WeAreDigital_BE
Jens0 -
What's wrong with this robots.txt
Hi. really struggling with the robots.txt file
Technical SEO | | Leonie-Kramer
this is it: User-agent: *
Disallow: /product/ #old sitemap
Disallow: /media/name.xml When testing in w3c.org everything looks good, testing is okay, but when uploading it to the server, Google webmaster tools gives 3 errors. Checked it with my collegue we both don't know what's wrong. Can someone take a look at this and give me the solution.
Thanx in advance! Leonie1 -
GWT returning 200 for robots.txt, but it's actually returning a 404?
Hi, Just wondering if anyone has had this problem before. I'm just checking a client's GWT and I'm looking at their robots.txt file. In GWT, it's saying that it's all fine and returns a 200 code, but when I manually visit (or click the link in GWT) the page, it gives me a 404 error. As far as I can tell, the client has made no changes to the robots.txt recently, and we definitely haven't either. Has anyone had this problem before? Thanks!
Technical SEO | | White.net0 -
Robots.txt file
How do i get Google to stop indexing my old pages and start indexing my new pages even months down the line? Do i need to install a Robots.txt file on each page?
Technical SEO | | gimes0 -
Google Places Question......
Hi Guys. I am working with a photographer they do not have a studio they shoot on location. However I noticed many photographers within their industry have their home address listed in their google places, and they too shoot on location. My client doesn't want their home address listed so I wondered what options there would be? Do you think renting mail forwarding address would suffice?
Technical SEO | | RankStealer0 -
BEST Wordpress Robots.txt Sitemap Practice??
Alright, my question comes directly from this article by SEOmoz http://www.seomoz.org/learn-seo/robotstxt Yes, I have submitted the sitemap to google, bing's webmaster tools and and I want to add the location of our site's sitemaps and does it mean that I erase everything in the robots.txt right now and replace it with? <code>User-agent: * Disallow: Sitemap: http://www.example.com/none-standard-location/sitemap.xml</code> <code>???</code> because Wordpress comes with some default disallows like wp-admin, trackback, plugins. I have also read other questions. but was wondering if this is the correct way to add sitemap on Wordpress Robots.txt http://www.seomoz.org/q/robots-txt-question-2 http://www.seomoz.org/q/quick-robots-txt-check. http://www.seomoz.org/q/xml-sitemap-instruction-in-robots-txt-worth-doing I am using Multisite with Yoast plugin so I have more than one sitemap.xml to submit Do I erase everything in Robots.txt and replace it with how SEOmoz recommended? hmm that sounds not right. User-agent: *
Technical SEO | | joony2008
Disallow:
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-login.php
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /comments **ERASE EVERYTHING??? and changed it to** <code> <code>
<code>User-agent: *
Disallow: </code> Sitemap: http://www.example.com/sitemap_index.xml</code> <code>``` Sitemap: http://www.example.com/sub/sitemap_index.xml ```</code> <code>?????????</code> ```</code>0 -
Sub-domains for keyword targeting? (specific example question)
Hey everyone, I have a question I believe is interesting and may help others as well. Our competitor heavily (over 100-200) uses sub-domains to rank in the search engines... and is doing quite well. What's strange, however, is that all of these sub-domains are just archives -- they're 100% duplicate content! An example can be seen here where they just have a bunch of relevant posts archived with excerpts. How is this ranking so well? Many of them are top 5 for keywords in the 100k+ range. In fact their #1 source of traffic is SEO for many of the pages. As an added question: is this effective if you were to actually have a quality/non-duplicate page? Thanks! Loving this community.
Technical SEO | | naturalsociety0 -
Robots.txt and 301
Hi Mozzers, Can you answer something for me please. I have a client and they have 301 re-directed the homepage '/' to '/home.aspx'. Therefore all or most of the linkjuice is being passed which is great. They have also marked the '/' as nofollow / noindex in the Robots.txt file so its not being crawled. My question is if the '/' is being denied access to the robots is it still passing on the authority for the links that go into this page? It is a 301 and not 302 so it would work under normal circumstances but as the page is not being crawled do I need to change the Robots.txt to crawl the '/'? Thanks Bush
Technical SEO | | Bush_JSM0