Robots.txt question
-
I want to block spiders from a specific part of my website (say, the abc folder).
In robots.txt, I have to write:
User-agent: *
Disallow: /abc/
Do I have to include the trailing slash, or will this do?
User-agent: *
Disallow: /abc
-
I will do so. And hope to get that back.
-
If you contact the help desk, they can probably help you get your old account back.
-
I am the same person as the user seoug, but I lost that account, so I had to start afresh! I was a PRO member, but I accidentally deleted that account (it was not intentional). Now, when I try logging in, I get a message that the name seoug is already taken.
-
Thanks for clearing my doubts.
-
At least our answers agree, so now Atul is doubly sure of how to do it...
-
EGOL does it to me all the time!
-
Hi Atul,
Add the trailing slash.
/abc could be a page URL, whereas /abc/ is definitely a folder.
http://www.robotstxt.org/robotstxt.html <-- Everything you ever wanted to know about robots.txt
Regards
Aran
[EDIT: Damn it, Ryan submitted whilst I was answering! Must type faster]
-
Use the trailing slash.
More about robots.txt can be learned at this site: http://www.robotstxt.org/
The trailing slash indicates you are blocking a folder. Disallow rules are matched as path prefixes, so without the slash the rule would still cover the folder, but it would also match any other URL whose path starts with /abc, such as a page named /abc.html or a folder named /abcdef/. The trailing slash keeps the block limited to the folder and its contents.
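If you want to see the difference between the two rules for yourself, here is a rough sketch using the robots.txt parser that ships with Python (urllib.robotparser). The helper name is_blocked and the host example.com are made up for illustration, and this only shows how this one parser reads the rules; individual crawlers may not behave identically.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if 'Disallow: <rule_path>' (for all agents) blocks url_path.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # Build a two-line robots.txt in memory instead of fetching one.
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    # example.com is a placeholder host; only the path is compared.
    return not parser.can_fetch("*", "http://example.com" + url_path)


if __name__ == "__main__":
    paths = ["/abc", "/abc/", "/abc/page.html", "/abc.html", "/abcdef/"]
    for rule in ("/abc/", "/abc"):
        for path in paths:
            print(f"Disallow: {rule:<6} {path:<16} blocked={is_blocked(rule, path)}")
```

With this parser, the rule with the trailing slash only blocks paths under /abc/, while the rule without it also blocks /abc.html and /abcdef/, because the rule is compared against the start of the path.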