Robots.txt question
-
I want to block spiders from specific specific part of website (say abc folder).
In robots.txt, i have to write -
User-agent: *
Disallow: /abc/
Shall i have to insert the last slash. or will this do
User-agent: *
Disallow: /abc
-
I will do so. And hope to get that back.
-
If you contact the help desk, they can probably help you get your old account back.
-
I am the same person with the username seoug, but lost that account. So, had to start afresh ! I was a PR0 member, but accidently deleted that account ( it was not intentional ). And now , when i tried login in, i get a message that seoug name is already taken.
-
Thanks for clearing my doubts.
-
at least our answers agree, so no Atul is doubley sure of how to do it...
-
EGOL does it to me all the time!
-
Hi Atul,
Add the trailing slash.
/abc could be a page url. Where as /abc/ is definitely a folder.
http://www.robotstxt.org/robotstxt.html <-- Everything you ever wanted to know about robots.txt
Regards
Aran
[EDIT: Damn it, Ryan submitted whilst I was answering! Must type faster ]
-
Use the trailing slash.
More about robots.txt can be learned at this site: http://www.robotstxt.org/
The trailing slash indicates you are blocking a folder. Without the slash the object would be considered a file (i.e. page). I am not sure what the result would be if you tried to block a folder without the trailing slash. Even if it worked it would not be the correct code and may lead to various bots treating it differently.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt - "File does not appear to be valid"
Good afternoon Mozzers! I've got a weird problem with one of the sites I'm dealing with. For some reason, one of the developers changed the robots.txt file to disavow every site on the page - not a wise move! To rectify this, we uploaded the new robots.txt file to the domain's root as per Webmaster Tool's instructions. The live file is: User-agent: * (http://www.savistobathrooms.co.uk/robots.txt) I've submitted the new file in Webmaster Tools and it's pulling it through correctly in the editor. However, Webmaster Tools is not happy with it, for some reason. I've attached an image of the error. Does anyone have any ideas? I'm managing another site with the exact same robots.txt file and there are no issues. Cheers, Lewis FNcK2YQ
Technical SEO | | PeaSoupDigital0 -
Question on Google's Site: Search
A client currently has two domains with the same content on each. When I pull up a Cached version of the site, I noticed that it has a Cache of the correct page on it. However, when I do a site: in Google, I am seeing the domain that we don't want Google indexing. Is this a problem? There is no canonical tag and I'm not sure how Google knows to cache the correct website but it does. I'm assuming they have this set in webmaster tools? Any help is much appreciated! Thanks!
Technical SEO | | jeff_46mile0 -
Sharing/hosting of content questions...
I just wanted to get opinion on some of the fundamentals and semantics of optimisation and content generation/distribution - your thoughts and opinions are welcome. OK, for example, lets assume (for illustration purposes) that I have a site - www.examplegolfer.com aimed at golfers with golf related content. The keywords I would like to optimise for are: golf balls golf tees lowering your golf handicap drive a golf ball further Now, I'm going to be creating informative, useful content (infographics, articles, how to guides, video demonstrations etc) centred around these topics/keywords, which hopefully our audience/prospects will find useful and bookmark, share and monition our site/brand on the web, increasing (over time) our position of these terms/keywords in the SERP's. Now, once I've researched and created my content piece, where should I place it? Let's assume it's an infographic - should this be hosted on an infographic sharing site (such as Visually) or on my site, or both? If it's hosted or embedded on my site, should this be in a blog or on the page I'm optimising for (and I've generated my keyword around)? For example, if my infographic is around golf balls, should this be embedded on the page www.examplegolfer.com/golf-balls (the page I'm trying to optimise) and if so, and it's also placed elsewhere around the internet (i.e on Visually for example), this could technically be seen as duplicated content as the infographic is on my site and on Visually (for example)? How does everyone else share/distribute/host their created content in various locations whilst avoiding the duplicated content issue? Or have I missed something? Also, how important is it to include my keyword (golf balls) in the pieces' title or anchor text? Or indeed within the piece itself? One final question - should the content by authoured/shared as the brand/company or an individual (spokesperson if you like) on behalf of the company (i.e. John Smith)? I'm all for creating great, interesting, useful content for my audience, however I want to ensure we're getting the most out of it as researching influencers, researching the piece and creating it and distributing it isn't a quick or easy job (as we all know!). Thoughts and comments welcome. Thanks!
Technical SEO | | Carl2870 -
Mobile or Responsive canonical question?
Hi guys We are in the process of expanding and are moving our site to magento enterprise. Today we met with a company pitching a seperate mobile site. While Im al for a mobile site in terms of look and user experience, from an seo point i dont believe and "m." domain is the best idea. However if we were to go with a mobile site, would adding canonical tags to the mobile urls pointing to the desktop urls be useful? For example m.trespass.co.uk/category-page has the canonical tag pointing to trespass.co.uk/category-page Im looking for someone who has direct experience wth this situation for one of their clients. Thanks Robert
Technical SEO | | Trespass0 -
Confirming Robots.txt code deep Directories
Just want to make sure I understand exactly what I am doing If I place this in my Robots.txt Disallow: /root/this/that By doing this I want to make sure that I am ONLY blocking the directory /that/ and anything in front of that. I want to make sure that /root/this/ still stays in the index, its just the that directory I want gone. Am I correct in understanding this?
Technical SEO | | cbielich0 -
Robots.txt issue - site resubmission needed?
We recently had an issue when a load of new files were transferred from our dev server to the live site, which unfortunately included the dev site's robots.txt file which had a disallow:/ instruction. Bad! Luckily I spotted it quickly and the file has been replaced. The extent of the damage seems to be that some descriptions aren't displaying and we're getting a message about robots.txt in the SERPs for a few keywords. I've done a site: search and generally it seems to be OK for 99% of our pages. Our positions don't seem to be affected right now but obviously it's not great for the CTRs on those keywords affected. My question is whether there is anything I can do to bring the updated robots.txt file to Google's attention? Or should we just wait and sit it out? Thanks in advance for your answers!
Technical SEO | | GBC0 -
Google Knowledge Graph related question
I have a client who is facing age discrimination in the film industry. (Big surprise there.) The problem is, when you type in his name, Google's new Knowledge Graph displays a brief bio about him to the right of the search results. This bio snippet includes his year of birth. Wikipedia is credited as the source for the bio information about him, and yet, his Wikipedia entry doesn't include his age or birth date. Neither does his iMDb bio. So the question is, How can he figure out where Google is getting that birthdate from? He wants to try and remove it, not falsify it. Thanks for any help you can offer.
Technical SEO | | JamesAMartin0 -
Client accidently blocked entire site with robots.txt for a week
Our client was having a design firm do some website development work for them. The work was done on a staging server that was blocked with a robots.txt to prevent duplicate content issues. Unfortunately, when the design firm made the changes live, they also moved over the robots.txt file, which blocked the good, live site from search for a full week. We saw the error (!) as soon as the latest crawl report came in. The error has been corrected, but... Does anyone have any experience with a snafu like this? Any idea how long it will take for the damage to be reversed and the site to get back in the good graces of the search engines? Are there any steps we should take in the meantime that would help to rectify the situation more quickly? Thanks for all of your help.
Technical SEO | | pixelpointpress0