I want to block search bots from crawling all of my website's pages except for the homepage. Is this rule correct?
-
User-agent: *
Disallow: /*
-
-
Thanks Matt! I will surely test this one.
-
Thanks David! Will try this one.
-
Use this:
User-agent: Googlebot
Noindex: /
User-agent: Googlebot
Disallow: /
User-agent: *
Disallow: /
This is what I use to block our dev sites from being indexed and we've had no issues.
-
Actually, there are two pattern-matching characters that robots.txt can handle: the asterisk (*) and the dollar sign ($).
You should test this one. I think it will work (about 95% sure - tested in WMT quickly):
User-agent: *
Disallow: /
Allow: /$
-
I don't think that will work. Robots.txt doesn't handle regular expressions. To be super sure that nothing is indexed unless you want it to be found, you will have to explicitly list all of the folders and files.
This is kind of an odd question. I haven't thought about something like this in a while. I usually want everything but a couple folders indexed. : ) I found something that may be a little more help. Try reading this.
If you're working with extensions, you can use **Disallow: /*.html$** or php or what have you. That may get you closer to a solution.
Definitely test this with a crawler that obeys robots.txt.
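For a quick offline check before trying a real crawler, here is a minimal Python sketch of how the rules suggested in this thread would be evaluated. It assumes the precedence that Google documents and RFC 9309 describes: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The function names `pattern_to_regex` and `is_allowed` are made up for this example, and crawlers that only do simple prefix matching may behave differently.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Only '*' (any run of characters) and a trailing '$' (end of URL)
    are treated as special; every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs from one user-agent
    group. Assumed precedence: longest matching pattern wins, Allow wins
    a tie, and a path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# The rules discussed in the thread.
original_rule = [("Disallow", "/*")]
homepage_only = [("Disallow", "/"), ("Allow", "/$")]
block_html = [("Disallow", "/*.html$")]

for path in ["/", "/about", "/blog/post.html", "/?page=2"]:
    print(path, is_allowed(homepage_only, path))
```

Under these assumptions, the original `Disallow: /*` rule blocks the homepage along with everything else, while `Disallow: /` plus `Allow: /$` leaves only the bare `/` crawlable. Note that a homepage URL with a query string, such as `/?page=2`, does not match `/$` and stays blocked.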