Robots.txt file
-
Does it serve any purpose, or can we safely omit the robots.txt file? If a spider has to read all the pages anyway, why do we add a robots.txt file at all?
-
As Ryan said, the robots.txt file is very useful when you want to block (disallow) some pages. If you don't want a spider to crawl a page at all, you must use robots.txt; a noindex tag lets a bot crawl the page but not index it. I have only a small website, but I still dropped a robots.txt into my root folder. Writing just Allow: / may be useless in practice, but it's a way of saying: "I respect the protocol."
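A minimal robots.txt in that "I respect the protocol" spirit might look like the following sketch; the wildcard user-agent applies the rule to every crawler that honors the protocol:

```
# Permit all compliant crawlers to fetch everything
User-agent: *
Disallow:
```

An empty Disallow line means "block nothing"; it is also more widely supported than Allow: /, which is an extension that not every crawler implements.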
-
A good source to learn about the robots.txt file is here: http://www.robotstxt.org/
The robots.txt file is completely optional. I don't use the file at all on small sites.
The file offers a means to block crawlers which choose to honor the file's instructions from crawling all or part of a site. It also provides the location of a sitemap.
That said, sitemaps are completely unnecessary for SEO, assuming your site has proper navigation. Even if you choose to use a sitemap, you can submit its location via WMT rather than the robots.txt file.
With respect to blocking areas of your site, the primary use is for CMS, forum, e-commerce, or other sites where the software is limited and does not allow the site owner to set noindex on individual pages.
As a rule, robots.txt should be used only as a means of last resort. In my experience the file is overused by site owners and SEOs. One exception where I do use a robots.txt is during a site's development, when I do not wish the site to be crawled at all.
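For that development scenario, the conventional robots.txt that asks every compliant crawler to stay out of the entire site is:

```
# Development only: ask all compliant crawlers to stay out
User-agent: *
Disallow: /
```

Remember to remove the rule at launch, and keep in mind it only deters well-behaved bots; it is not access control.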
Related Questions
-
What is robots.txt file issue?
I hope you are well. Moz keeps sending me a notification that my website can't be crawled, and it tells me to check the robots.txt file. Now the question is: how can I solve this problem, and what should I write in the robots.txt file? Here is my website: https://www.myqurantutor.com/ Need your help, brothers. Thanks in advance!
On-Page Optimization | matee.usman
-
[HELP!] File Name and ALT Tags
Hi, please answer my questions: 1. Is it okay to use the same keyword in both the file name and the alt tag when inserting an image? Example: file name: buy-lego-online.jpg; alt tag: buy-lego-online. Will it trigger Google Panda? Will I be penalized for that? Or should the file name and alt tag be different from each other? (When inserting an image in WordPress, the alt tag defaults to the same text as the file name.) 2. Say I have two images on a page (same topic/niche) and I use "cheap-lego-for-kids" and "best-lego-for-sale" as alt tags. Considering that I repeat the word "lego", is that considered keyword stuffing? Will I be penalized for that? Thanks in advance!
On-Page Optimization | bubblymaiko
-
Is it better to put all your CSS in 1 file or is it no problem to use 10 files or more like on most frameworks?
On-Page Optimization | conversal
-
Need suggestion: Should the user profile link be disallowed in robots.txt
I maintain a myBB-based forum here. The user profile links look something like this: http://www.learnqtp.com/forums/User-Ankur Now in my GWT, I can see many 404 errors for user profile links. This is primarily because we keep tight control over spam and auto-generated profiles created by bots. Either our moderators or our spam-control software deletes such spammy member profiles periodically, but by then Google has already indexed those profiles. I am wondering, would it be a good idea to disallow user profile links using robots.txt? Something like Disallow: /forums/User-*
On-Page Optimization | AnkurJ
-
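As a sketch of what that stanza could look like in a full robots.txt file:

```
# Keep compliant crawlers out of myBB user profile pages
User-agent: *
Disallow: /forums/User-
```

Robots.txt rules are prefix matches, so in crawlers that follow Google's interpretation the trailing * is redundant; Disallow: /forums/User- already covers every URL beginning with that path. Note too that blocking crawling does not remove the already-indexed profile URLs; they may linger in the index until removed by other means.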
When You Add a Robots.txt file to a website to block certain URLs, do they disappear from Google's index?
I have seen several websites recently that have far too many webpages indexed by Google, because for each blog post they publish, Google might index the following: www.mywebsite.com/blog/title-of-post www.mywebsite.com/blog/tag/tag1 www.mywebsite.com/blog/tag/tag2 www.mywebsite.com/blog/category/categoryA etc. My question is: if you add a robots.txt file that tells Google NOT to index pages in the "tag" and "category" folders, does that mean the previously indexed pages will eventually disappear from Google's index? Or does it just mean that newly created pages won't get added to the index? Or does it mean nothing at all? Thanks for any insight!
On-Page Optimization | williammarlow
-
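For reference, a robots.txt that blocks crawling of those tag and category folders might look like the sketch below. Keep in mind that blocking crawling is not the same as de-indexing: URLs that are already indexed can persist until Google drops them, or until they are removed with noindex or a removal request.

```
User-agent: *
# Block crawling of auto-generated archive pages
Disallow: /blog/tag/
Disallow: /blog/category/
```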
Problems with my .htaccess file
Hi all, I have two active domains whose content is the same. Regarding Google, some of you told me that I wouldn't be penalized, but that if I wanted to do a 301 redirect it should go from one domain to the domain of my main market. And I just did that. The problem is that although I have made the corresponding modifications in the .htaccess file, it doesn't work. When I type www.piensapiensa.com it goes to piensapiensa.com (as I configured in WMT) and not to www.piensapiensa.es, where my market mainly is. Here is the code from the .htaccess file:
Options +FollowSymLinks +ExecCGI
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www.piensapiensa.es [NC]
RewriteRule (.*) http://www.piensapiensa.es/$1 [R=301,L]
<IfModule mod_rewrite.c>
RewriteEngine On
# uncomment the following line, if you are having trouble
# getting no_script_name to work
#RewriteBase /
# we skip all files with .something
#RewriteCond %{REQUEST_URI} ..+$
#RewriteCond %{REQUEST_URI} !.html$
#RewriteRule .* - [L]
# we check if the .html version is here (caching)
RewriteRule ^$ index.html [QSA]
RewriteRule ^([^.]+)$ $1.html [QSA]
# no, so we redirect to our front web controller
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php [QSA,L]
</IfModule>
Thanks in advance.
On-Page Optimization | juanmiguelcr
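For reference only (not a diagnosis of the case above), a minimal host-canonicalization block conventionally looks like this; note that the rules must live in the .htaccess actually served for the .com domain, and that dots in the host regex are normally escaped so they match literally:

```
RewriteEngine On
# Redirect any host other than www.piensapiensa.es to it, preserving the path
RewriteCond %{HTTP_HOST} !^www\.piensapiensa\.es$ [NC]
RewriteRule ^(.*)$ http://www.piensapiensa.es/$1 [R=301,L]
```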
Site Maps / Robots.txt etc
Hi everyone, I have set up a sitemap using a WordPress plugin: http://lockcity.co.uk/site-map/ Can you please tell me if this is sufficient for the search engines? I am trying to understand the difference between this and having a robots.txt file, or do I need both? Many thanks, Abi
On-Page Optimization | LockCity
-
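The two files serve different jobs: a sitemap lists URLs you want crawled, while robots.txt restricts crawling. If you use both, a common pattern is to advertise the sitemap from robots.txt; the sketch below assumes an XML sitemap exists at the path shown, which is illustrative only (the plugin page linked above is an HTML site map, not necessarily an XML one):

```
User-agent: *
Disallow:

# Advertise the sitemap location to crawlers
Sitemap: http://lockcity.co.uk/sitemap.xml
```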
Duplication About PDF Files on Website
Hello, my site's URL (web address) is: http://www.vostastores.com/ We are in the process of upgrading our website, and as part of that we are adding full details for each and every product. One thing we are planning to do is put the manufacturer's product PDF files on our website; the manufacturer already has these files on their own website. So our question is: since the manufacturer has the files on their website and we want to add the same files to ours, will there be any duplication issue? If yes, please suggest a solution that would let us add them to our website anyway. Thanks & Regards.
On-Page Optimization | CommercePundit