Robots.txt file
-
Does it serve any purpose if we omit robots.txt file ? I wonder if spider has to read all the pages, why do we insert robots.txt file ?
-
As Ryan said, robots.txt file is very useful when you wanna block (disallow) some pages. Indeed, if you don't want that spider crawls your page you must use robots.txt (noindex tags will let bot crawls, but not index, your page). I have got a small website but i dropped robots.txt in my folder. Maybe write just Allow: / could be useless, but you can say: "I respect protocols"
-
A good source to learn about the robots.txt file is here: http://www.robotstxt.org/
The robots.txt file is completely optional. I don't use the file at all on small sites.
The file offers a means to block crawlers which choose to honor the file's instructions from crawling all or part of a site. It also provides the location of a sitemap.
To that end, sitemaps are completely unnecessary for SEO assuming your site has proper navigation. Even if you choose to use a sitemap, you can offer the location via WMT rather then the robots.txt file.
With respect to blocking areas of your site, the primary use would be for CMS, forums, ecommerce or other sites where the software was limited and does not allow the site owner to use noindex on all pages.
As a rule, robots.txt should simply never be used except as a means of last resort. In my experience the file is overused by site owners and SEOs. One exception where I use a robots.txt is during a site's development when I do not wish the site to be crawled at all.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Using a dash or underscores in file names.
Is it better to use a dash or an underscore in file names to improve SEO? EX memory_flash.jpg or memory-flash.jpg Or does it make no difference?
On-Page Optimization | | Robotnik0 -
Internal Linking : File Name or URI/filename.html
Hi, For crawlability, what would be the best way to do internal linking. /aboutus.html or www.xyz.com/aboutus.com Would this make any difference to the website in terms of SEO or bot crawling the website? Regards, Sree
On-Page Optimization | | jungleegames0 -
When You Add a Robots.txt file to a website to block certain URLs, do they disappear from Google's index?
I have seen several websites recently that have have far too many webpages indexed by Google, because for each blog post they publish, Google might index the following: www.mywebsite.com/blog/title-of-post www.mywebsite.com/blog/tag/tag1 www.mywebsite.com/blog/tag/tag2 www.mywebsite.com/blog/category/categoryA etc My question is: if you add a robots.txt file that tells Google NOT to index pages in the "tag" and "category" folder, does that mean that the previously indexed pages will eventually disappear from Google's index? Or does it just mean that newly created pages won't get added to the index? Or does it mean nothing at all? thanks for any insight!
On-Page Optimization | | williammarlow0 -
Can we listed URL on Website sitemap page which are blocked by Robots.txt
Hi, I need your help here. I have a website, and few pages are created for country specific. (www.example.com/uk). I have blocked many country specific pages from Robots.txt file. It is advisable to listed those urls (blocked by robots.txt) on my website sitemap. (html sitemap page) I really appreciate your help. Thanks, Nilay
On-Page Optimization | | Internet-Marketing-Profs0 -
Disallow a spammed sub-page from robots.txt
Hi, I have a sub-page on my website with a lot of spam links pointing on it. I was wondering if Google will ignore that spam links on my site if i go and hide this page using the robots.txt Does that will get me out of Google's randar on that page or its useless?
On-Page Optimization | | Lakiscy0 -
Duplication About PDF Files on Website
Hello, My site's URL (web address) is: http://www.vostastores.com/ Above is the Our Website URL. We are in the process of Upgrading Our Website and for that we are adding all Details of each and every products. One of the thing that we are planning to do is to get Manufacturer's product PDF files on our Website which the manufacturer already have on their website. So our Question is that Since the manufacturer has the file on their website and we want to add the same on our website, Will be there any Duplication issue? If yes, then please provide us with a Solution by which we can add the same on our website. Thanks & Regards.
On-Page Optimization | | CommercePundit0 -
Robots.txt: excluding URL
Hi, spiders crawl some dynamic urls in my website (example: http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/ + http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/open=true) as different pages, resulting duplicate content of course. What is syntax for disallow these kind of urls in robots.txt? Thanks so much
On-Page Optimization | | anakyn0 -
Photogallery and Robots.txt
Hey everyone SEOMOZ is telling us that there are to many onpage links on the following page: http://www.surfcampinportugal.com/photos-of-the-camp/ Should we stop it from being indexed via Robots.txt? best regards and thanks in advance... Simon
On-Page Optimization | | Rapturecamps0