Googlebot Can't Access My Sites After I Repair My Robots File

NiallSmith

Hello Mozzers,

A colleague and I have been managing about 12 brands between us for the past several months, and we have recently received a number of messages in the sites' Webmaster Tools accounts telling us that 'Googlebot was not able to access our site due to some errors with our robots.txt file'.

My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows:

User-agent: *

Disallow: /cgi-bin/

After creating the robots.txt file and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site. The only difference was that this time it was for a different site that I manage.

I repeated the process and everything looked correct on the surface. However, I kept receiving these messages daily, one for each of the other sites I manage, for roughly 10 days.

Do any of you know why I may be receiving this error? Is it not possible for me to block Googlebot from crawling the 'cgi-bin' directory?

Any and all advice or insight is very welcome. I hope I'm being descriptive enough!

Igal_Zeifman

Oleg gave a great answer.

Still I would add 2 things here:

1. Go to GWMT and under "Health" do a "Fetch as Googlebot" test.
This will tell you what pages are reachable.

2. I've seen some cases of server-level Googlebot blocking.
If your robots.txt is fine and your pages contain no "noindex" tags, and yet you are still getting an error message while fetching, you should get hold of your access logs and check them for Googlebot user agents to see if (and when) you were last visited.

This will help you pinpoint the issue when talking to your hosting provider (or third-party security vendor).

If unsure, you can find Googlebot information (user agent and IPs) at Botopedia.org.
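To make the log check concrete, here is a minimal Python sketch. It assumes the server writes the common Apache/Nginx "combined" log format; the sample lines, IP addresses, and the helper name `googlebot_hits` are made up for illustration and are not taken from anyone's real logs. It only matches on the user-agent string, which can be spoofed, so treat the result as a starting point rather than proof.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# ip - - [timestamp] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Return (time, request, status) for lines whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits.append(
                (match.group("time"), match.group("request"), int(match.group("status")))
            )
    return hits


# Hypothetical sample lines; replace with lines read from your own access log.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 503 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2012:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [10/Oct/2012:14:02:10 +0000] "GET /products HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = googlebot_hits(SAMPLE_LOG)
status_counts = Counter(status for _, _, status in hits)

for time, request, status in hits:
    print(time, request, status)
print("Status codes served to Googlebot:", dict(status_counts))
```

Error statuses (5xx) or no hits at all on /robots.txt around the time of the warning would be worth raising with the host.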

Webrevolve

A great answer

OlegKorneitchouk

Maybe the spacing got thrown off when you posted it here, but blank lines can affect robots.txt files. Try this:

User-agent: *
Disallow: /cgi-bin/
#End Robots#

Also, check for robot blocking meta tags on the individual pages.
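If there are a lot of pages to check, something like the following Python sketch could flag blocking meta tags. It uses only the standard library; the class name `RobotsMetaFinder`, the helper `blocking_directives`, and the sample page are hypothetical and only there for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # content is a comma-separated list such as "noindex, follow"
            for part in attr.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.append(part)


def blocking_directives(html):
    """Return the directives on the page that restrict indexing or link following."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return [d for d in finder.directives if d in ("noindex", "nofollow", "none")]


# Hypothetical page source; in practice, feed in the HTML of each page you want to check.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<meta name="robots" content="noindex, follow">
</head><body>Hello</body></html>"""

print(blocking_directives(SAMPLE_PAGE))
```

An empty list means no blocking meta tag was found in the HTML; it says nothing about HTTP headers or robots.txt.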

You can test whether Google can access specific pages through GWT > Health > Blocked URLs. You should see your robots.txt file contents in the top text area; enter the URLs to test in the second text area, then press "Test" at the bottom. The test results will appear at the bottom of the page.
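For a quick offline check of the same rules, Python's standard `urllib.robotparser` can be used as a rough stand-in. This is only a sketch: it shows how Python's parser reads the file, which is not guaranteed to match Googlebot's own parsing, and the example.com URLs and the helper name `is_allowed` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The compact version of the file, with no blank line between the two directives.
COMPACT = """\
User-agent: *
Disallow: /cgi-bin/
"""

# The version as originally posted, with a blank line between the directives.
SPACED = """\
User-agent: *

Disallow: /cgi-bin/
"""


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder URLs; substitute pages from your own site.
URLS = (
    "http://www.example.com/",
    "http://www.example.com/cgi-bin/form.pl",
)

for label, text in (("compact", COMPACT), ("spaced", SPACED)):
    for url in URLS:
        verdict = "allowed" if is_allowed(text, url) else "blocked"
        print(label, url, verdict)
```

Running both variants side by side is a cheap way to see whether the blank line changes how a given parser reads the rules, since parsers differ in how they treat blank lines.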
