Googlebot Can't Access My Sites After I Repair My Robots File
-
Hello Mozzers,
A colleague and I have been collectively managing about 12 brands for the past several months and we have recently received a number of messages in the sites' webmaster tools instructing us that 'Googlebot was not able to access our site due to some errors with our robots.txt file'
My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows:
User-agent: *
Disallow: /cgi-bin/
After creating the robots and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site, only difference being that this time it was for a different site that I manage.
I repeated the process and everything, aesthetically looked correct, however, I continued receiving these messages for each of the other sites I manage on a daily-basis for roughly a 10-day period.
Do any of you know why I may be receiving this error? is it not possible for me to block the Googlebot from crawling the 'cgi-bin'?
Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!
-
Oleg gave a great answer.
Still I would add 2 things here:
1. Go to GWMT and under "Health" do a "Fetch as Googlebot" test.
This will tell you what pages are reachable.2. I`ve saw some occasions of server-level Googlebot blockage.
If your robots.txt is fine and your page contains no "no-index" tags, and yet you still getting an error message while fetching, you should get a hold on your access logs and check it for Googlebot user-agents to see if (and when) you were last visited.This will help you pin-point the issue, when talking to your hosting provider (or 3rd party security vendor).
If unsure, you can find Googlebot information (user agent and IPs ) at Botopedia.org.
-
A great answer
-
Maybe the spacing is off when you posted it here, but blank lines can affect robots.txt files. Try code:
User-agent: *
Disallow: /cgi-bin/
#End Robots#Also, check for robot blocking meta tags on the individual pages.
You can test to see if Google can access specific pages through GWT > Health > Blocked URLs (should see your robots.txt file contents int he top text area, enter the urls to test in the 2nd text area, then press "Test" at the bottom - test results will appear at the bottom of the page)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How necessary is it to disavow links in 2017? Doesn't Google's algorithm take care of determining what it will count or not?
Hi All, So this is a obvious question now. We can see sudden fall or rise of rankings; heavy fluctuations. New backlinks are contributing enough. Google claims it'll take care of any low quality backlinks without passing pagerank to website. Other end we can many scenarios where websites improved ranking and out of penalty using disavow tool. Google's statement and Disavow tool, both are opposite concepts. So when some unknown low quality backlinks are pointing and been increasing to a website? What's the ideal measure to be taken?
Intermediate & Advanced SEO | | vtmoz0 -
Weird behavior with site's rankings
I have a problem with my site's rankings.
Intermediate & Advanced SEO | | Mcurius
I rank for higher difficulty (but lower search volume) keywords , but my site gets pushed back for lower difficulty, higher volume keywords, which literally pisses me off. I thought very seriously to start new with a new domain name, cause what ever i do seems that is not working. I will admit that in past (2-3 years ago) i used some of those "seo packages" i had found, but those links which were like no more than 50, are all deleted now, and the domains are disavowed.
The only thing i can think of, is that some how my site got flagged as suspicious or something like that in google. Like 1 month ago, i wrote an article about a topic related with my niche, around a keyword that has difficulty 41%. The search term in 1st page has high authority domains, including a wikipedia page, and i currently rank in the 3rd place. In the other had, i would expect to rank easily for a keyword difficulty of 30-35% but is happening the exact opposite.The pages i try to rank, are not spammy, are checked with moz tools, and also with canirank spam filters. All is good and green. Plus the content of those pages i try to rank have a Content Relevancy Score which varies from 98% to 100%... Your opinion would be very helpful, thank you.0 -
My site is always in the top 4 on google, and sometimes goes to #2\. But the site at #1 is always at #1 .. how can i beat them?
So i'm sure this is a very generic question.. of course everyone wants to be #1. We are an ecommerce web site. We have all sorts of products, user ratings, and are loved by our customers. We sell over 3 million a year. So let me give you some data.. First of all one of the sites that keeps taking the #2 or #3 spot is amazons category for what we sell.. (i'm not sure if I should say who we are here.. as I don't want the #1 spot to realize we are trying to take them over!) Amazon of course has a domain authority of 100. But they never take the #1 spot. The other site that takes the #2 and #3 spot is not even selling anything. Happens to be a technical term's with the same name wikipedia page! (i wish google would figure out people aren't looking for that!) Anyways.. every day we bouce back and forth between #4 and #2.. but #1 never changes.. Here are the stats of us verse #1 from moz: #1: Page Authority: 56.8, Root Domains Linking to page: 158, Domain Authority: 54.6: root domains linking to the root domain 1.42k my site: Page Authority: 60.6, Root domains linking to the page: 562, Domain Authority: 52.8: root domains linking to the root domain: 1.03k So they beat us in domain authority SLIGHTLY and in root domains linking to the root domain. So SEO masters.. what do I do to fix this? Get better backlinks? But how.... I can't just email GQ and ask them to write about us can I? I'm open to all things.. Maybe i'm not using moz data correctly.. We should at least be #2. We get #2 every other day.
Intermediate & Advanced SEO | | 88mph0 -
Site Structure: How do I deal with a great user experience that's not the best for Google's spiders?
We have ~3,000 photos that have all been tagged. We have a wonderful AJAXy interface for users where they can toggle all of these tags to find the exact set of photos they're looking for very quickly. We've also optimized a site structure for Google's benefit that gives each category a page. Each category page links to applicable album pages. Each album page links to individual photo pages. All pages have a good chunk of unique text. Now, for Google, the domain.com/photos index page should be a directory of sorts that links to each category page. Alternatively, the user would probably prefer the AJAXy interface. What is the best way to execute this?
Intermediate & Advanced SEO | | tatermarketing0 -
Can we retrieve all 404 pages of my site?
Hi, Can we retrieve all 404 pages of my site? is there any syntax i can use in Google search to list just pages that give 404? Tool/Site that can scan all pages in Google Index and give me this report. Thanks
Intermediate & Advanced SEO | | mtthompsons0 -
How can Google index a page that it can't crawl completely?
I recently posted a question regarding a product page that appeared to have no content. [http://www.seomoz.org/q/why-is-ose-showing-now-data-for-this-url] What puzzles me is that this page got indexed anyway. Was it indexed based on Google knowing that there was once content on the page? Was it indexed based on the trust level of our root domain? What are your thoughts? I'm asking not only because I don't know the answer, but because I know the argument is going to be made that if Google indexed the page then it must have been crawlable...therefore we didn't really have a crawlability problem. Why Google index a page it can't crawl?
Intermediate & Advanced SEO | | danatanseo0 -
Site has no SEO done on it. It wasn't considered during design. What to do first ?
They opted for videos to explain to people what the website is about, but it ain't working for them. What steps would you take in order to get this site to rank higher without completely changing the design(changing design is out of the question they are low on funds). They also built a blog on wordpress.com and added a .me domain to it. For obvious reasons I'm not mentioning the website.
Intermediate & Advanced SEO | | ternit0