Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Oh no, Googlebot cannot access my robots.txt file
-
I just received an error message from Google Webmaster Tools.
I wonder if it has something to do with the Yoast plugin.
Could somebody help me troubleshoot this?
Here's the original message:
Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.
Recommended action
If the site error rate is 100%:
- Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot.
- If your robots.txt is a static page, verify that your web service has proper permissions to access the file.
- If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure.
If the site error rate is less than 100%:
- Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors.
- The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website.
After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
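The first check in the message above (the file opens in a browser but Googlebot may be denied) can be roughly approximated by requesting robots.txt with two different User-Agent headers and comparing the status codes. This is only a minimal sketch: the helper names (`fetch_status`, `compare_access`, `diagnose`) and the short browser string are made up for illustration, and sending a Googlebot-style User-Agent from your own machine does not reproduce Google's IP addresses, so a firewall rule that blocks by IP would not show up here. Fetch as Google remains the authoritative test.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings. The Googlebot-style value only imitates the
# header; the request still comes from your own IP address, not from Google.
USER_AGENTS = {
    "browser": "Mozilla/5.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 403 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None


def compare_access(url, fetch=fetch_status):
    """Request the same URL once per User-Agent and collect the status codes."""
    return {name: fetch(url, agent) for name, agent in USER_AGENTS.items()}


def diagnose(results):
    """Turn the two status codes into a rough, human-readable hint."""
    browser, bot = results["browser"], results["googlebot"]
    if browser == 200 and bot == 200:
        return "reachable for both user agents"
    if browser == 200:
        return "browser OK, but the Googlebot user agent is blocked or failing"
    return "robots.txt is not reachable even with a browser user agent"
```

Calling `diagnose(compare_access("http://www.soobumimphotography.com/robots.txt"))` would then give a first hint about whether the server treats the two User-Agent values differently. The `fetch` parameter exists only so the comparison logic can be tried without a network connection.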
-
I can open the text file, but GoDaddy told me the robots.txt file is not on my server (root level).
They also told me that my site is not being crawled because the robots.txt file is not there.
Basically, all of this might have resulted from a plugin I was using (Term Optimizer).
Based on what GoDaddy told me, my .htaccess file was corrupted because of that and had to be recreated. So now the .htaccess file is good.
Now I have to figure out why my site is not accessible to Googlebot.
Let me know, Keith, if this is a quick fix or needs some time to troubleshoot. You can send me a message to discuss fees if necessary.
Thanks again
-
Hi,
You have a robots.txt file here: http://www.soobumimphotography.com/robots.txt
Can you write this again in English so it makes sense?
"I called Godaddy and told me if I used any plug ins etc. Godaddy fixed .htaccss file and my site was up and runningjust fine."
Yes, Google XML Sitemaps will add the location of your sitemap to the robots.txt file, but there is nothing wrong with your robots.txt file.
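To illustrate what "nothing wrong with your robots.txt file" can mean in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The `SAMPLE` content and the `summarize` helper are hypothetical, not the poster's actual file; the sketch only shows how a `Sitemap:` line and the crawl rules can be read back from a robots.txt body.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration, not the poster's real file.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: http://www.example.com/sitemap.xml
"""


def summarize(robots_text, url, agent="Googlebot"):
    """Return (is the URL crawlable for agent?, list of declared sitemaps)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() requires Python 3.8+ and returns None if no Sitemap line exists.
    return parser.can_fetch(agent, url), parser.site_maps()


allowed, sitemaps = summarize(SAMPLE, "http://www.example.com/")
print(allowed, sitemaps)
```

With this sample, the home page is crawlable for Googlebot while `/wp-admin/` is not, and the sitemap URL added by a sitemap plugin appears in the second return value. A file like this does not block Googlebot, so an "unreachable" error would point at the server or host rather than at the rules themselves.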
-
I just called GoDaddy, and they told me that I don't have a robots.txt file. Can anyone help with this issue?
So here's what happened:
I purchased Joost de Valk's Term Optimizer to consolidate tags etc.
As soon as I installed & opened it, my site crashed.
I called Godaddy and told me if I used any plug ins etc. Godaddy fixed .htaccss file and my site was up and runningjust fine.
Doesn't a plugin like Google XML Sitemaps automatically generate a robots.txt file?
-
Yes, my site was down.
-
I had an .htaccess issue with a plugin in the past 24 hours, and GoDaddy fixed it for me.
I think this caused the problem.
I just fetched again and am still getting an unreachable page. I wonder if I have a bad .htaccess file.
-
Was your site down during this period?
I would recommend setting up pingdom.com (free site monitoring); it will email you if your site goes down. I suspect this is a hosting-related issue.
FYI, I can access your robots.txt fine from here.
-
Hi Bistoss,
You should log into Google Webmaster Tools to check the day the problem occurred. It is not uncommon for hosts to have problems that temporarily block access. In some rare cases Google itself could be having problems. For example, in July we had one day with an 11% failure rate, and it was the host. Since then we have had no problems.
If your problems are persistent, then you may have an issue like this one (old Analytics code): http://blog.jitbit.com/2012/08/fixing-googlebot-cant-access-your-site.html
Another thing to look at is any recent changes, specifically anything that had to do with .htaccess. Be sure to use Fetch as Google after any changes to verify that Google can now crawl your site.
Hope this helps
-
I also use the Robots Meta Configuration plugin.