What can I do if Google Webmaster Tools doesn't recognize the robots.txt file?
-
I'm working on a recently hacked site for a client and and in trying to identify how exactly the hack is running I need to use the fetch as Google bot feature in GWT.
I'd love to use this but it thinks the robots.txt is blocking it's acces but the only thing in the robots.txt file is a link to the sitemap.
Unde the Blocked URLs section of the GWT it shows that the robots.txt was last downloaded yesterday but it's incorrect information. Is there a way to force Google to look again?
-
No, but they might write to it, modify it, or do all sorts of other nasty stuff I've seen hackers do when they get a hold of any writeable file on a system.
-
lol it's a robots text file. what are they going to do. Steal it?
I should have clarified do a 777 to make sure that is not your problem, then yes change the permission to be tighter
-
Eesh I don't recommend 777. 644 or, if you're going to change it right back, 755 at most.
-
File permission maybe? Change it to 777 and try it again
-
If you have shell access on Linux you can use wget or GET or run lynx.
If google is getting the wrong robots file then your web server must be sending out something other than what you think is the robots file.
What happens if you do this in your browser:
-
Looking in my log files, Google hits robots.txt just about every time it crawls our site.
What are you trying to accomplish using fetch as Googlebot? Any chance CURL could do the job for you, or another tool that ignores robots.txt?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Weird 404 errors in Webmaster Tools
Hi, In a regular check with Webmaster Tools, I have noticed some weird 404 errors, for example, my domain URL is something like http://domainname.com/, the 404 error points to some weird URLs like http://domainname.com/james-bond&page=2/ and http://domainname.com/juegos-de&page=3/, at first I have tried to block them by robots.txt, but now I am getting these kind of 404 errors a lot, and don't think blocking them all is a perfect solution. Can anyone help me out with the issue? Thank you in advance.
Technical SEO | | nishthaj
cheers.0 -
How long does it take for Webmaster Tools to index a site?
I submitted my client's site about a week ago. It had 138 links, it's still at 43 links. Should it be taking that long to index? Thanks! Luciana
Technical SEO | | Luciana_BAH1 -
Does Google add parameters to the URL parameters in webmaster tools/
I am seeing new parameters added (and sometimes removed) from the URL Parameter tool. Is there anything that would add parameters to the tool? Or does it have to be someone internally? FYI - They always have no date in the configured column, no effect set, and crawl is set to Let Google decide.
Technical SEO | | merch_zzounds0 -
Weird Cigarette URLs showing up in Google Webmaster Tools
Hi there, I'm noticing a bunch of URLs showing up in my google webmaster tools that are all cigarette related (they are appearing as 404s in the crawl error report). They are throwing 404 errors which is why they are listed here... Anyone have any idea of what this could be? I recently switched from Wordpress to Shopify and these weird URLs just started appearing on my webmaster tools in the last week. Kinda bizarre / a little alarming! Thanks,
Technical SEO | | TheBatesMillStore
Bianca0 -
Has Google Stopped Listing URLs with Crawl Errors in Webmaster Tools?
I went to Google Webmaster Tools this morning and found that one of my clients had 11 crawl errors. However, Webmaster Tools is not showing which URLs are having experiencing the errors, which it used to do. (I checked several other clients that I manage and they list crawl errors without showing the specific URLs. Does anyone know how I can find out which URLs are experiencing problems? (I checked with Bing Webmaster Tools and the number of errors are different).
Technical SEO | | TopFloor0 -
Weird 404 Errors in Webmaster Tools
Hi, In a regular check with Webmaster Tools, I have noticed a sudden increase in the number of "not found-404" errors. So I have been looking at them and noticed something weird has been going on. There are well over 100 pages with 404-errors. The funny thing is, none of the ULR's are correct, For example, if the actual url is something like www.domain.com/latest-reviews , the 404-error points to a non-existent URL like www.domain.com/latest-re And when I checked where they were linked from, they are all from these spammy sites. Anyone know what could be causing these links, why would anyone link on purpose to a non-existent page? cheers,
Technical SEO | | Gamer070 -
How to remove crawl errors in google webmaster tools
In my webmaster tools account it says that I have almost 8000 crawl errors. Most of which are http 403 errors The urls are http://legendzelda.net/forums/index.php?app=members§ion=friends&module=profile&do=remove&member_id=224 http://legendzelda.net/forums/index.php?app=core&module=attach§ion=attach&attach_rel_module=post&attach_id=166 And similar urls. I recently blocked crawl access to my members folder to remove duplicate errors but not sure how i can block access to these kinds of urls since its not really a folder thing. Any idea on how to?
Technical SEO | | NoahGlaser780 -
Robots.txt File Redirects to Home Page
I've been doing some site analysis for a new SEO client and it has been brought to my attention that their robots.txt file redirects to their homepage. I was wondering: Is there a benfit to setup your robots.txt file to do this? Will this effect how their site will get indexed? Thanks for your response! Kyle Site URL: http://www.radisphere.net/
Technical SEO | | kchandler0