What can I do if Google Webmaster Tools doesn't recognize the robots.txt file?
-
I'm working on a recently hacked site for a client, and to identify exactly how the hack is running I need to use the Fetch as Googlebot feature in GWT.
I'd love to use it, but GWT thinks the robots.txt is blocking its access, even though the only thing in the robots.txt file is a link to the sitemap.
Under the Blocked URLs section of GWT it shows that the robots.txt was last downloaded yesterday, but the information it shows is incorrect. Is there a way to force Google to look again?
-
No, but they might write to it, modify it, or do all sorts of other nasty stuff I've seen hackers do when they get hold of any writable file on a system.
-
lol it's a robots text file. What are they going to do, steal it? I should have clarified: set it to 777 to make sure permissions are not your problem, then yes, change the permissions back to something tighter.
-
Eesh I don't recommend 777. 644 or, if you're going to change it right back, 755 at most.
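For what it's worth, here's a rough Python sketch of checking the file's mode and tightening it to 644 instead of opening it up. The function names are made up for illustration, and it assumes a POSIX system where you have permission to chmod the file.

```python
import os
import stat


def mode_of(path):
    """Return only the permission bits of a file, e.g. 0o644."""
    return stat.S_IMODE(os.stat(path).st_mode)


def is_world_writable(path):
    """True if any user on the system can write to the file (as with 777 or 666)."""
    return bool(mode_of(path) & stat.S_IWOTH)


def tighten(path, mode=0o644):
    """Set owner read/write and read-only for everyone else; return the new mode."""
    os.chmod(path, mode)
    return mode_of(path)
```

Something like `oct(mode_of("robots.txt"))` tells you where you stand before you change anything. The web server only needs read access to serve robots.txt, so 644 should be enough if permissions are the issue at all.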
-
File permissions, maybe? Change it to 777 and try again.
-
If you have shell access on Linux you can use wget or GET or run lynx.
If Google is getting the wrong robots file, then your web server must be sending out something other than what you think is the robots file.
What happens if you do this in your browser:
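If you'd rather check from a script than a browser, here's a minimal Python sketch that pulls robots.txt while sending Googlebot's user-agent string, then asks the standard library's parser whether a URL would be allowed. The function names and the example.com URLs are placeholders of mine, and Python's parser is not guaranteed to match Google's own rules in every case, so treat the result as a hint only.

```python
import urllib.error
import urllib.request
import urllib.robotparser

# Googlebot's published desktop user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_robots(site_root, user_agent=GOOGLEBOT_UA):
    """Return (status_code, body_text) for <site_root>/robots.txt."""
    url = site_root.rstrip("/") + "/robots.txt"
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry a body worth looking at.
        return error.code, error.read().decode("utf-8", "replace")


def is_allowed(robots_text, url, user_agent="Googlebot"):
    """Check a URL against robots.txt text using the stdlib parser."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `fetch_robots("http://www.example.com")` with your own domain shows the status code and the exact text the server hands out. If that text contains anything beyond the sitemap line, the file on disk is not what is being served.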
-
Looking in my log files, Google hits robots.txt just about every time it crawls our site.
What are you trying to accomplish using Fetch as Googlebot? Any chance curl could do the job for you, or another tool that ignores robots.txt?
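As one example of a tool that ignores robots.txt, here's a small Python sketch that requests a page twice, once with a browser user-agent and once with Googlebot's, and lists the lines that differ. I'm assuming the hack keys off the User-Agent header, which nothing in this thread confirms. If it checks Google's IP ranges instead, this won't reveal it. The browser string, function names, and example.com URL are placeholders.

```python
import difflib
import urllib.error
import urllib.request

# Googlebot's published desktop user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Placeholder browser string; any ordinary browser user-agent will do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"


def build_request(url, user_agent):
    """Build a GET request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as(url, user_agent):
    """Return the response body as text; robots.txt is never consulted."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=10) as response:
            return response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        return error.read().decode("utf-8", "replace")


def changed_lines(browser_html, bot_html):
    """Return only the added/removed lines between the two responses."""
    diff = difflib.unified_diff(
        browser_html.splitlines(), bot_html.splitlines(),
        "browser", "googlebot", lineterm="",
    )
    body = list(diff)[2:]  # skip the two file-header lines
    return [line for line in body if line.startswith(("+", "-"))]
```

Usage would be along the lines of `changed_lines(fetch_as(url, BROWSER_UA), fetch_as(url, GOOGLEBOT_UA))` with `url` set to a page on the hacked site. Dynamic pages can differ between any two requests, so expect some noise.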