Our crawler was not able to access the robots.txt file on your site
-
Hello Mozzers!
I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt. I've spoken to the webmaster and he can't understand why Moz can't access the robots.txt.
The file is at https://www.thefurnshop.co.uk/robots.txt and Google isn't flagging anything up to us.
Does anyone know how to solve this problem?
Thanks
-
@LoganRay This was our issue. We didn't know Moz tries to retrieve the robots.txt over HTTP first. Our HTTP-to-HTTPS redirect was broken for static files only, so the HTTP path to the robots.txt was failing. We did not notice because our HSTS policy was forcing the browser onto HTTPS by itself.
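Because HSTS makes a browser switch to HTTPS on its own, this kind of failure is easier to see from a client that keeps no HSTS state. Below is a minimal Python sketch of such a check, not anything Moz provides: it sends one plain request and reports the status and Location header without following redirects. The function names and the User-Agent string are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit


def fetch_once(url, timeout=10):
    """Send one GET and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Hypothetical User-Agent; a real crawler sends its own.
        conn.request("GET", parts.path or "/", headers={"User-Agent": "robots-check-sketch"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Summarise what a crawler that starts on plain HTTP would see."""
    if status == 200:
        return "served directly"
    if status in (301, 302, 303, 307, 308) and location:
        return "redirects to " + location
    return "fails with HTTP " + str(status)
```

Calling describe(*fetch_once("http://www.example.com/robots.txt")) with your own domain in place of the placeholder shows whether the HTTP version of the file is served, redirected, or failing.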
-
Wanted to jump back in on this topic as I've just confirmed my initial suspicion.
I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects the same way yours does, with the / between the TLD and the page path removed.
Reconfigure your site in Moz and it should begin to work.
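If you want to confirm that a redirect is losing the slash, one rough way is to compare the path you requested with the path in the redirect's Location header. The sketch below assumes you already have the Location value (from your browser's network tab or a command-line client). The function name and the example.co.uk URLs are placeholders, not the real redirect from this thread.

```python
from urllib.parse import urljoin, urlsplit


def redirect_keeps_path(requested_url, location):
    """Return True if the redirect target has the same path as the requested URL."""
    # Location may be relative, so resolve it against the requested URL first.
    target = urljoin(requested_url, location)
    return urlsplit(target).path == urlsplit(requested_url).path


# A healthy redirect only changes the scheme and host.
print(redirect_keeps_path("http://example.co.uk/robots.txt",
                          "https://www.example.co.uk/robots.txt"))

# With the slash dropped, "robots.txt" is glued onto the hostname and the path is empty.
print(redirect_keeps_path("http://example.co.uk/robots.txt",
                          "https://www.example.co.ukrobots.txt"))
```

The first call prints True and the second prints False, because in the slash-less target the file name has become part of the host name.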
-
There are two parts of your robots.txt that could be causing this, and it all depends on how each bot reads the wildcard patterns in it:
First, your Disallow: /? can be read as "disallow all paths starting with "/", then 0 to infinity characters ("*"), then one character ("?")". Try replacing this part with Disallow: /*? so that nothing with a query string gets crawled (which is what I believe you were going for).
Second, you have an open Disallow followed directly by User-agent: rogerbot. While it should not be read this way, once again it all depends on how each bot reads the commands. To fix this, change the Disallow that follows your Googlebot-Image line to Disallow: /
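To make both points concrete, here is a small Python sketch. It assumes the wildcard rules from the Robots Exclusion Protocol (RFC 9309), where "*" matches any run of characters, a trailing "$" anchors the end, and "?" is an ordinary character. Individual bots may read things differently, and the sample rules are invented because the real file isn't quoted in this thread. rule_matches is a hypothetical helper, while RobotFileParser is Python's standard-library parser, which only does prefix matching.

```python
import re
from urllib.robotparser import RobotFileParser


def rule_matches(pattern, path):
    """Match a robots.txt path pattern using RFC 9309 wildcard rules."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # "*" becomes ".*"; every other character, including "?", stays literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# "/?" only covers URLs whose path starts with "/?"; "/*?" covers any query string.
print(rule_matches("/?", "/sofas?sort=price"))
print(rule_matches("/*?", "/sofas?sort=price"))

# Invented sample: an empty Disallow immediately followed by the rogerbot group.
SAMPLE = """\
User-agent: Googlebot-Image
Disallow:
User-agent: rogerbot
Disallow: /basket
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# The standard-library parser starts a new group at each User-agent line that follows a rule.
print(parser.can_fetch("Googlebot-Image", "/images/sofa.jpg"))
print(parser.can_fetch("rogerbot", "/basket/view"))
print(parser.can_fetch("rogerbot", "/robots.txt"))
```

This prints False, True, True, False, True. Under these assumptions the two groups are kept apart and rogerbot is only blocked from the sample /basket path, which is the "should not be read this way" case; a bot with a looser parser could still behave differently.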
-
Hi there,
There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted. I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.