Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Our crawler was not able to access the robots.txt file on your site
-
Hello Mozzers!
I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt. I've spoken to the webmaster and he can't understand why the robot.txt can't be accessed in Moz.
https://www.thefurnshop.co.uk/robots.txt
and Google isn't flagging anything up to us.
Does anyone know how to solve this problem?
Thanks
-
@LoganRay This was our issue. Didn't know Moz tries to retrieve the HTTP robots.txt first. Our HTTPS redirect was not working on static files only, so the HTTP path to the robots.txt was failing. We did not notice it because the HSTS policy was forcing the browser to redirect.
-
Wanted to jump back in on this topic as I've just confirmed my initial suspicion.
I just added a new client to our Moz account and had the exact same issue, crawler unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects to the same thing as yours where the / between the TLD and page path gets removed.
Reconfigure your site and it should begin to work.
-
There are 2 parts of your robots.txt that could be causing this, and it all just depends on how each bot is reading regular expressions in your robots.txt:
First, your Disallow: /? can be read as Disallow all paths starting with "/" with 0 to infinity characters "" and one character "?". Try replacing this part with Disallow: /*? to make it not crawl anything with a query string (which is what I believe you were going for).
Second, you have a open Disallow followed by the User-agent: rogerbot and while this should not be read this way, once again it all depends on how each bot reads the commands. To fix this you should change your Disallow following your Googlebot-Image as Disallow: /
-
Hi there,
There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted, see below. I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved How can I improve SEO for my auditing firm’s website?
Hi Moz community! I run aauditing.com, offering audit, tax consulting, and accounting services in the UAE. I’m looking for effective SEO strategies to boost visibility and attract clients. Any tips on content optimization, backlinks, or local SEO would be appreciated!
Getting Started | | HFM972 -
Unsolved Crawler was not able to access the robots.txt
I'm trying to setup a campaign for jessicamoraninteriors.com and I keep getting messages that Moz can't crawl the site because it can't access the robots.txt. Not sure why, other crawlers don't seem to have a problem and I can access the robots.txt file from my browser. For some additional info, it's a SquareSpace site and my DNS is handled through Cloudflare. Here's the contents of my robots.txt file: # Squarespace Robots Txt User-agent: GPTBot User-agent: ChatGPT-User User-agent: CCBot User-agent: anthropic-ai User-agent: Google-Extended User-agent: FacebookBot User-agent: Claude-Web User-agent: cohere-ai User-agent: PerplexityBot User-agent: Applebot-Extended User-agent: AdsBot-Google User-agent: AdsBot-Google-Mobile User-agent: AdsBot-Google-Mobile-Apps User-agent: * Disallow: /config Disallow: /search Disallow: /account$ Disallow: /account/ Disallow: /commerce/digital-download/ Disallow: /api/ Allow: /api/ui-extensions/ Disallow: /static/ Disallow:/*?author=* Disallow:/*&author=* Disallow:/*?tag=* Disallow:/*&tag=* Disallow:/*?month=* Disallow:/*&month=* Disallow:/*?view=* Disallow:/*&view=* Disallow:/*?format=json Disallow:/*&format=json Disallow:/*?format=page-context Disallow:/*&format=page-context Disallow:/*?format=main-content Disallow:/*&format=main-content Disallow:/*?format=json-pretty Disallow:/*&format=json-pretty Disallow:/*?format=ical Disallow:/*&format=ical Disallow:/*?reversePaginate=* Disallow:/*&reversePaginate=* Any ideas?
Getting Started | | andrewrench0 -
Unsolved Website Traffic
Greetings All. I'm working on a new business website for a client, and I've been accessing the site numerous times daily (troubleshooting, confirming changes, etc.). I've been using Google search to access the site, and I use a VPN so that my IP would be random. So I would presume that the site traffic should be increasing. But on the last Moz Pro crawl, the site traffic was still listed as 0.
Getting Started | | depawl52
Is there a minimum amount of traffic required before Moz recognizes it or is something else going on?
Thank you.0 -
Can Moz Monitor a JS Site?
We are building a new site that, on the blog landing page, uses JS to populate the individual blog article links on the page. The links are not viewable in the page source, but do appear once the page fully renders. After running an on-demand crawl of the site (in QA...not indexed yet), it appears that Moz isn't indexing these pages, and it also isn't reading other page elements that load later (like an H1 that is rendered in JS but not in page source). Are we going to be able to use Moz to track this site? Is there some setting to help?
Getting Started | | Rodrigo-DC0 -
Moz can't crawl my site.
Moz cannot carry out the site crawl on my online shop. Not really sure what the issue is, it has no problem getting onto my site when you use www. before the address, but it needs to be able to access bluerinsevintage.co.uk Stuck as what to do, we are a shopify store. Anyone else had this problem, or know what i need to change so they can crawl the site? thjis is the page they are getting when trying to get on bluerinsevintage.co.uk but if they use www.bluerinsevintage.co.uk the site comes up. Adam
Getting Started | | bluerinsevintage0 -
How long does it usually take Moz to populate information for a new Web site?
We recently launched (9/13/2013) an e-commerce Website and added the campaign to SEO MOZ. Week after week the Domain Rank is 1 and none of our keyword stats or link stats are populated. We have another Moz campaign that posts weekly updates and is doing extremely well. I'm just wondering how long it usually takes Moz to start populating all the analysis stats? I'm also wondering if there might be a campaign setting buried somewhere that I need to enable or maybe it just takes more than 5 weeks? Any insights would be much appreciated. Here's the new URL we need to track with MOZ: http://www.imsportshq.com
Getting Started | | Tripper0