5xx crawl issues might not be issues at all. Help
-
Hi,
I ran a crawl test on our website and it came back with 900 potential 5xx errors. When I started opening these links one by one, I could see they were actually working. So I exported the full list of 900, went to https://httpstatus.io/, pasted the links in 100 at a time, and checked them there. They came back with status codes of 301 / 301 / 200, which I believe means they are okay.
From what I read, my programmer may need to check whether we are blocking the MOZ BOT or to slow the MOZ BOT down. I guess I'm wondering: if this is not done, is the site actually returning these 5xx errors when Google is crawling, or is it just showing 900 errors because of the MOZ BOT while things are actually okay?
I know the simple answer is to get the programmer to fix the MOZ BOT issue to know for sure, but getting programmers to do things takes a lot of time, so I'm trying to get a better idea here.
Thanks for your input.
-
Hi there!
Thanks so much for the great question! I'm so sorry to hear you're having this trouble with the 5xx errors. To resolve this we'd recommend adding a crawl delay for rogerbot to your robots.txt file. That crawl delay would look something like this:
User-agent: rogerbot
Crawl-delay: 10
This will tell our crawler to slow down when it's crawling. We do not recommend using a crawl delay of longer than 10 as this can keep the crawl from completing.
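As a quick sanity check before deploying, the rule above can be parsed locally to confirm it reads the way you expect. This is only a minimal sketch using Python's standard-library `urllib.robotparser`; it shows how one generic parser interprets the two lines and is not a statement about how rogerbot itself parses robots.txt. The `Googlebot` lookup is included only as an assumed example of a crawler that the rule should not affect.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the answer above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: rogerbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay (in seconds) that applies to rogerbot under this file.
rogerbot_delay = parser.crawl_delay("rogerbot")

# A crawler with no matching group and no "*" group gets no delay (None).
other_delay = parser.crawl_delay("Googlebot")

print("rogerbot crawl delay:", rogerbot_delay)
print("Googlebot crawl delay:", other_delay)
```

If the first value is not 10, the most likely cause is a formatting problem in the file, such as the two directives ending up on one line.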
As far as whether this is impacting Google's ability to crawl, I'm really not able to help identify that. I'm so sorry about that! The best suggestion I can make would be to check the server logs for your site to see how it is responding to other crawlers you may be concerned about.
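One way to act on the server-log suggestion is to count 5xx responses per crawler. The sketch below assumes the logs use the common Apache/Nginx "combined" format; the sample lines, the IP addresses, and the rogerbot user-agent string are made up for illustration, and the function name `count_5xx_by_crawler` is hypothetical. Adjust the pattern if your server logs in a different format.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Crawler names to look for inside the user-agent string (case-insensitive).
CRAWLERS = ("rogerbot", "Googlebot")


def count_5xx_by_crawler(lines, crawlers=CRAWLERS):
    """Count 5xx responses served to each named crawler."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if not match.group("status").startswith("5"):
            continue  # only server errors are of interest here
        agent = match.group("agent").lower()
        for crawler in crawlers:
            if crawler.lower() in agent:
                counts[crawler] += 1
    return counts


# Hypothetical sample data; in practice, iterate over your access log file.
sample_log = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /a HTTP/1.1" 503 0 "-" "rogerbot/1.0 (illustrative)"',
    '203.0.113.5 - - [01/Jan/2024:10:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "rogerbot/1.0 (illustrative)"',
    '198.51.100.7 - - [01/Jan/2024:10:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [01/Jan/2024:10:00:03 +0000] "GET /c HTTP/1.1" 500 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

errors = count_5xx_by_crawler(sample_log)
print(dict(errors))
```

If the count for Googlebot stays at zero on your real logs while rogerbot shows many 5xx responses, that would suggest the errors are specific to how the server handles the Moz crawler's request rate.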
If you have any other questions about rogerbot or our tools, please feel free to send an email to help@moz.com.