Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
520 Error from crawl report with Cloudflare
-
I am getting a lot of 520 Server Errors in crawl reports. I see this is related to Cloudflare. We know 520 is a Cloudflare code, so maybe the Moz team can change the label from "unknown" to "Cloudflare 520". Perhaps the Moz team can also update the "how to fix" section in the reporting with suggestions on how to avoid seeing these in the report, or on how to tell whether there is a real issue that needs to be addressed. At this point I don't know.
There must be a solution that Moz can provide, such as a setting in Cloudflare that permits Rogerbot if Cloudflare is blocking it because it does not like its behavior.
It could be that Rogerbot is crawling my site on a bad day or at a time when we were deploying a massive site change. If I know when my site will be down can I pause Rogerbot?
I found this https://developers.cloudflare.com/support/troubleshooting/general-troubleshooting/troubleshooting-crawl-errors/
-
A 520 error is a Cloudflare-specific HTTP status code. It indicates that the origin server returned an empty, unknown, or unexpected response to Cloudflare. This can happen for a variety of reasons, including:
Server downtime: The origin server might be down or undergoing maintenance.
Firewall restrictions: The origin server might have a firewall that is blocking requests from Cloudflare.
DNS issues: There might be a DNS misconfiguration that is preventing Cloudflare from resolving the origin server's IP address.
SSL issues: There might be an issue with the SSL certificate on the origin server.
To troubleshoot the issue, you can try the following:
Check if the origin server is up and running.
Check if the origin server has a firewall that is blocking requests from Cloudflare.
Check if the DNS is configured correctly.
Check if the SSL certificate is valid and configured correctly.
If none of these steps resolve the issue, you can reach out to Cloudflare support for further assistance.
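Cloudflare assigns separate 52x codes to several of the conditions listed above, so the exact code in a crawl report can narrow down where to look. Here is a minimal Python sketch of that lookup. The one-line summaries paraphrase Cloudflare's published descriptions, and the `describe_status` helper is a hypothetical name for illustration.

```python
# Cloudflare-specific 52x status codes with short paraphrased meanings.
CLOUDFLARE_5XX = {
    520: "Web server returned an unknown error (empty or unexpected response)",
    521: "Web server is down (origin refused the connection)",
    522: "Connection timed out (origin did not complete the connection)",
    523: "Origin is unreachable",
    524: "A timeout occurred (origin connected but responded too slowly)",
    525: "SSL handshake failed",
    526: "Invalid SSL certificate on the origin",
}


def describe_status(code: int) -> str:
    """Return a short description for a status code seen in a crawl report."""
    if code in CLOUDFLARE_5XX:
        return f"{code}: {CLOUDFLARE_5XX[code]}"
    if 500 <= code <= 599:
        return f"{code}: generic server error (not Cloudflare-specific)"
    return f"{code}: not a server error"


if __name__ == "__main__":
    for code in (520, 522, 503, 200):
        print(describe_status(code))
```

If the report shows only 520s, the origin's own error logs are usually the first place to check, because 520 means the origin answered in a way Cloudflare could not interpret.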
-
@awilliams_kingston To answer your question, there is no option to pause Rogerbot manually. However, Rogerbot only crawls a website when a Site Crawl campaign is active and scheduled to run. If you want to pause Rogerbot, you can stop the active campaign or schedule the next crawl to start at a later time.
To schedule a Site Crawl, go to your Moz Pro account, click on "Site Crawl" in the left-hand navigation menu, and select "Add Campaign" to set up a new campaign or select an existing one. From there, you can customize your crawl settings, including the crawl frequency and start time.
If you have a scheduled maintenance window and want to prevent Rogerbot from crawling your site during that time, you can adjust the crawl frequency to avoid overlapping with your maintenance schedule. You can also use a robots.txt file to block the crawler from accessing specific pages or sections of your site.
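As a rough illustration of the robots.txt approach, the sketch below uses Python's standard `urllib.robotparser` to check that a rule set blocks the crawler from one section while leaving the rest of the site open. It assumes `rogerbot` is the user-agent token the crawler matches on. The `/staging/` path and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep rogerbot out of /staging/, allow everything else.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /staging/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    print(parser.can_fetch("rogerbot", "https://example.com/staging/page"))  # False
    print(parser.can_fetch("rogerbot", "https://example.com/blog/"))         # True
```

To keep the crawler away from the whole site during a maintenance window, the rule would be `Disallow: /` under the same user-agent group. Remember to remove it afterwards.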
-
@awilliams_kingston The 520 server error you're seeing in your Moz crawl reports is related to Cloudflare. It's a generic error, which means it could be caused by a variety of issues, including server overload or misconfigured settings.
To address this, you could check your Cloudflare firewall settings and see if there are any rules that are blocking the Moz Rogerbot crawler. If there are, try adding an exception for the Rogerbot user agent to allow it to crawl your site without being blocked.
If you know your site will be down for maintenance or undergoing significant changes, you could pause the Moz crawler during that time to prevent it from generating false 520 errors in your reports.
Finally, you could check out the troubleshooting guide in the Cloudflare documentation for more information on identifying and addressing crawl errors. Remember to work with both Moz and Cloudflare support teams to find a solution that works for your specific setup.
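One quick way to test whether a user-agent rule is involved is to request the same URL with a browser-like user agent and with a crawler-like one, then compare the status codes. The sketch below is only a rough check, since Cloudflare can also decide based on IP address and request rate, which a local probe cannot reproduce. Both user-agent strings are placeholders, not Rogerbot's real header, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable

# Placeholder user-agent strings (assumptions; substitute the real values).
BROWSER_UA = "Mozilla/5.0 (compatible; diagnostic-check)"
CRAWLER_UA = "rogerbot"


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url when sent with the given user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses (including 520) arrive as HTTPError.
        return err.code


def compare_user_agents(
    url: str, fetch: Callable[[str, str], int] = fetch_status
) -> dict:
    """Fetch url with both user agents and report whether the statuses differ."""
    browser = fetch(url, BROWSER_UA)
    crawler = fetch(url, CRAWLER_UA)
    return {"browser": browser, "crawler": crawler, "differs": browser != crawler}


if __name__ == "__main__":
    print(compare_user_agents("https://example.com/"))
```

If the two status codes differ, a rule keyed on the user agent is a plausible cause. If they match, the block is more likely based on something else, or the errors come from the origin itself. Connection failures raise `urllib.error.URLError`, which this sketch leaves uncaught.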
-
@Kateparish Thank you.
How do you pause Rogerbot? I can't find anything on that in my admin panel, but maybe that is because no crawl is happening at the moment and my next crawl is scheduled for a few days from now. Also, is there a way to schedule a pause if a crawl is happening? For example, if I know I have site maintenance on a certain day of the week at a specific time, can I have Rogerbot take a break?
-
A 520 error typically indicates a connection error between Cloudflare and the origin server. This error occurs when the server returns an empty or invalid response to Cloudflare, or when the server takes too long to respond.
To troubleshoot a 520 error from a crawl report with Cloudflare, you can take the following steps:
Check the server logs: The first step in troubleshooting a 520 error is to check the server logs for any error messages. Look for any errors related to the server's network or connectivity, such as DNS resolution issues, network timeouts, or firewall restrictions.
Check Cloudflare logs: Cloudflare logs can provide additional insights into the cause of the error. Check the Cloudflare logs for any error messages or connection issues between Cloudflare and the origin server.
Temporarily disable Cloudflare: Temporarily disabling Cloudflare can help you determine if the error is caused by Cloudflare or the origin server. If the error disappears when Cloudflare is disabled, then the issue is likely with Cloudflare.
Contact Cloudflare support: If you are unable to resolve the issue on your own, you can contact Cloudflare support for assistance. Provide them with the server logs and Cloudflare logs, as well as any other relevant information, to help them diagnose the issue.
By following these steps, you should be able to identify and resolve the 520 error from the crawl report with Cloudflare.
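The log check in the first step can be sketched in Python as follows, assuming the origin writes access logs in the common "combined" format. The sample lines, the `rogerbot/1.0` user-agent string, and the helper name are invented for illustration.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields of a
# "combined" format access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_statuses(lines, agent_substring="rogerbot"):
    """Count response status codes for requests whose user agent matches."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and agent_substring in match.group("agent").lower():
            counts[int(match.group("status"))] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "rogerbot/1.0"',
    '203.0.113.7 - - [01/Jan/2024:10:00:05 +0000] "GET /about HTTP/1.1" 500 0 "-" "rogerbot/1.0"',
    '198.51.100.9 - - [01/Jan/2024:10:00:06 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(count_statuses(SAMPLE))
```

Because Cloudflare generates the 520 itself, the origin log will not show that code. What helps is what the origin recorded at the same timestamps: an error status, or no entry at all for a request the crawl report lists.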