Hi Zippy-Bungle,
To understand first why the 803 error was reported:
When a page is called, the web server sends header details of what's to be displayed. You can see a complete list of these HTTP header fields here.
One of the headers sent by the web server is Content-length, which indicates how many bytes the rest of the page is going to send. So let's say for example that content length is 100 bytes but the server only sends 74 bytes (it may be valid HTML, but the length does not match the content length indicated)
Since the web server only sent 74 bytes and the crawler expected 100 bytes the crawler sees a TCP close port error because it is trying to read the number of bytes that the webserver said it was going to send. So you get an 803 error.
Now browsers don't care when a mismatch like this happens because Content-length is an outdated component for modern browsers, but Roger Mozbot (the Moz crawler, identified in your logs as RogerBot) is on a mission to show you any errors that might be occurring. So Roger is configured to detect and report such errors.
The degree to which an 803 error will adversely affect crawl efficiency for search engine bots such as Googlebot, Bingbot and others will vary, but the fundamental problem with all 8xx errors is that they result from violations of the underlying HTTP or HTTPS protocol. The crawler expects all responses it receives to conform to the HTTP protocol and will typically throw an exception when encountering a protocol-violating response.
In the same way that 1xx and 2xx errors generally indicate a badly-misconfigured site, fixing them should be a priority to ensure that the site can be crawled effectively. It is worth noting here that bingbot is well known for being highly sensitive to technical errors.
So what makes the mismatch happen?
The problem could be originating from the website itself (page code), the server, or the web server. There are two broad sources:
- Crappy code
- Buggy server
I'm afraid you will need to get a tech who understands this type of problem to work through each of these possibilities to isolate and resolve the root cause.
The Moz Resource Guide on HTTP Errors in Crawl Reports is also worth a read in case Roger encounters any other infrequently seen errors.
Hope that helps,
Sha