Crawler triggering Spam Throttle and creating 4xx errors
-
Hey Folks,
We have a client with an experience I want to ask about.
The Moz crawler is reporting 4xx errors on their site. These are happening because the crawler is triggering my client's spam throttling. They could raise the throttle from 240 to 480 page loads per minute, but that could open the door to spam as well.
Any thoughts on how to proceed?
Thanks! Kirk
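For context on the numbers in the question, a per-client throttle that grants a higher budget to known crawlers might be sketched like this. This is a hypothetical illustration, not any particular firewall product; the limits, class names, and user-agent substrings are all assumptions.

```python
import time
from collections import defaultdict

DEFAULT_LIMIT = 240      # page loads per minute for unknown clients (assumed)
CRAWLER_LIMIT = 480      # higher budget for known crawlers (assumed)
KNOWN_CRAWLERS = ("rogerbot", "googlebot")  # illustrative substrings

class Throttle:
    """Sliding one-minute window throttle keyed by client IP."""

    def __init__(self):
        self.windows = defaultdict(list)  # client IP -> request timestamps

    def allow(self, client_ip, user_agent, now=None):
        now = time.time() if now is None else now
        ua = user_agent.lower()
        # Known crawlers get the larger per-minute budget.
        limit = CRAWLER_LIMIT if any(c in ua for c in KNOWN_CRAWLERS) else DEFAULT_LIMIT
        window = self.windows[client_ip]
        # Drop timestamps older than 60 seconds.
        window[:] = [t for t in window if now - t < 60]
        if len(window) >= limit:
            return False
        window.append(now)
        return True
```

Note that matching on the user-agent string alone is spoofable; a real deployment would verify crawler identity another way (for example, reverse DNS) before granting the larger budget.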
-
Thank you Dave!
-
Hey Kirk! We built our crawler to obey robots.txt crawl-delay directives. In the future, if this is ever an issue, you can use the crawl delay to slow Rogerbot down to a more reasonable speed. However, we don't recommend adding a crawl delay larger than 10 or Rogerbot might not be able to finish the crawl of your site.
Just add a crawl delay directive to your robots.txt file like this:
User-agent: rogerbot
Crawl-delay: 10

Here's a good article that explains more about this technique: https://moz.com/learn/seo/robotstxt. I hope this helps! Feel free to reach out if you have any other questions.
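To sanity-check that a directive like the one above parses the way you expect, Python's standard-library robots.txt parser can read the same lines locally. Rogerbot's own parser may treat edge cases differently, so this is a quick check rather than a guarantee.

```python
from urllib import robotparser

# Parse the snippet locally and confirm the crawl delay a crawler would see.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: rogerbot",
    "Crawl-delay: 10",
])
print(rp.crawl_delay("rogerbot"))  # 10
```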
Related Questions
-
I added a privacy policy link to my footer and now Moz is showing thousands of 4xx errors
My website didn't have a privacy policy, so I added one and put the link in the footer menu. When I did this, Moz came back telling me that there are a lot of new errors on the site. Is this a bad thing? Do I need to address it?
-
How long after you create a campaign can you see a monthly view in dashboard?
I created a campaign on June 13th, 2017, and it's been about a month and two weeks, but I'm still only seeing 'weekly' in my dashboard. I have tried searching for how long it takes to be able to view monthly data, but I haven't found much information. Any help is appreciated!
-
902 Error and Page Size Limit
Hello, I am getting a 902 error when attempting to crawl one of my websites, which was recently upgraded to a modern platform to be mobile friendly, serve HTTPS, etc. After doing some research, it appears this is related to page size. Moz's description of the 902 error states: "Pages larger than 2MB will not be crawled. For best practices, keep your page sizes to be 75k or less."

It appears all pages on my site are over 2MB, because Rogerbot is no longer doing any crawling and is not reporting any issues besides the 902. This is terrible for us because we purchased Moz to track and crawl this site specifically. There are many articles showing that the average page size on the web is now well over 2MB: http://www.wired.com/2016/04/average-webpage-now-size-original-doom/

Given that, I would imagine other users have come up against this as well, and I'm wondering how they handled it. I hope Moz is planning to increase the size limit on Rogerbot, as it seems we are on a course toward sites becoming larger and larger. Any insight or help is much appreciated!
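For anyone who wants to measure this themselves, here is a rough sketch that downloads a page and compares its body size to the 2MB figure from the error description. The URL in the commented call is a placeholder, and whether Moz measures compressed or uncompressed size is an assumption worth confirming with their support team.

```python
import gzip
import urllib.request

MOZ_PAGE_LIMIT = 2 * 1024 * 1024  # 2 MB, per the 902 error description

def over_limit(size_bytes, limit=MOZ_PAGE_LIMIT):
    """Return True when a page body exceeds the crawl size limit."""
    return size_bytes > limit

def page_size(url, timeout=10):
    """Download a page and return its (decompressed) body size in bytes."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
        if resp.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
    return len(body)

# print(over_limit(page_size("https://example.com/")))  # requires network access
```

If the limit turns out to apply to the transfer size rather than the rendered HTML, enabling gzip compression on the server may bring pages back under it.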
-
Moz crawler 404 errors on WordPress
Hi all, I've got hundreds of issues coming up in the Moz crawler with 404 errors, and I don't know what these URLs are. Here are a couple of examples:

http://www.theswagbagco.co.uk/category/watford/http%3A%2F%2Fwww.theswagbagco.co.uk%2F2015%2F10%2F15%2Fnew-products-2%2F
http://www.theswagbagco.co.uk/2015/10/01/thank-you-epsom/http%3A%2F%2Fwww.theswagbagco.co.uk%2F2015%2F10%2F01%2Fthank-you-epsom%2F

See the first one is one page with a different URL appended; the second is the same thank-you-epsom URL. How would I find out where these are even being linked from?
-
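A side note on those appended-URL 404s: the broken part is a second, percent-encoded URL glued onto a real page URL. Decoding it shows exactly which link was mangled, which is usually enough to search your theme or plugin output for the culprit (typically an href written without a proper scheme or base).

```python
from urllib.parse import unquote

# One of the malformed URLs from the question above.
bad = ("http://www.theswagbagco.co.uk/category/watford/"
       "http%3A%2F%2Fwww.theswagbagco.co.uk%2F2015%2F10%2F15%2Fnew-products-2%2F")

# Split off the percent-encoded URL that was appended to the real page path.
prefix, _, encoded = bad.rpartition("/http%3A%2F%2F")
appended = unquote("http%3A%2F%2F" + encoded)
print(appended)  # http://www.theswagbagco.co.uk/2015/10/15/new-products-2/
```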
New Spam Analysis Tool Results Questions
First off, I am incredibly excited about this tool. Secondly, I have a slew of questions, but the ones that are lingering for me are as follows.

There are a few URLs that we used to control which are showing up in our list as spammy. The websites no longer exist as of roughly a week ago, and I presume they still show up simply because of the last indexing. That being said, if a website doesn't exist anymore, yet the link is showing up in our GWMT or Moz, is it still necessary to disavow? Is that overkill? http://alcoholdrugrehablosangeles.com/ is an example of a website we used to control and have removed.

My second lingering question: if there are a handful of links registering as spammy, which I presume is due to lack of content or duplicate content, and I move that content to its appropriate place on another website and 301 that domain to its new home, will the "spam score" carry over?
-
How Do I Troubleshoot 804 HTTPS Crawl Error?
In my Moz crawl report I get:

Crawl Error
Moz encountered an error on one or more pages on your site
Error Code 804: HTTPS (SSL) Error Encountered

The Moz Help section only says: "804 errors result from a site with misconfigured SSL software. If Moz's crawlers cannot correctly interpret an SSL response for a home page, the crawl ends immediately."

My site is publicly accessible over HTTPS (https://www.respoke.io/), and I'm not seeing any issues with my certificate. Can anyone help me out? What steps can I take to troubleshoot this error? If SSL is misconfigured, how do I configure it properly?
-
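One way to start troubleshooting an 804 like the one above is to attempt a verified TLS handshake from your own machine with Python's standard library. Crawlers often ship stricter or older trust stores than browsers, so a server that omits an intermediate certificate can look fine in Chrome yet fail here. The hostname in the commented call is the one from the question; treat the whole snippet as a diagnostic sketch, not Moz's own check.

```python
import socket
import ssl

def tls_handshake(host, port=443, timeout=5):
    """Try a full TLS handshake with certificate and hostname verification.

    Returns (True, certificate expiry) on success, or (False, error detail).
    """
    ctx = ssl.create_default_context()  # verifies the chain and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return True, cert.get("notAfter", "unknown expiry")
    except ssl.SSLError as exc:
        return False, f"SSL error: {exc}"   # e.g. incomplete chain, bad cert
    except OSError as exc:
        return False, f"connection error: {exc}"

# ok, detail = tls_handshake("www.respoke.io")  # requires network access
```

If this fails while a browser succeeds, the usual fix is configuring the web server to send the full certificate chain (the server certificate plus all intermediates).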
We launched a new site and Rogerbot is still reporting on links/errors from the old site, is there a way to clear those out?
We are mostly a branding agency and have not put a lot of effort into SEO for ourselves. SEO tends to take a back seat to design most of the time, which makes things a little difficult for me when it comes to SEO. We recently launched a new site, http://Roninadv.com/, and the developer and I have done quite a bit of work to make it work well for Google. I was really looking forward to a new crawl report from Roger, but alas, it's like Roger crawled the old site. The new site has been up since last Monday. Is there a way to clear out the old errors? Do I just need to give Roger more time?