Rogerbot getting cheeky?
-
Hi SeoMoz,
From time to time my server crashes during Rogerbot's crawling escapades, even though I have a robots.txt file with a Crawl-delay of 10, which I have just increased to 20.
I looked at the Apache log and noticed Roger hitting me from 4 different addresses: 216.244.72.3, 216.244.72.11, 216.244.72.12 and 216.176.191.201. Most of the time, requests from any single address were 10 seconds apart, but ALL 4 addresses would hit 4 different pages simultaneously (example 2). At other times, it wasn't respecting robots.txt at all (see example 1 below).
I wouldn't call this 'respecting the crawl-delay' entry in robots.txt, which is what your answers to other questions here have stated. 4 simultaneous page requests within 1 sec from Rogerbot is not what should be happening IMHO.
example 1
216.244.72.12 - - [05/Sep/2012:15:54:27 +1000] "GET /store/product-info.php?mypage1.html" 200 77813
216.244.72.12 - - [05/Sep/2012:15:54:27 +1000] "GET /store/product-info.php?mypage2.html HTTP/1.1" 200 74058
216.244.72.12 - - [05/Sep/2012:15:54:28 +1000] "GET /store/product-info.php?mypage3.html HTTP/1.1" 200 69772
216.244.72.12 - - [05/Sep/2012:15:54:37 +1000] "GET /store/product-info.php?mypage4.html HTTP/1.1" 200 82441
example 2
216.244.72.12 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage1.html HTTP/1.1" 200 70209
216.244.72.11 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage2.html HTTP/1.1" 200 82384
216.244.72.12 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage3.html HTTP/1.1" 200 83683
216.244.72.3 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage4.html HTTP/1.1" 200 82431
216.244.72.3 - - [05/Sep/2012:15:46:16 +1000] "GET /store/mypage5.html HTTP/1.1" 200 82855
216.176.191.201 - - [05/Sep/2012:15:46:26 +1000] "GET /store/mypage6.html HTTP/1.1" 200 75659
Please advise.
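For anyone who wants to check their own logs the same way, here is a rough Python sketch that measures the gaps between requests in the example 2 lines above. The function name `request_gaps` and the `per_ip` switch are made up for illustration, and the regular expression assumes the log format shown in this post. It only measures the spacing between requests; it does not prove which crawler sent them.

```python
import re
from collections import defaultdict
from datetime import datetime

# Captures the client address and the bracketed timestamp at the start of a line.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def request_gaps(lines, per_ip=True):
    """Return seconds between consecutive requests.

    per_ip=True groups requests by client address; per_ip=False treats
    every address as one crawler and stores the result under the key "all".
    """
    seen = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        ip, stamp = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        seen[ip if per_ip else "all"].append(when)
    gaps = {}
    for key, times in seen.items():
        times.sort()
        gaps[key] = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return gaps

# The example 2 lines from this post.
example_2 = [
    '216.244.72.12 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage1.html HTTP/1.1" 200 70209',
    '216.244.72.11 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage2.html HTTP/1.1" 200 82384',
    '216.244.72.12 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage3.html HTTP/1.1" 200 83683',
    '216.244.72.3 - - [05/Sep/2012:15:46:15 +1000] "GET /store/mypage4.html HTTP/1.1" 200 82431',
    '216.244.72.3 - - [05/Sep/2012:15:46:16 +1000] "GET /store/mypage5.html HTTP/1.1" 200 82855',
    '216.176.191.201 - - [05/Sep/2012:15:46:26 +1000] "GET /store/mypage6.html HTTP/1.1" 200 75659',
]

combined = request_gaps(example_2, per_ip=False)
print(combined)  # {'all': [0.0, 0.0, 0.0, 1.0, 10.0]}

# Count the gaps that are shorter than the configured crawl delay of 10 seconds.
too_fast = sum(1 for gap in combined["all"] if gap < 10)
print(too_fast)  # 4
```

Run with `per_ip=True`, the same lines show that even a single address (216.244.72.12) made two requests within the same second.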
-
Hi BM7,
I'm going to open up a ticket on this to have our engineers take a closer look at your site. Once we have an overall response, I'll post it here for other community members to view.
Cheers!
-
Thanks Megan for your reply,
I will give that a try, and I have blocked 2 addresses so you are reduced to 2 crawler sessions. These two measures should reduce the load considerably, as long as Rogerbot respects the 7 second delay.
IMHO ignoring the Crawl-Delay set by the webmaster of the site you are crawling, which crawlers are supposed to respect, is wrong. I got a nasty notice from Google WMT for being down 5 hours due to Rogerbot: it was the middle of the night, so the server only got restarted in the morning.
Also, my site has around 600 discrete pages, of which you crawl about 500, so even at the original 10 second crawl delay you could do my whole site in less than 1.5 hours, and that is only required once a week. So in my mind there is no need to overrule my settings in robots.txt 'so he (Roger) can complete the crawl'.
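The arithmetic behind that estimate can be sketched in a few lines of Python. This assumes one request at a time and ignores the time the server takes to respond; `crawl_hours` is just an illustrative helper name.

```python
def crawl_hours(pages, delay_seconds):
    """Hours needed to fetch `pages` pages with a fixed delay between requests."""
    return pages * delay_seconds / 3600

# About 500 crawled pages at the delays mentioned in this thread.
for delay in (7, 10, 20):
    print(delay, round(crawl_hours(500, delay), 2))
# 7 0.97
# 10 1.39
# 20 2.78
```

Even at a 20 second delay, a single sequential crawl of 500 pages would take under 3 hours.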
Regards,
-
Hi there,
This is Megan from the SEOmoz Help Team. I'm so sorry Rogerbot is causing you grief! This might be happening because your crawl delay is too long, so Rogerbot ends up ignoring it in order to complete the crawl. If you set your crawl delay to a maximum of 7, that should solve your problem. If you're still running into issues, though, please send a message to help@seomoz.org and we'll check it out ASAP!
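One way to double-check which delay a robots.txt file actually declares for a given crawler is Python's standard `urllib.robotparser` module. The robots.txt contents below are a hypothetical example, not the poster's real file, and the `rogerbot` user-agent token is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a 7 second delay for rogerbot, 20 seconds for everyone else.
robots_txt = """\
User-agent: rogerbot
Crawl-delay: 7

User-agent: *
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

rogerbot_delay = parser.crawl_delay("rogerbot")
default_delay = parser.crawl_delay("someotherbot")
print(rogerbot_delay)  # 7
print(default_delay)   # 20
```

This only shows what the file asks for; whether a crawler honours it still has to be checked against the server logs.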
Cheers!