What To Do About Yahoo Slurp Bot Bogging My Site Down?
-
Hello,
Our IT department has informed me that they have seen extremely heavy traffic from the Yahoo Slurp bot in recent days. They are claiming this bot has single-handedly caused one of our servers to crash.
I am a bit skeptical of this, as I have not found these particular legitimate search engine bots to be aggressive resource hogs, especially for an enterprise-level web server.
I have requested to examine the server logs myself, but have not had success with this. IT is requesting to block this particular bot, but I am apprehensive about doing this, as I don't want this to have any negative implications on our site showing in Yahoo News or other Yahoo properties.
Does anyone else have experience with this bot being an overly-zealous resource drag, and if so, what is the best course of action to satisfy all parties?
-
Examining the server logs yourself probably won't help your understanding of the issue unless you know specifically what you're looking for. On the Yahoo note, I have found Slurp to be really bad in the past, but no legitimate bot should be able to bring down a properly configured web server, especially an 'enterprise-level' one.
I would check your .htaccess and Apache settings (or web.config if on Windows/IIS) for bad redirects before considering banning the bot. It is also worth checking the website code itself: if a bot hits a page that runs a massive, badly optimised database query, for example, that could bring the server down.
Ask IT exactly what the bot did that caused the server to go down; they should at least be able to tell you that. If not, they need to run load tests against the website to try to reproduce the scenario and debug the issue, if indeed there is one.
TL;DR: Bad config or bad code/queries are normally to blame for this kind of thing. I'd review that before blocking a bot that crawls hundreds of thousands of other sites without issue.
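If you or IT do get hold of the access logs, a rough way to check the claim is to count how many requests per minute actually came from Slurp. The sketch below is only an illustration: it assumes the common Apache/Nginx "combined" log format, the function name slurp_hits_per_minute is made up for this example, and the sample log lines are invented rather than taken from any real server.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format. Adjust the pattern if your
# server logs in a different layout.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def slurp_hits_per_minute(lines, needle="Slurp"):
    """Count requests per minute whose user-agent contains `needle`."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and needle.lower() in match.group("agent").lower():
            # Timestamps look like 10/Oct/2014:13:55:36 +0000;
            # the first 17 characters identify the minute.
            hits[match.group("ts")[:17]] += 1
    return hits


# Invented sample lines, for illustration only.
SLURP_UA = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
SAMPLE_LINES = [
    f'203.0.113.5 - - [10/Oct/2014:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "{SLURP_UA}"',
    f'203.0.113.5 - - [10/Oct/2014:13:55:50 +0000] "GET /b HTTP/1.1" 200 734 "-" "{SLURP_UA}"',
    '198.51.100.7 - - [10/Oct/2014:13:55:52 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    f'203.0.113.5 - - [10/Oct/2014:13:56:02 +0000] "GET /c HTTP/1.1" 200 120 "-" "{SLURP_UA}"',
]

if __name__ == "__main__":
    for minute, count in sorted(slurp_hits_per_minute(SAMPLE_LINES).items()):
        print(minute, count)
```

A few requests per minute is unlikely to crash anything by itself; if the counts are modest, that points back at slow pages or queries rather than the bot.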
-
You should be able to control the rate at which the bot accesses your pages by adding a crawl delay to your robots.txt file. Robots.txt and crawl delay are discussed here: http://en.wikipedia.org/wiki/Robots_exclusion_standard, and the Slurp bot here: https://help.yahoo.com/kb/SLN22600.html.
It should look like this in your robots.txt file:
User-agent: Slurp
Crawl-delay: 30
The crawl delay is the number of seconds the bot should wait between page requests (ask your IT guys what's appropriate for you). I stuck 30 in there, meaning the Slurp bot would only be able to access up to 2 pages a minute.
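If you want to sanity-check the rule before deploying it, Python's standard urllib.robotparser can read the same two lines and report the delay it finds. This is just a sketch: the helper name crawl_delay_for is made up here, and it only shows how one parser reads the file, not how Slurp itself will behave.

```python
from urllib.robotparser import RobotFileParser

# The same rule as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
"""


def crawl_delay_for(robots_txt, agent):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


delay = crawl_delay_for(ROBOTS_TXT, "Slurp")
# With one request every `delay` seconds, this is the per-minute ceiling.
max_pages_per_minute = 60 // delay

if __name__ == "__main__":
    print(f"Slurp delay: {delay}s, at most {max_pages_per_minute} pages per minute")
```

A bot with no matching User-agent block (and no catch-all block) gets None back, which is a quick way to confirm the rule only targets Slurp.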