What To Do About Yahoo Slurp Bot Bogging My Site Down?
-
Hello,
Our IT department has informed me that they have seen extremely heavy traffic from the Yahoo Slurp bot in recent days. They are claiming this bot has single-handedly caused one of our servers to crash.
I am a bit skeptical of this, as I have not found these particular legitimate search engine bots to be aggressive resource hogs, especially for an enterprise-level web server.
I have requested to examine the server logs myself, but have not had success with this. IT is requesting to block this particular bot, but I am apprehensive about doing this, as I don't want this to have any negative implications on our site showing in Yahoo News or other Yahoo properties.
Does anyone else have experience with this bot being an overzealous resource drag, and if so, what is the best course of action to satisfy all parties?
-
Examining the server logs yourself probably won't help your understanding of the issue unless you know specifically what you're looking for. On the Yahoo note, I have found Slurp to be really bad in the past, but no legitimate bot should be able to bring down a properly configured web server, especially an 'enterprise-level' one.
I would check your .htaccess and Apache settings for bad redirects (or web.config if on Windows) before considering banning the bot. Other things to check would be the website code, or whether the bot is hitting a massive and horribly optimised database query, for example; that could bring the server down.
Ask IT exactly what the bot did that caused the server to go down; they should at least be able to tell you that. If not, then they need to run load tests against the website itself to try to reproduce the scenario and thus debug the issue, if indeed there is one.
Tl;dr: Normally bad config or bad code/queries are to blame for this kind of thing. I'd review that before blocking a bot that crawls hundreds of thousands of other sites without issue.
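If you or IT do pull the access logs, a short script can show whether Slurp's request rate is really out of the ordinary. The following is only a rough sketch: it assumes the server writes Apache's combined log format, it identifies the bot by a case-insensitive "slurp" substring in the User-Agent, and the function name and sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# ip ident user [timestamp] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def slurp_hits_per_minute(lines, needle="slurp"):
    """Count requests per minute whose User-Agent contains `needle`."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or needle not in match.group("agent").lower():
            continue
        # A timestamp looks like "10/Oct/2014:13:55:36 -0700";
        # the first 17 characters identify the minute.
        hits[match.group("ts")[:17]] += 1
    return hits


# Made-up sample lines, for illustration only.
SLURP_UA = ("Mozilla/5.0 (compatible; Yahoo! Slurp; "
            "http://help.yahoo.com/help/us/ysearch/slurp)")
sample = [
    f'203.0.113.7 - - [10/Oct/2014:13:55:36 -0700] "GET /a HTTP/1.1" 200 512 "-" "{SLURP_UA}"',
    f'203.0.113.7 - - [10/Oct/2014:13:55:50 -0700] "GET /b HTTP/1.1" 200 734 "-" "{SLURP_UA}"',
    '198.51.100.9 - - [10/Oct/2014:13:55:51 -0700] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    f'203.0.113.7 - - [10/Oct/2014:13:56:02 -0700] "GET /c HTTP/1.1" 200 120 "-" "{SLURP_UA}"',
]

for minute, count in sorted(slurp_hits_per_minute(sample).items()):
    print(minute, count)
```

On a real log you would pass an open file object instead of `sample`. A handful of requests per minute is unlikely to crash anything on its own, whereas a sustained burst against one slow URL would point back at the query or redirect problems mentioned above.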
-
You should be able to control the rate at which the bot accesses your pages by adding a crawl delay in your robots.txt file. Robots.txt and crawl delay are discussed here: http://en.wikipedia.org/wiki/Robots_exclusion_standard, and the Slurp bot here: https://help.yahoo.com/kb/SLN22600.html.
It should look like this in your robots.txt file:
User-agent: Slurp
Crawl-delay: 30
The crawl delay is the number of seconds the bot should wait between page requests (ask your IT guys what's appropriate for you). I stuck 30 in there, meaning the Slurp bot would only be able to access up to 2 pages a minute.
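Before deploying the rule, you can sanity-check that it parses the way you expect. This is a minimal sketch using Python's standard-library `urllib.robotparser`; it only shows how that parser reads the two lines above, not how Slurp itself behaves, since Crawl-delay is a non-standard directive and each crawler decides how to honor it.

```python
from urllib.robotparser import RobotFileParser

# The rule from the answer above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay in seconds that applies to a user agent named "Slurp".
delay = parser.crawl_delay("Slurp")

# With one request every `delay` seconds, the ceiling per minute is:
max_pages_per_minute = 60 // delay

print(delay, max_pages_per_minute)

# Other bots are unaffected because no rule matches them.
print(parser.crawl_delay("Googlebot"))
```

The same check works against the live file by calling `set_url()` and `read()` on the parser instead of `parse()`.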