What To Do About Yahoo Slurp Bot Bogging My Site Down?
-
Hello,
Our IT department has informed me that they have seen extremely heavy traffic from the Yahoo Slurp bot in recent days. They are claiming this bot has single-handedly caused one of our servers to crash.
I am a bit skeptical of this, as I have not found these particular legitimate search engine bots to be aggressive resource hogs, especially for an enterprise-level web server.
I have requested to examine the server logs myself, but have not had success with this. IT is requesting to block this particular bot, but I am apprehensive about doing this, as I don't want this to have any negative implications on our site showing in Yahoo News or other Yahoo properties.
Does anyone else have experience with this bot being an overly-zealous resource drag, and if so, what is the best course of action to satisfy all parties?
-
Examining the server logs yourself probably won't help your understanding of the issue unless you know specifically what you're looking for. On the Yahoo note, I have found Slurp to be really bad in the past, but no legitimate bot should be able to bring down a properly configured web server, especially an 'enterprise-level' one.
I would check your .htaccess and Apache settings (or web.config if you are on Windows) for bad redirects before considering banning the bot. Other things to check are the website code itself, for example whether the bot is hitting a massive, badly optimised database query; that could bring the server down.
Ask IT exactly what the bot did that caused the server to go down; they should at least be able to tell you that. If not, they need to run load tests against the website to try to reproduce the scenario and debug the issue, if indeed there is one.
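If IT can share the access log, one concrete thing to look for is how many requests per minute actually carry the Slurp user agent. Here is a rough Python sketch, assuming the log is in the common Apache/Nginx "combined" format (timestamp in square brackets, user agent as the last quoted field). The sample lines, IP addresses, and function name are made up for illustration and are not from the real server.

```python
import re
from collections import Counter

# Assumption: "combined" log format. The timestamp looks like
# [10/Oct/2014:13:55:36 -0700] and the user agent is the last quoted field.
MINUTE_PATTERN = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]*\]')
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')


def hits_per_minute(lines, needle="slurp"):
    """Count requests per minute whose user agent contains `needle`."""
    counts = Counter()
    for line in lines:
        minute = MINUTE_PATTERN.search(line)
        agent = USER_AGENT_PATTERN.search(line)
        if not minute or not agent:
            continue  # skip lines that do not match the assumed format
        if needle.lower() in agent.group(1).lower():
            counts[minute.group(1)] += 1
    return counts


# Hypothetical sample data, only to show the shape of the input.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2014:13:55:36 -0700] "GET /a HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '203.0.113.5 - - [10/Oct/2014:13:55:40 -0700] "GET /b HTTP/1.1" 200 1200 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '198.51.100.7 - - [10/Oct/2014:13:55:41 -0700] "GET /a HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/32.0"',
    '203.0.113.5 - - [10/Oct/2014:13:56:02 -0700] "GET /c HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
]

for minute, count in sorted(hits_per_minute(SAMPLE_LINES).items()):
    print(minute, count)
```

A few requests per minute would point away from the bot and toward config or query problems; hundreds per second would be a different conversation. Keep in mind the user agent string can be spoofed, so a match only shows what the client claims to be.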
TL;DR: bad config, code, or queries are normally to blame for this kind of thing. I'd review those before blocking a bot that crawls hundreds of thousands of other sites without issue.
-
You should be able to control the rate at which the bot accesses your pages by adding a crawl delay to your robots.txt file. Robots.txt and crawl delay are discussed here: http://en.wikipedia.org/wiki/Robots_exclusion_standard, and the Slurp bot here: https://help.yahoo.com/kb/SLN22600.html.
Should look like this in your robots.txt file:
User-agent: Slurp
Crawl-delay: 30
The crawl delay is the number of seconds the bot should wait between page requests (ask your IT guys what's appropriate for you). I stuck 30 in there, meaning the Slurp bot would only be able to access up to 2 pages a minute.
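If you want to sanity-check the rule before it goes live, Python's standard-library `urllib.robotparser` can read it back. This is only a sketch of how a compliant parser would interpret the two lines above; it says nothing about how Slurp itself behaves, and the `Googlebot` lookup is just there to show the rule is scoped to Slurp.

```python
from urllib.robotparser import RobotFileParser

# The same two lines suggested for robots.txt above.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a compliant Slurp crawler should wait between requests.
slurp_delay = parser.crawl_delay("Slurp")
# Crawlers not named in the file get no delay from this rule.
other_delay = parser.crawl_delay("Googlebot")

# 60 seconds divided by the delay gives the ceiling on pages per minute.
max_pages_per_minute = 60 / slurp_delay

print(slurp_delay, other_delay, max_pages_per_minute)
```

Crawl-delay is a non-standard directive, so each crawler decides whether to honor it; the rule is a request to the bot, not an enforced limit.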