What To Do About Yahoo Slurp Bot Bogging My Site Down?
-
Hello,
Our IT department has informed me that they have seen extremely heavy traffic from the Yahoo Slurp bot in recent days. They are claiming this bot has single-handedly caused one of our servers to crash.
I am a bit skeptical of this, as I have not found these particular legitimate search engine bots to be aggressive resource hogs, especially for an enterprise-level web server.
I have asked to examine the server logs myself, but have not been given access. IT wants to block this particular bot, but I am apprehensive about doing so, as I don't want it to hurt our site's visibility in Yahoo News or other Yahoo properties.
Does anyone else have experience with this bot being an overly zealous resource drag, and if so, what is the best course of action to satisfy all parties?
-
Examining the server logs yourself probably won't help your understanding of the issue unless you know specifically what you're looking for. On the Yahoo note, I have found Slurp to be really bad in the past, but no legitimate bot should be able to bring down a properly configured web server, especially an 'enterprise-level' one.
I would check your .htaccess and Apache settings (or web.config if you are on Windows/IIS) for bad redirects before considering banning the bot. Other things to check would be the website code itself, or whether a bot is hitting a massive and badly optimised database query, for example. That could bring the server down.
Ask IT exactly what the bot did that caused the server to go down; they should at least be able to tell you that. If they can't, they need to run load tests against the website to try to reproduce the scenario and debug the issue, if indeed there is one.
TL;DR: Normally bad config or bad code/queries are to blame for this kind of thing. I'd review that before blocking a bot that crawls hundreds of thousands of other sites without issue.
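If IT can share even a slice of the access log, a rough tally of requests per minute for the bot's user agent would show whether Slurp was really hammering the server. The following is only a minimal sketch: it assumes the Apache/Nginx "combined" log format, the function name `requests_per_minute` is made up for illustration, and the sample log lines are invented, not taken from any real server.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the timestamp sits in [brackets]
# and the user agent is the last double-quoted field on the line.
LOG_LINE = re.compile(
    r'\[(?P<minute>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]+\]'
    r'.*"(?P<agent>[^"]*)"\s*$'
)


def requests_per_minute(lines, agent_substring="Slurp"):
    """Count requests per minute whose user agent contains agent_substring."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and agent_substring.lower() in match.group("agent").lower():
            counts[match.group("minute")] += 1
    return counts


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2014:13:55:01 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '203.0.113.5 - - [10/Oct/2014:13:55:31 +0000] "GET /b HTTP/1.1" 200 734 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '198.51.100.7 - - [10/Oct/2014:13:55:40 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/32.0"',
    '203.0.113.5 - - [10/Oct/2014:13:56:02 +0000] "GET /c HTTP/1.1" 200 298 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
]

counts = requests_per_minute(SAMPLE_LOG)
for minute, hits in sorted(counts.items()):
    print(minute, hits)
```

A couple of requests a minute is unlikely to crash anything on its own; hundreds per minute against one slow URL would point at the code or query behind that URL. Bear in mind that a user agent string can be spoofed, so a match on "Slurp" alone does not prove the traffic came from Yahoo.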
-
You should be able to control the rate at which the bot accesses your pages by adding a crawl delay to your robots.txt file. Robots.txt and crawl delay are discussed here: http://en.wikipedia.org/wiki/Robots_exclusion_standard, and the Slurp bot here: https://help.yahoo.com/kb/SLN22600.html.
It should look like this in your robots.txt file:
User-agent: Slurp
Crawl-delay: 30
The crawl delay is the number of seconds the bot should wait between page requests (ask your IT guys what's appropriate for you). I stuck 30 in there, meaning the Slurp bot would only be able to access up to 2 pages a minute.
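To sanity-check a rule like this before deploying it, you can parse the robots.txt text with Python's standard `urllib.robotparser` and read back the delay that applies to Slurp. This is just a sketch: the helper name `crawl_delay_for` is made up for illustration, and it only shows how a standards-following parser reads the file, not how Yahoo's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The same two lines suggested above, held in a string for testing.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


delay = crawl_delay_for(ROBOTS_TXT, "Slurp")
# 60 seconds divided by the delay gives the most pages the bot could fetch per minute.
max_pages_per_minute = 60 / delay if delay else None
print(delay, max_pages_per_minute)
```

Because the rule is scoped to `User-agent: Slurp`, other crawlers are unaffected; asking for the delay of a different bot returns `None` with this file.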