Personally, I've run across spiders that probe for entry points and known exploits in common CMS, e-commerce, and CRM web applications. For example, there was a recent WordPress bug that could be exploited to serve malicious content (read: a virus) to visiting users.
Spoofing the User-Agent string is elementary, and it wouldn't fool any sysadmin worth their salt. All you have to do is run a WHOIS lookup on the requesting IP address to help identify its origin.
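If you'd rather script that lookup than run it by hand, here is a rough sketch in Python of a plain WHOIS query over TCP port 43. The server name (`whois.arin.net`), the function names, and the field names it pulls out are my own assumptions for illustration. Different registries label their fields differently, so treat the parsing as a starting point, not a complete solution.

```python
import socket


def whois_query(ip, server="whois.arin.net", port=43, timeout=10.0):
    """Send a plain WHOIS query and return the raw text reply.

    The default server is an assumption; other regional registries
    run their own WHOIS servers and format replies differently.
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # A WHOIS request is just the query text followed by CRLF.
        sock.sendall(ip.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_fields(whois_text, wanted=("orgname", "netname", "country")):
    """Pick a few 'Key: value' lines out of a reply, keyed in lowercase.

    Only the first occurrence of each wanted key is kept.
    """
    found = {}
    for line in whois_text.splitlines():
        # Skip comment lines and anything that is not a key/value pair.
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key in wanted and key not in found:
            found[key] = value.strip()
    return found
```

Calling `extract_fields(whois_query(ip))` with an address from your logs should give you a small dict to eyeball. If the claimed User-Agent says one thing and the network owner says another, that's your hint.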
I'm a bit of a data geek, so I like to grep through log files to see things that won't show up in analytics packages that rely on JavaScript.
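For anyone who wants to go a step beyond grep, here is a minimal Python sketch that tallies requests per IP and User-Agent pair. It assumes your server writes the common "combined" access log format; the regex and the function name `count_agents` are mine, so adjust them if your log layout differs.

```python
import re
from collections import Counter

# Assumed layout: combined log format, i.e.
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_agents(lines):
    """Count requests per (ip, user-agent) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts


# Example use, with a hypothetical log path:
# with open("access.log", encoding="utf-8", errors="replace") as fh:
#     for (ip, agent), hits in count_agents(fh).most_common(20):
#         print(hits, ip, agent)
```

Sorting by hit count tends to surface the noisy crawlers first, and those are usually the ones worth checking with WHOIS.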