My post about automating incident response prompted a fascinating chat with a long-standing friend-colleague who knows far more about Incident Response technology than I ever did. With many thanks to Aaron Kaplan (AK), here’s a summary of our discussion…
Developments in automated defence
AK: Using Machine Learning (“AI”) in cyber-defence will be a gradual journey. In practice, over the next few years we won’t even notice that ML is there; I don’t see it as turnkey, switch-on magic. It will appear first in a few well-defined sub-fields. Fighting spam is one, and has been for many years (SpamAssassin, Bayesian statistics, etc.): it never eliminated 100% of the spam problem, but we can’t live without ML for spam classification any more. Weeding out false-positive alerts in a Security Operations Centre is another…
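The Bayesian approach AK mentions can be sketched in a few lines. This is a toy naive Bayes spam scorer in the spirit of SpamAssassin’s Bayes plugin, with invented training messages and equal priors assumed; a real filter uses far richer features and tuning.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, is_spam). Returns per-class word counts."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        for word in text.lower().split():
            counts[is_spam][word] += 1
            totals[is_spam] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Naive Bayes with Laplace smoothing; equal class priors assumed."""
    vocab = set(counts[True]) | set(counts[False])
    log_odds = 0.0
    for word in text.lower().split():
        p_spam = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_ham = (counts[False][word] + 1) / (totals[False] + len(vocab))
        log_odds += math.log(p_spam) - math.log(p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical training data, for illustration only.
training = [
    ("cheap pills buy now", True),
    ("win money fast", True),
    ("meeting agenda for tomorrow", False),
    ("quarterly report attached", False),
]
counts, totals = train(training)
score = spam_probability("buy cheap pills", counts, totals)
```

The point is the shape of the output: a probability, not a verdict, which is exactly the “opinion for decision support” theme that runs through the rest of the conversation.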
AC: Since you mentioned “false positives” I have to mention my favourite talk from FIRST 2019: Desiree Sacher on what you can learn from them (mostly about your own organisation’s processes and systems)…
AK: Or – this is already being worked on, for example by CIRCL – we will see very simple classifiers which try to answer simple questions such as “what kind of IP address is this?” (residential CPE, server, datacentre, …). Or questions such as “what’s the probability that this is a genuine webpage on that URL, rather than a simple web server’s default page (in all its variations, up to the landing pages of domain grabbers)?”. IMHO these tasks are well suited to ML. And all it will give us for Incident Response is another “opinion” from some system. So the “opinion”/probability for decision support is probably going to come first.
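A minimal sketch of what such an “opinion” classifier might look like. The feature names, weights, and scoring function here are all invented for illustration; a real system (such as the ones CIRCL is working on) would learn its features and weights from data rather than hand-code them.

```python
import math

# Hypothetical hand-rolled logistic scorer: does this IP look like
# residential CPE or like a datacentre/server address?
WEIGHTS = {
    "rdns_looks_dynamic": 2.0,   # e.g. "dsl-", "pool-", "dyn-" in reverse DNS
    "in_hosting_asn": -2.5,      # announced by a known hosting/cloud ASN
    "port_80_open": -1.0,        # a web server is listening
    "port_443_open": -1.0,
}
BIAS = 0.5

def p_residential(features):
    """Return P(residential CPE) as an 'opinion' for the analyst."""
    z = BIAS + sum(WEIGHTS[name] for name, present in features.items() if present)
    return 1 / (1 + math.exp(-z))

opinion = p_residential({"rdns_looks_dynamic": True, "in_hosting_asn": False,
                         "port_80_open": False, "port_443_open": False})
# The score is shown next to the alert as decision support; it does
# not auto-decide anything.
```

Even this crude version captures the idea: the responder gets a probability alongside the alert, and stays in the loop.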
AC: So this is about giving incident responders better information for their decision-making: often automating routine searches, and putting statistical evidence behind the gut-feel decisions they would hope to make manually if they had the time?
AK: It is also a path towards fully automatic decisions based on (possibly biased) simpler ML models, such as the ones mentioned above. But it will be a gradual, very long journey. First we will discover just how hard it is to get a good ML-based system running, and how hard it is to eliminate bias. And since it’s a long journey, we won’t even notice we are on it; we plan and learn as we go. One example I am looking into right now: can GPT-3-based systems adequately summarise APT threat-intelligence reports? It’s *really* fun to play around with…
Developments in automated attack
AC: You mentioned that the bad guy might be able to interfere with my robot in even simpler ways?
AK: Yes. There is a pretty good paper on how a one-pixel change tricks the classifier into believing that a cat is a bear, by none other than Adi Shamir: a crypto legend looking into Deep Learning 🙂
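The intuition behind small-perturbation attacks can be shown on a deliberately tiny stand-in: a linear classifier where one “pixel” carries most of the weight, so changing that single coordinate flips the label. This is an illustration of the idea only, not the deep-network construction in the paper.

```python
def classify(image, weights, bias):
    """Toy linear classifier over a 4-'pixel' image."""
    score = sum(w * x for w, x in zip(weights, image)) + bias
    return "cat" if score > 0 else "bear"

weights = [0.1, 0.1, 5.0, 0.1]   # one pixel dominates the decision
bias = -0.5
image = [1, 1, 0.2, 1]           # honestly classified as "cat"
label_before = classify(image, weights, bias)

adversarial = list(image)
adversarial[2] = -0.2            # perturb only the influential pixel
label_after = classify(adversarial, weights, bias)
```

Deep networks are far less transparent than this, but the underlying issue is similar: the decision surface can be extremely sensitive along a few directions, and an attacker only needs to find one of them.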
AC: I guess they are both about the impact of tiny non-randomnesses…
AK: Another thought that came to mind after seeing the good-guy/bad-guy infinity-eight picture: from the attacker’s point of view this is basically reinforcement learning. And if the defender also deployed ML on their side of the infinity-eight game, we would end up with something essentially like Generative Adversarial Networks.
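The infinity-eight game can be caricatured as alternating optimisation, which is the same skeleton a GAN training loop has. The numbers below are invented: the attacker quietens their traffic whenever they are detected, and the defender tightens the detection threshold whenever an attack slips through.

```python
attacker_signal = 10.0   # how "loud" the attack is
threshold = 8.0          # defender flags anything louder than this

history = []
for round_no in range(5):
    detected = attacker_signal > threshold
    history.append((round_no, attacker_signal, threshold, detected))
    if detected:
        attacker_signal *= 0.7   # attacker adapts: gets quieter
    else:
        threshold *= 0.8         # defender adapts: tightens detector
```

Each side’s move is the other side’s training signal, which is exactly why the picture looks like a GAN once both sides automate.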
AC: Hmmm. When you use a GAN in the lab, there’s a referee to make sure the good guy wins…
AC: I think you’ve reassured me that both defenders and attackers are on a long journey to use this stuff. And, actually, maybe it doesn’t hugely tilt the balance? On the defender side, we can become more efficient by using ML decision-support tools to free up analysts’ and incident responders’ time for the sort of things that humans will always be best at, while exploring which aspects of active defence can be automated.
Attackers will get new tools too, but for most sites those will mean mass attacks and, I think, pretty noisy ones. One of the few things I’ve always taken reassurance from is that a mass attack ought to be detectable simply because it is mass. It might take us a while to work out what it is, but so long as we share information, that doesn’t seem impossible. That’s how spam detection continues to work, and I’d settle for that level of prevention for other types of attack!
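The “detectable because it is mass” point can be sketched very simply: if sites share sightings of indicators, a plain count across sites separates a campaign from one-off noise. Site names and IP indicators below are invented for illustration.

```python
from collections import Counter

# Shared sightings: (reporting site, observed indicator).
sightings = [
    ("site-a", "198.51.100.7"), ("site-b", "198.51.100.7"),
    ("site-c", "198.51.100.7"), ("site-d", "198.51.100.7"),
    ("site-a", "203.0.113.9"),   # seen at one site only: not mass
]

def mass_indicators(sightings, min_sites=3):
    """Indicators reported by at least min_sites distinct sites."""
    seen_at = Counter()
    for site, indicator in set(sightings):   # de-duplicate per site
        seen_at[indicator] += 1
    return {ind for ind, n in seen_at.items() if n >= min_sites}
```

No single site needs to understand the attack for the campaign to become visible; the sharing itself does the work, which is the same mechanism that keeps collaborative spam detection effective.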
Some organisations will, by their nature, be specific targets of particularly well-funded attackers who may use ML for precision. Those organisations need equivalent skills in their defenders. But for most of us our defences need to be good – say, a bit better than good practice – but probably not elite. Thanks.