
An autonomous cyberattack challenge
Dstl is a department within the UK Ministry of Defence (MOD). As part of the Autonomous Resilient Cyber Defence (ARCD) project, it challenged industry to meet its ambition of reducing the impact of cyberattacks on the military by establishing an autonomous response to them. The imperative was to go beyond technologies that use artificial intelligence simply to spot anomalies that might indicate an attack: the solution should actually decide the best response.

Multi-agent reinforcement learning
The three-year ARCD project aims to research and develop self-defending, self-recovering concepts for military platforms and technologies that can be tested and evaluated against attacks in a representative military environment. For our initiative, we developed a proof of concept using collaborative multi-agent reinforcement learning (MARL), an approach that is currently the subject of much academic research.

Enabling an autonomous response
There are lots of ways to respond to a cyberattack. Choosing the right one depends on both the attack and the situation, a combination not well suited to most AI approaches. With collaborative MARL, however, the complex problem of deciding the best response can be broken down into individual decisions. Each of the learning agents recommends whether its own action is appropriate to a particular situation. The agents then collaborate to decide on the best course of action, which enables an autonomous response to the attack.
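To make the idea concrete, here is a minimal sketch of that decision structure, assuming a hypothetical set-up in which each agent scores how appropriate its own response action is for the current situation and a simple coordinator picks the most confident recommendation. The agent names, actions, and scoring rules below are illustrative assumptions, not the ARCD agents or their learned policies.

```python
# Illustrative sketch only: each agent scores how suitable its own response
# action is for the observed situation; a simple coordinator then picks the
# highest-scoring action. All names and policies here are assumptions.
from dataclasses import dataclass
from typing import Callable, Dict, List

Observation = Dict[str, float]  # e.g. {"attack_severity": 0.9, "traffic_density": 0.8}

@dataclass
class ResponseAgent:
    name: str                                # the response this agent is responsible for
    policy: Callable[[Observation], float]   # suitability score in [0, 1]

    def recommend(self, obs: Observation) -> float:
        return self.policy(obs)

def choose_response(agents: List[ResponseAgent], obs: Observation) -> str:
    """Collaborative decision: choose the action whose agent is most confident."""
    scores = {agent.name: agent.recommend(obs) for agent in agents}
    return max(scores, key=scores.get)

if __name__ == "__main__":
    # Hand-written stand-in policies; a trained MARL system would learn these.
    agents = [
        ResponseAgent("stop_vehicle",
                      lambda o: o["attack_severity"] * (1 - o["traffic_density"])),
        ResponseAgent("continue_and_isolate_subsystem",
                      lambda o: 0.5 + 0.5 * o["traffic_density"]),
    ]
    situation = {"attack_severity": 0.9, "traffic_density": 0.8}
    print(choose_response(agents, situation))  # the busy context favours continuing, not stopping
```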

Context defines the correct action
The application we chose to explore was the defence of vehicles, where context is a key part of the problem. This simplified scenario illustrates the challenge: when under cyberattack, do you stop your vehicle, or is it safer to carry on? The answer depends on the situation. You don't always slam the brakes on in the middle of a busy road junction, because there might be a safer option. The same is true in a military situation, where the context will, in part, define the correct action.

Learning in a safe, simulated environment
Our proof of concept used a simulated environment to train our multiple AI agents, which meant they could learn from their mistakes safely. Using simulation also enabled us to test against as-yet-unseen cyberattacks, and against attacks for which real-world data is in short supply. We were also able to run simulations faster than real time, speeding up the training process for the AI.
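As a toy illustration of why training in simulation helps, the sketch below uses tabular Q-learning in a tiny hand-written environment in which an agent learns whether to stop or continue under attack in different contexts, over thousands of simulated episodes. The environment, states, actions, and rewards are all assumptions made for illustration; the actual ARCD simulation and reward design are not public.

```python
# Toy illustration only: a tiny hand-rolled simulation in which an agent learns,
# via tabular Q-learning, whether to "stop" or "continue" when an attack is
# detected in different contexts. Everything here is an assumption.
import random
from collections import defaultdict

ACTIONS = ["stop", "continue"]
CONTEXTS = ["open_road", "busy_junction"]  # the "situation" part of the state

def simulate_step(context: str, action: str) -> float:
    """Hypothetical reward: stopping is safe on an open road, risky at a junction."""
    if context == "open_road":
        return 1.0 if action == "stop" else -0.5
    return 1.0 if action == "continue" else -1.0

q = defaultdict(float)      # Q-values keyed by (context, action)
alpha, epsilon = 0.1, 0.2   # learning rate and exploration rate

for episode in range(5000):  # many simulated episodes, far faster than real time
    context = random.choice(CONTEXTS)
    if random.random() < epsilon:
        action = random.choice(ACTIONS)            # explore
    else:
        action = max(ACTIONS, key=lambda a: q[(context, a)])  # exploit
    reward = simulate_step(context, action)
    # One-step Q-learning update (each episode is a single decision here)
    q[(context, action)] += alpha * (reward - q[(context, action)])

for context in CONTEXTS:
    best = max(ACTIONS, key=lambda a: q[(context, a)])
    print(f"{context}: learned response = {best}")
```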

Protecting against wider threats
So where do we go from here? We continue to work with Dstl on research and development of the system. We intend to explore how to better train multiple agents working in parallel, and how to maximise the robustness of the system so that it protects against a wider set of threats. Our aim is to deliver a demonstration in a representative context within three years.
“The client's ambitions matched ours, so this was a great opportunity to advance some of the latest ideas and AI techniques we’ve been focusing on.”
Richard Williams, Programme Manager

"It's wonderful to get the chance to disclose some aspects of the ground-breaking strides we’re making in next-generation cybersecurity technology."
Sam Pumphrey, Head of Digital Security

"Cybersecurity is so crucial for next-generation systems. It's thrilling to be able to push the boundaries of cutting-edge techniques in AI to achieve that next level of protection."
Madeline Cheah, Senior Consultant

“This is an exciting project that not only enables us to develop technology, but further deepens our expertise in AI application to cyber security, supporting Dstl’s aim to build UK capability.”
Mark Dorn, Associate Director

"This AI involves multiple agents acting in a dynamic environment, working together to respond to attacks. The real challenge is ensuring that the agents cooperate reliably in a meaningful way."
Peter Haubrick, Senior Machine Learning Engineer
