
Over the past three years, the UK Ministry of Defence (MOD) has recognised the transformative potential of Artificial Intelligence (AI) to enhance national defence. Documents like the Defence AI Strategy and Defence AI Playbook set a clear ambition to embed AI across both operational and support domains, from logistics and personnel management to frontline Intelligence, Surveillance and Reconnaissance (ISR), and Command and Control (C2). In these data-heavy environments, AI can help defence personnel cut through information overload and make faster, better decisions.
This strategic intent is translating into real-world applications. For instance, Thales has deployed Maritime Mine Countermeasures (MMCM) systems using Unmanned Surface Vessels (USVs) to detect and neutralise sea mines as part of the UK-France Mine Hunting Capability Programme. BAE Systems, in partnership with Cellula Robotics, developed Herne, the UK’s first Extra-Large Autonomous Underwater Vehicle (XLAUV), capable of executing missions without human input and delivered “from whiteboard to water” in 11 months.
These innovations show that cutting-edge AI capabilities are entering UK defence operations. However, this enthusiasm must be balanced with ethical and security considerations – chief among them the principle of Human-In-The-Loop (HITL): the idea that, regardless of an AI system’s sophistication, humans must remain ultimately accountable for its actions.
This blog explores how HITL affects AI adoption in defence and what it means for AI security.
What is Human-In-The-Loop?
HITL is a systems design principle that embeds human oversight into AI decision-making, especially in high-stakes environments. It ensures human operators can supervise, intervene, or override AI actions when necessary. This can mean shaping training data, supervising outputs, validating predictions, or making the final call in mission-critical contexts.
The logic is simple: accountability. If AI makes a wrong decision, someone must be responsible. Just as doctors interpret clinical data and drivers can override self-driving cars, defence scenarios require human judgment to remain in the loop. In a defence context, this might mean a system that identifies objects of interest but requires a human operator to confirm each insight before it is acted on.
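As a minimal sketch of the pattern, the Python below routes every AI detection through an operator for confirmation and logs each decision so it remains attributable. The model, operator console, and all names are placeholder assumptions for illustration, not any specific MOD system.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A single AI-generated detection awaiting human confirmation."""
    label: str
    confidence: float  # model confidence in [0, 1]


def model_detect(frame: list[float]) -> list[Detection]:
    """Placeholder for model inference on one sensor frame."""
    return [Detection(label="object of interest", confidence=0.72)]


def operator_confirms(detection: Detection) -> bool:
    """Placeholder for an operator console; the human makes the final call."""
    answer = input(f"Confirm '{detection.label}' "
                   f"(confidence {detection.confidence:.0%})? [y/n] ")
    return answer.strip().lower() == "y"


def process_frame(frame: list[float], audit_log: list) -> list[Detection]:
    """Every consequential output is confirmed by a human and logged."""
    confirmed = []
    for detection in model_detect(frame):
        approved = operator_confirms(detection)
        audit_log.append((detection, approved))  # keeps decisions attributable
        if approved:
            confirmed.append(detection)
    return confirmed


if __name__ == "__main__":
    log: list = []
    accepted = process_frame(frame=[0.0] * 16, audit_log=log)
    print(f"{len(accepted)} detection(s) confirmed; {len(log)} decision(s) logged")
```

The key design choice is that the model never triggers an action directly: its outputs only become decisions once a person has confirmed them, and the audit log preserves who approved what.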
This principle is backed by defence policy. The MOD’s Joint Service Publication (JSP) 936 lays out guidelines for AI development, aligning with the MOD’s AI Ethical Principles. These principles provide a regulatory backbone to ensure AI is developed responsibly, especially where failure could be catastrophic.
Many early applications of AI in defence focus on low-risk, back-office use cases where HITL is easier to apply. These are valuable proof points for safe AI deployment. But as AI evolves toward more agentic systems, ones that assess, decide, and act independently, the tension between innovation and oversight grows. HITL, while vital, may not scale with these advanced capabilities, because the speed at which human operators can respond becomes the limiting factor.
What Does This Mean for AI Security?
More sophisticated AI systems bring new capabilities but also introduce novel vulnerabilities, many of which stem from the tension between autonomy and oversight. This extends beyond traditional cyber security into areas like robustness and transparency.
Robustness
Defence AI often operates in harsh, unpredictable environments: at the edge, in ISR missions, or during last-mile resupply. These environments expose systems to manipulation, spoofing, and data degradation, and human oversight cannot always keep pace.
To ensure resilience, AI systems must be tested not just in controlled labs but under real-world conditions. Stress-testing with real-world data helps uncover vulnerabilities early and strengthens trust in performance under pressure. HITL can’t detect every edge-case failure; the system itself must be inherently robust.
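As a minimal sketch of what such stress-testing can look like, the Python below degrades a batch of inputs with increasing noise and dropped sensor readings and records how a model’s accuracy falls off. The toy model, data, and degradation levels are assumptions chosen for illustration, not a defence test harness.

```python
import numpy as np


def degrade(batch: np.ndarray, noise_std: float, dropout: float,
            rng: np.random.Generator) -> np.ndarray:
    """Simulate sensor noise and dropped readings on a batch of inputs."""
    noisy = batch + rng.normal(0.0, noise_std, batch.shape)
    noisy[rng.random(batch.shape) < dropout] = 0.0  # lost sensor values
    return noisy


def stress_test(predict, batch: np.ndarray, labels: np.ndarray) -> dict:
    """Measure the accuracy of 'predict' as conditions worsen."""
    rng = np.random.default_rng(0)
    conditions = [(0.0, 0.0), (0.1, 0.05), (0.3, 0.2), (0.5, 0.4)]
    return {
        (noise, drop): float(np.mean(predict(degrade(batch, noise, drop, rng)) == labels))
        for noise, drop in conditions
    }


if __name__ == "__main__":
    # Toy stand-in: a 'model' that thresholds the mean of each input row.
    rng = np.random.default_rng(1)
    data = rng.normal(0.0, 1.0, (200, 8))
    truth = (data.mean(axis=1) > 0).astype(int)

    def model(x: np.ndarray) -> np.ndarray:
        return (x.mean(axis=1) > 0).astype(int)

    for (noise, drop), accuracy in stress_test(model, data, truth).items():
        print(f"noise={noise}, dropout={drop}: accuracy={accuracy:.2f}")
```

In practice the degradation functions would be drawn from the operational environment (jamming, spoofed inputs, sensor loss), and the resulting curves would feed acceptance criteria rather than a simple printout.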
Auditability & Transparency
AI decisions in defence must be accountable and reviewable. This creates two core challenges:
- Exposing AI Models - Defence systems need transparency without compromising security. How do you explain an AI’s reasoning, training data, assumptions, and limitations without exposing secrets to adversaries? Tools like Google’s Model Card Toolkit are being explored to support transparency, but striking the right balance remains difficult (see the sketch after this list).
- True Visibility - Often, the explanations provided to users are AI-generated, meaning they’ve been filtered through machine interpretation. This machine-informed narrative can obscure the true rationale or hide flaws. Trusting AI-generated explanations without verifying their accuracy risks masking bias or faulty logic. Users must be aware of this risk and trained to question it.
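To make the trade-off concrete, the sketch below shows a minimal model-card-style record: a releasable description of a model’s intended use, training data, and limitations, kept separate from anything sensitive. It is a hand-rolled illustration with assumed field names and placeholder values, not the API of Google’s Model Card Toolkit or any MOD template.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ModelCard:
    """A minimal, releasable summary of what a model is and is not."""
    name: str
    intended_use: str
    training_data_summary: str  # a description, never the data itself
    known_limitations: list[str] = field(default_factory=list)
    evaluation_summary: dict[str, float] = field(default_factory=dict)


def export_releasable(card: ModelCard) -> str:
    """Serialise only the fields cleared for release; sensitive detail stays internal."""
    return json.dumps(asdict(card), indent=2)


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    card = ModelCard(
        name="isr-object-detector (illustrative)",
        intended_use="Flag objects of interest for operator confirmation",
        training_data_summary="Synthetic and openly releasable imagery only",
        known_limitations=["Untested in heavy sea clutter"],
        evaluation_summary={"precision": 0.91, "recall": 0.84},
    )
    print(export_releasable(card))
```

The point of the pattern is separation: the card records enough for users and auditors to understand and challenge the model, while the sensitive material that would aid an adversary never leaves the internal record.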
Why is Human-In-The-Loop Not Enough?
While HITL remains a necessary safeguard for AI adoption in defence today, it is increasingly clear that HITL alone is not sufficient to secure the future of AI in back-office and operational environments. As AI systems grow in complexity and new agentic capabilities emerge, the limitations of a purely human oversight model begin to surface.
These systems face real-world threats like data manipulation, misinformation, and unpredictable conditions. Oversight alone can’t catch what it can’t see, which is why AI in defence needs more than just a human safety net. It needs to be stress-tested, robust, and designed to adapt and recover in complex environments.
HITL works well in low-risk, slower-paced contexts. But as UK defence leans into more capable and autonomous systems, HITL must evolve. It should become just one part of a broader assurance approach, combining transparency, resilience, continuous testing, and intelligent system design.
To move forward, defence must build AI systems in which humans are not just “in the loop” to provide accountability, but help shape loops that are secure, self-aware, and capable of escalating decisions to a person only when it truly matters, keeping the use of AI within defence efficient and competitive.
Stuart Nelson, Innovation Associate at Plexal

