
Understanding adversarial attacks against Machine Learning and AI

A new NCSC paper explores how adversaries target machine‑learning and AI systems, outlining key attack techniques and their implications for security.

April 29, 2026

A new paper by the National Cyber Security Centre (NCSC) examines how adversaries target machine‑learning and AI systems, highlights key attack techniques and research gaps, and underscores the importance of cross‑sector collaboration through initiatives such as LASR.

Artificial Intelligence (AI) and Machine Learning (ML) systems offer significant advantages, yet also introduce considerable risks. The rapid development cycle, unique architectures, large model sizes, and prevalence of open-source components in ML systems create a significantly larger attack surface than traditional software, increasing opportunities for malicious actors to embed or exploit ML-specific vulnerabilities.

Designers, deployers, managers and operators of ML models need to understand ML-specific vulnerabilities and implement robust security measures that safeguard system integrity, confidentiality and performance, so that the benefits of AI can be realised.

1. Introducing adversarial machine learning (AML) attacks

This paper has been developed with ML security experts from the national security and defence communities. It outlines an evolving set of Adversarial ML (AML) attack classes which group attacks that exploit vulnerabilities inherent in the operation of ML models. Such attacks can lead to significant harm, including unintended changes to model functionality, or the extraction of sensitive information.
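To make this concrete, the sketch below (illustrative only and not drawn from the paper; the linear classifier and all values are hypothetical) shows an evasion-style attack, in which a small, deliberate perturbation to an input changes a model's prediction without exploiting any traditional software vulnerability.

    import numpy as np

    # Hypothetical linear classifier: predicts class 1 when w.x + b > 0.
    w = np.array([1.5, -2.0, 0.5])
    b = 0.1

    def predict(x):
        return int(np.dot(w, x) + b > 0)

    x = np.array([0.3, 0.1, 0.2])    # benign input, classified as 1
    epsilon = 0.4                    # attacker's per-feature perturbation budget

    # Fast-gradient-style step: nudge each feature against the sign of the
    # weight supporting the current prediction, keeping each change small.
    x_adv = x - epsilon * np.sign(w)

    print(predict(x), predict(x_adv))    # 1 then 0: the prediction flips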

In addition to their unique AML vulnerabilities, ML systems are equally susceptible to the broad spectrum of traditional cyber security threats. In this paper, we draw a distinction between these attack types and focus specifically on AML attack classes. In this way, this paper supplements existing taxonomies and frameworks, such as NIST's AML Taxonomy and MITRE ATLAS, which tend to include both ML-specific and wider cyber security attacks.

AML attacks are not unique to deep neural networks. They can occur at any stage of the model lifecycle – across development, training and deployment – and may target both hardware and software components. Attacks range from the simple to the sophisticated, with vectors spanning basic API queries through to direct unauthorised intrusion. A successful attack against a single component can cascade across the entire system, and the growing trust placed in AI/ML models makes them more attractive as entry points for propagating attacks throughout interconnected systems.
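As an illustration of the lower end of that spectrum, the sketch below (hypothetical throughout; the query function stands in for a deployed prediction endpoint and is not taken from the paper) shows how query access alone can support a model-extraction style attack, in which the attacker trains a surrogate that approximates the target model.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def query_victim_api(x):
        # Stand-in for the victim's prediction endpoint; in practice this
        # would be a network call to a model the attacker does not control.
        return int(np.sum(x) > 0)

    # 1. The attacker samples inputs and harvests the victim's labels.
    queries = np.random.uniform(-1, 1, size=(1000, 4))
    stolen_labels = np.array([query_victim_api(x) for x in queries])

    # 2. A surrogate trained on the stolen labels approximates the victim,
    #    enabling further offline attacks (evasion, membership inference, ...).
    surrogate = DecisionTreeClassifier().fit(queries, stolen_labels)
    print("surrogate agreement:", surrogate.score(queries, stolen_labels))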

1.1 Aims of this paper

The attack classes introduced here group similar adversarial ML attack techniques to:

  1. Raise awareness – for software developers, ML practitioners, cyber security specialists, security decision-makers and risk owners – of the many ways ML vulnerabilities may be exploited by AML attacks. This will help teams to better identify compromises and adopt security measures throughout the development lifecycle, as described in the UK government’s AI Cyber Security Code of Practice.
  2. Support threat modelling of ML systems by articulating an adversary-first approach that complements existing defensive taxonomies (see the annex for the mapping of attack classes to NIST's AML Taxonomy and MITRE ATLAS).
  3. Provide consistent language for discussing attacks across AI/ML architectures.
  4. Highlight research gaps in understanding and defending against AML attacks, and enable greater collaboration on ML security issues across government, industry and academic sectors. This includes efforts from initiatives such as the Laboratory for AI Security Research (LASR), in which the NCSC is a core partner.

We do not attempt to define defences for every attack class since appropriate mitigations depend heavily on context, and the defensive landscape is evolving rapidly. Defending against AML attacks is an active research area, and we encourage further research to better protect ML systems against this wide array of potential attacks.

1.2 ML attacks in a cyber security context

As stated earlier, ML systems are equally susceptible to the broad spectrum of traditional cyber security attacks a malicious actor may conduct to compromise the confidentiality, integrity or availability of their target. Such attacks may target the ML system’s wider IT infrastructure or its operational and physical environment, and these breaches may influence the behaviour of an ML model even where the model is not the intended target. For further guidance on traditional cyber security attacks, see the NCSC’s 10 Steps to Cyber Security, and for protecting ML systems against a variety of attacks, see the NCSC’s Principles for the Security of Machine Learning.

Only those attacks that target the operation of the ML model specifically are considered to be AML. These AML attacks may be used in isolation to affect the ML model itself, or alongside more traditional attacks to achieve a wider effect on a target ML system and any downstream connections.

An example of a traditional attack commonly associated with ML systems is the exploitation of Python pickle serialisation, where insecurely loading a crafted file can execute hidden code on the target system. While pickle is a commonly used file format in ML development, the attack does not target ML functionality itself and is therefore not considered an AML vulnerability.
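For illustration only, the following sketch (a simplified, hypothetical example rather than a real exploit, with a harmless echo command standing in for attacker code) shows why loading an untrusted pickle file is dangerous: an object's __reduce__ method can embed a callable that runs during deserialisation.

    import os
    import pickle

    # Hypothetical malicious class: __reduce__ tells pickle how to rebuild
    # the object, and here it returns a callable plus arguments, so merely
    # unpickling the data executes that callable.
    class MaliciousPayload:
        def __reduce__(self):
            return (os.system, ("echo 'arbitrary code executed on load'",))

    # The attacker serialises the payload into what looks like a model file...
    poisoned_bytes = pickle.dumps(MaliciousPayload())

    # ...and the victim triggers the command simply by loading it.
    pickle.loads(poisoned_bytes)  # runs os.system during deserialisation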

To read the full paper, please download the document or visit the NCSC’s website.

