The Promptware Kill Chain

Attacks against modern generative artificial intelligence (AI) large language models (LLMs) pose a real threat. Yet discussions around these attacks and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of techniques to embed instructions into inputs to LLM intended to perform malicious activity. This term suggests a simple, singular vulnerability. This framing obscures a more complex and dangerous reality. Attacks on LLM-based systems have evolved into a distinct class of malware execution mechanisms, which we term “promptware.” In a new paper, we, the authors, propose a structured seven-step “promptware kill chain” to provide policymakers and security practitioners with the necessary vocabulary and framework to address the escalating AI threat landscape.

In our model, the promptware kill chain begins with Initial Access. This is where the malicious payload enters the AI system. This can happen directly, where an attacker types a malicious prompt into the LLM application, or, far more insidiously, through “indirect prompt injection.” In the indirect attack, the adversary embeds malicious instructions in content that the LLM retrieves (obtains in inference time), such as a web page, an email, or a shared document. As LLMs become multimodal (capable of processing various input types beyond text), this vector expands even further; malicious instructions can now be hidden inside an image or audio file, waiting to be processed by a vision-language model.

The fundamental issue lies in the architecture of LLMs themselves. Unlike traditional computing systems that strictly separate executable code from user data, LLMs process all input—whether it is a system command, a user’s email, or a retrieved document—as a single, undifferentiated sequence of tokens. There is no architectural boundary to enforce a distinction between trusted instructions and untrusted data. Consequently, a malicious instruction embedded in a seemingly harmless document is processed with the same authority as a system command.

But prompt injection is only the Initial Access step in a sophisticated, multistage operation that mirrors traditional malware campaigns such as Stuxnet or NotPetya.

Once the malicious instructions are inside material incorporated into the AI’s learning, the attack transitions to Privilege Escalation, often referred to as “jailbreaking.” In this phase, the attacker circumvents the safety training and policy guardrails that vendors such as OpenAI or Google have built into their models. Through techniques analogous to social engineering—convincing the model to adopt a persona that ignores rules—to sophisticated adversarial suffixes in the prompt or data, the promptware tricks the model into performing actions it would normally refuse. This is akin to an attacker escalating from a standard user account to administrator privileges in a traditional cyberattack; it unlocks the full capability of the underlying model for malicious use.

Following privilege escalation comes Reconnaissance. Here, the attack manipulates the LLM to reveal information about its assets, connected services, and capabilities. This allows the attack to advance autonomously down the kill chain without alerting the victim. Unlike reconnaissance in classical malware, which is performed typically before the initial access, promptware reconnaissance occurs after the initial access and jailbreaking components have already succeeded. Its effectiveness relies entirely on the victim model’s ability to reason over its context, and inadvertently turns that reasoning to the attacker’s advantage.

Fourth: the Persistence phase. A transient attack that disappears after one interaction with the LLM application is a nuisance; a persistent one compromises the LLM application for good. Through a variety of mechanisms, promptware embeds itself into the long-term memory of an AI agent or poisons the databases the agent relies on. For instance, a worm could infect a user’s email archive so that every time the AI summarizes past emails, the malicious code is re-executed.

The Command-and-Control (C2) stage relies on the established persistence and dynamic fetching of commands by the LLM application in inference time from the internet. While not strictly required to advance the kill chain, this stage enables the promptware to evolve from a static threat with fixed goals and scheme determined at injection time into a controllable trojan whose behavior can be modified by an attacker.

The sixth stage, Lateral Movement, is where the attack spreads from the initial victim to other users, devices, or systems. In the rush to give AI agents access to our emails, calendars, and enterprise platforms, we create highways for malware propagation. In a “self-replicating” attack, an infected email assistant is tricked into forwarding the malicious payload to all contacts, spreading the infection like a computer virus. In other cases, an attack might pivot from a calendar invite to controlling smart home devices or exfiltrating data from a connected web browser. The interconnectedness that makes these agents useful is precisely what makes them vulnerable to a cascading failure.

Finally, the kill chain concludes with Actions on Objective. The goal of promptware is not just to make a chatbot say something offensive; it is often to achieve tangible malicious outcomes through data exfiltration, financial fraud, or even physical world impact. There are examples of AI agents being manipulated into selling cars for a single dollar or transferring cryptocurrency to an attacker’s wallet. Most alarmingly, agents with coding capabilities can be tricked into executing arbitrary code, granting the attacker total control over the AI’s underlying system. The outcome of this stage determines the type of malware executed by promptware, including infostealer, spyware, and cryptostealer, among others.

The kill chain was already demonstrated. For example, in the research “Invitation Is All You Need,” attackers achieved initial access by embedding a malicious prompt in the title of a Google Calendar invitation. The prompt then leveraged an advanced technique known as delayed tool invocation to coerce the LLM into executing the injected instructions. Because the prompt was embedded in a Google Calendar artifact, it persisted in the long-term memory of the user’s workspace. Lateral movement occurred when the prompt instructed the Google Assistant to launch the Zoom application, and the final objective involved covertly livestreaming video of the unsuspecting user who had merely asked about their upcoming meetings. C2 and reconnaissance weren’t demonstrated in this attack.

Similarly, the “Here Comes the AI Worm” research demonstrated another end-to-end realization of the kill chain. In this case, initial access was achieved via a prompt injected into an email sent to the victim. The prompt employed a role-playing technique to compel the LLM to follow the attacker’s instructions. Since the prompt was embedded in an email, it likewise persisted in the long-term memory of the user’s workspace. The injected prompt instructed the LLM to replicate itself and exfiltrate sensitive user data, leading to off-device lateral movement when the email assistant was later asked to draft new emails. These emails, containing sensitive information, were subsequently sent by the user to additional recipients, resulting in the infection of new clients and a sublinear propagation of the attack. C2 and reconnaissance weren’t demonstrated in this attack.

The promptware kill chain gives us a framework for understanding these and similar attacks; the paper characterizes dozens of them. Prompt injection isn’t something we can fix in current LLM technology. Instead, we need an in-depth defensive strategy that assumes initial access will occur and focuses on breaking the chain at subsequent steps, including by limiting privilege escalation, constraining reconnaissance, preventing persistence, disrupting C2, and restricting the actions an agent is permitted to take. By understanding promptware as a complex, multistage malware campaign, we can shift from reactive patching to systematic risk management, securing the critical systems we are so eager to build.

This essay was written with Oleg Brodt, Elad Feldman and Ben Nassi, and originally appeared in Lawfare.

Tags: AI, LLM, malware

Posted on February 16, 2026 at 7:04 AM • 11 Comments

Comments

Clive Robinson • February 16, 2026 9:14 AM

@ Bruce, ALL,

There is another model that starts with “prompt injection” but does not require a “Jail break” or “RCE” it simply gets a list of “tools” the “Agent” can access and uses those just as an Agent would to access local documents, files, sensors etc.

In effect the attacker builds their own agent and runs that.

As for “Retrieval-Augmented Generation”(RAG), it was designed to get around two issues,

1, The massive resources of the ML needed to build the weights used in the LLM DNN.
2, To access “current or local data” to get around the static nature of the DNN in the LLM that gets around the “memory issue” and the resulting fact LLM DNNs do not learn anything. That is what the ML does.

The problem the “working nemory” of the RAG is outside of the LLM thus you get two very real issues.

1, The access to this “working memory” is in effect the square of the size of the data held times a constant. Which makes it clearly slow.
2, The security of the information in the “working memory” is worse than a lot of “pre Chrome Web Browsers” for all the same reasons.

Now it’s not much talked about yet but these two issues opens up all sorts of “side channel leaks” so keeping data confidential is at best difficult…

Thus just using an agent with in effect is a security hole “built in”…

Similar reasoning applies to any framework you build agents from or into. Which means the use of the “Ralph Loop” and “Gas Station” are likewise massive generators of “side channel leaks” and there is little or nothing you can do to stop them hemorrhaging confidential nature just by the way they work.

Clive Robinson • February 16, 2026 9:39 AM

With regards,

“… the attacker circumvents the safety training and policy guardrails that vendors such as OpenAI or Google have built into their models. Through techniques analogous to social engineering”

There is another way to do this and in part it’s been used to prove that there will always be a way for an attacker to

“circumvents the safety training and policy guardrails”.

Put simply you use a level of cryptography to “hide the prompt”.

If you know that there is a source of information that is stable in an LLM then you can use it as an encryption key for a simple stream cipher, which you use to “hide the prompt” from the safety training and policy guardrails”.

I’ve detailed a number of times in the past that the use of a secure stream cipher and a plaintext code book, that allows “Deniable Plaintext Cryptography”.

So not only can any data be injected into the LLM but also hidden as plaintext in the output. So invisible to any observer.

As such this research is a direct consequence of the work of Claude Shannon during WWII, but also Gus Simmons in the 1980’s.

All that’s required is sufficient “redundancy bandwidth” in the information channel to make a covert side channel through which information can be exported.

Due to the “observer problem” there is nothing an information observer xan do to stop information being covertly exported…

bye bye ai • February 16, 2026 11:36 AM

I agree with Clive. I want to highlight a point he only touched upon which is the tradeoff between energy usage and speed. The problem with the solution the authors propose is that the more complex the system becomes…all other things being equal..the more costly it is both in computing power and ultimately in electricity. This then changes the financial framework in which the system is deployed both for the end user and the economy at large.

At what point in time does the attack/defensive cat and mouse game become the functioning equivalent of a computational and therefore economic fork bomb? For the reasons the authors articulated this isn’t a genuine concern in a traditional computing system but it may become a limiting factor for token based AI systems.

mark • February 16, 2026 1:05 PM

There’s… no structural difference between user commands and system commands?!

This, IMO, isn’t even alpha software, nor is it architected. It’s more like first programming class spaghetti…

Rontea • February 16, 2026 2:32 PM

In promptware attacks, reconnaissance focuses on probing the model—testing behavior, identifying guardrails, and mapping response limits—rather than scanning networks or collecting system metadata. Defenders should watch for subtle, iterative queries that reveal attackers are studying the AI, unlike the obvious network scans of traditional intrusions.

lurker • February 16, 2026 4:02 PM

@mark, ALL

There is then an underlying social problem that we see in a lot of other places these days: nobody will tell the emperor that he is wrong.

The AI boffins will say that these systems are open, evolving, learning, whatever. But yes, in that case they should be kept in steel cages, behind high walls …

Clive Robinson • February 16, 2026 6:38 PM

@ bye bye ai, ALL,

Whilst the Current AI LLM and AI systems are mostly junk and can not perform in a sensible way it’s not just “power, water and economics” that are a significant issue…

It’s hard to get data –for obvious reasons– but the number of GPU’s alone that burn out in one training run has been put by some as as high as 50,000… Yup have a think on that it’s why doing a full build on a DNN for a LLM is now almost prohibitive… So the trick is to augment existing models with new training data. But that has a problem in of it’s self that almost guarantees “general AI” is not going to be a thing in the near future. Basically it’s due to the amount of useless slop already in existing DNN models that you can not effectively subtract out as it’s diffused across the DNN weights…

But as you note with,

“At what point in time does the attack/defensive cat and mouse game become the functioning equivalent of a computational and therefore economic fork bomb?”

I think we are already beyond the tipping point on the economics and the result will be a major shock to not just the US but other bations where the Politicians have been conned into FOMO. The economics as Ed Zitron and several others,

“I fear tens of thousands of people will lose their jobs, and much of the tech industry will suffer as they realize that the only thing that can grow forever is cancer.”

https://www.wheresyoured.at/subprimeai/

Also Gary Marcus,

Breaking: OpenAI is probably toast

“Will OpenAI someday be seen as the WeWork of AI, as I have suggested several times, as far as back as late 2023? I still think so, and I think that moment is drawing close. They have, by any reasonable standard, seen rough times of late. Google and Anthropic have each largely caught up; various Chinese companies are closing in. More and more questions are being raised about their financing. (And more and more people are agreeing that AGI won’t arrive this decade).”

https://garymarcus.substack.com/p/breaking-openai-is-probably-toast

That is the economics “makes no sense” for “general AI” in any rational world. Nor sadly the US economy, and potentially the World economy, hence the “subprime” term being banded about. That is in a reasonable projection there is in effect,

“No ROI to be had at any price at any time…”

Because within a very few years maybe as few as one or two, the interest debt alone will not be met by any earnings and sensible investors will have “deserted the sinking ship, before they to get draged down”… By which time the “mug punters” will be dining at best on “fried tulip bulbs”[1].

That’s not to say that some of the current LLM and ML systems will not pay dividends, in niche areas that are very far from “general” and have clearly defined and highly specific function they will “pay off” but at what cost for the shortening in time you potentially gain?

We hear a lot currently about an “arms race” between computer security attackers and defenders using Current AI LLM Systems…

The thing is “arms races” never actually go very far, due to the exponential rise in costs against performance. In the defence industry we rarely get beyond CCCM because costs are more than a thousand fold for the next very fractional increment at any given technology level improvment.

Thus to sustain any kind of arms race one or both of the following has to happen,

1, Technology improvement at some significant power law versus cost (it happens but not often).

2, Other events force the “arms race” from being “general to highly specific” which tends to stretch out in time the over all exponential / cost rise curve.

Whilst both events are not “Black Swans on the horizon”, they are rarer than real “giant squid rings”.

[1] When I say “fried tulip bulbs” people think it’s metaphorical, it’s not. You can still find Dutch recipes for that and similar dishes.

https://cookingupdate.com/how-to-cook-tulip-bulbs/

Just remember that like “casava root” not all tulip bulbs can be safely eaten.

bye bye ai • February 16, 2026 9:36 PM

Because within a very few years maybe as few as one or two, the interest debt alone will not be met by any earnings and sensible investors will have “deserted the sinking ship, before they to get draged down”…

This seems to me to be taking too narrow a view of the matter. This article claims that there will be 700 billion dollars of investment in AI in 2026.

https://www.fool.com/investing/2026/02/16/big-tech-will-spend-700-billion-on-ai-top-stock/

700 hundred billion dollars is a humongous amount of textbooks for kindergarten students. So to me it’s the social misallocation of resources that I fear. That’s how a fork bomb works: by replication upon replication until all the resources in the system are consumed and the system then dies. I worry that is what AI is doing to American society. It’s more than the loss to investors; it is the opportunity cost to society. We are wasting huge sums of money arguing whether AI is conscious.

https://tech.yahoo.com/ai/claude/articles/anthropic-ceo-says-company-no-114500959.html

AI has gone beyond technology. It’s now flirting with being a religious cult. And I fear in that environment “No ROI to be had at any price at any time…” is besides the point. What has been the ROI on the Moonies? On Feng Shui?

Clive Robinson • February 17, 2026 4:09 AM

@ bye bye ai,

With regards that $700billion on “Data Centers” that is already falling apart and really not going to happen as they are a realy dumb idea and at best “a hole in the water to pour in ‘mug money'”…

With regards the Motley Fool recommendation I’ve talked about TSMC in the past and more importantly

I’ve mentioned even further back in the past “buying the stock of the supplier to the supplier” is in general a good idea for various reasons not least because they are generally free of “leveraged debt” that the buyer or first line supply may be holding. And many people don’t think that far.

That said you could actually look at TSMC activities and you might scratch your chin and think.

They’ve signed up to building Factories in the US which is actually a very very bad idea for a whole load of reasons. But the main one is to get Trump off of their backs.

So it’s going to be a “go slow” investment in the US if they have any sense that will probably pay nothing in the future… And even if they were serious a new FAB plant in a new country is going to take either significant cut in profits over a decade or so and / or significant debt so I’d think very carefully about if you really want to get into that type of tied in finance hole.

Worse think TicTok we know Trump will just steal it on “National Security” Grounds to keep Silicon Valley Megga Corps on side,

https://www.forbes.com/sites/forbeswealthteam/2025/09/25/the-web-of-billionaire-pals-partners-and-trump-supporters-taking-control-of-tiktok-us/

The other thing as I’ve mentioned before is that Trump or who ever follows him will use it as an excuse to nolonger keep Taiwan defended from China, as it will make the US based part many many times more valuable… So either way there is a high probability TSMC in Taiwan will be either stolen or toasted or both.

So all you would be buying into is a highly risky bet other investors are not doing suitable “due diligence” on. Just as they did not with Nvidia… which is about to fall into major difficulties and will certainly drop a long long way down on that overly high $5Trillion valuation.

As I advised in the past they were a good bet up untill they got to the $1Trillion valuation which is actually high for their real worth which is why I suggested people get out just before Nvidia got there.

Nvidia got up to 5Trillion by using financial trickery which I’ve covered a few times… It’s not sustainable as there is nothing actually there and it’s all kind of based on OpenAI, Meta, xAI and Oracle none of which look good for any money just paper to pass around and hype the stock values. This is known in the UK as a “Mug bait market” because those that do their due diligence know it’s at best an illusion to sucker the little guys in so the big guys can get out with their investment at or near top rate money… (Some are calling it “rug pull tactics” but…).

That said all the high end Chip manufacturers are very very dependent on just one Dutch company, to see who they are and why Veritasium did a fairly good video on them just recently,

https://m.youtube.com/watch?v=MiUHjLxm3V0

You might be better advised to look there rather than TSMC or Intel etc etc.

But hey I’m not nore have I ever been a financial advisor which I suspect is the same for you.

As for Motley Fool well, get yourself a long barge pole… Because it is not either, it pretends to be “an educational website that teaches the basics of finances”… but in reality is just another “subscription-based business model” to strip mugs of their money.

And no they certainly are not worth what they “supposedly” once were reputation wise, before the Dot Com bubble (and honestly even long before that as they were forced to admit). So if you follow their advice well… don’t be surprised when you don’t get what you are subscribing your hard earned bucks for.

Anonymous • February 17, 2026 2:35 PM

“of AI agents being manipulated” leads to the same link as the next one.

Clive Robinson • February 17, 2026 3:12 PM

With regards my comment of

blockquote>“I’ve mentioned even further back in the past “buying the stock of the supplier to the supplier” is in general a good idea for various reasons not least because they are generally free of “leveraged debt” that the buyer or first line supply may be holding. And many people don’t think that far.”

<

blockquote>

Don’t take it to far… as some may have done,

Flush with potential? Activist investor insists Japanese toilet giant is an AI sleeper

“The AI hype cycle has officially reached the toilet, with a Japanese bathroom giant suddenly being pitched as a serious tech play.

Activist investor Palliser Capital has taken aim at Japanese loo legend Toto, urging the firm to make more noise about its little-known chip components business and arguing the maker of the “Washlet” – described by the company as “the original shower toilet” – is quietly sitting on a flush of value tied to the AI boom.“

https://www.theregister.com/2026/02/17/palliser_capital_toto/

The opportunity for “bum jokes” is oh so high as some of the comments already show.

However I suspect that all jokes etc aside their might actually be a sound business case to be made, but opportunity…

The Promptware Kill Chain

Comments

Leave a comment Cancel reply