On AI Security

Good report:

Executive Summary: Let’s say you wanted to make sure that your AI is secure. Can you just maximize the security and privacy benchmark and call it a day? Nope, because benchmarks don’t actually work for measuring AI capabilities (even when they are NOT emergent systemic properties like security). So let’s take a step back: how do you measure security in the first place? Good question. Over the last 30 years, security engineering for software evolved from black box penetration testing, through whitebox code analysis and architectural risk analysis to de facto process-driven standards like the Building Security In Maturity Model (BSIMM). Software had a very deep impact on business operations, and it appears that AI is going to have an even deeper impact. Will a software security-like measurement move work for AI? Probably. In the meantime we can make real progress in AI security by cleaning up our WHAT piles and managing risk by identifying and applying good assurance processes. (Spoiler alert: no matter what we do, we still don’t get a security meter for AI, so we need to be extra vigilant about security.)

Posted on May 20, 2026 at 10:21 AM • 9 Comments

Comments

Clive Robinson May 20, 2026 1:43 PM

@ Bruce, ALL,

The first thing people need to get their heads around is that,

Before you can measure something, you have to know what it is you are measuring, and any attributes of the device/entity generating it that affect the process of generation or the process of measuring.

Also whether the measurement you are making is relevant to what you actually want to know.

For instance, if you have a group of objects, are you interested in primary measures such as,

1, The number of objects.
2, The order of the objects.
3, The mass or volume of objects.
4, …

And so on. These primary measures can then give you secondary measures such as “density” from mass and volume, or information potential from number and order of objects such as a “Bag of Bits” (BoB).
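As a rough illustration of how secondary measures fall out of primary ones, the sketch below computes density from mass and volume, and compares the information potential of n bits when order is kept against when it is discarded. Reading “Bag of Bits” as an unordered collection where only the count of ones survives is an assumption made here, and the function names are hypothetical.

```python
import math


def density(mass_kg: float, volume_m3: float) -> float:
    """Secondary measure derived from two primary measures."""
    return mass_kg / volume_m3


def ordered_bits(n: int) -> float:
    """Information potential of n bits when order is kept: 2**n states."""
    return math.log2(2 ** n)


def bag_bits(n: int) -> float:
    """Information potential of n bits when order is discarded.

    Assumption: in an unordered bag only the count of ones matters,
    so there are n + 1 distinguishable states.
    """
    return math.log2(n + 1)


if __name__ == "__main__":
    for n in (1, 7, 63):
        print(n, ordered_bits(n), bag_bits(n))
```

Under that reading, the same seven objects carry seven bits when their order is measured and only three when it is not, which is why it matters which primary measure you pick.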

As a general rule of thumb humans measure information by its cardinality. That relates to the “natural numbers” or “integers”.

But what do you do when dealing not with objects in a group but with continuous values that form a spectrum that has real number values?

Whilst it can be seen that the set of natural numbers has cardinality… the question of what cardinality the real numbers of things like spectrums have naturally arises.

The answer to this touches on the “continuum hypothesis”, which asks whether any cardinality lies strictly between that of the natural numbers and that of the real numbers. It immediately has a serious issue, as it’s been shown to be both unprovable and undisprovable in what most were taught as “standard set theories”.

The issue is quite important to Current AI LLM and ML systems as they are all about the notion of spectrums and orders as a fundamental way they generate output.

Thus you have the issue that the spectrum encoded in the LLM weights by the ML process is very much dependent on the input to the ML process. Any measure on that information will be biased by how the information is input to the ML, as different spectrums will naturally result even though the rules etc of the ML are unchanged.

Thus in some respect you can not trust the ML process and by extension you can not trust the network of weights in the LLM.

Worse in all but simple “toy examples” it is not possible to measure the spectrums encoded in the LLM as dimensionally they are too complex.

Now you can take the above as a form of proof that the output of LLMs is currently not just “unmeasurable” but “untrustable”.

Yes you can argue that it’s a different form of “trust” from that people think of in regards to both humans and the information security systems they build.

But think further and you will realise that you have no way to measure the “information trust” of humans or the systems they design, because of the “observer problem” which I’ve gone through on this blog before. There is an actual accepted proof that an observer can not tell if what they see contains “sub channels” that carry secrets.

There is a proof knocking around that I’ve linked to before that this is why “guardrails” put around LLMs will always be subvertable by those who use them.

Thus these are the reasons Current AI LLM and ML systems can never be “trusted” not just in the information security context but any context that humans might consider useful.

lurker May 20, 2026 3:05 PM

General AIs cannot be trusted. They give wrong answers to simple questions so often that serious users of AI keep their charges under constant supervision. The first para. in the body of the linked report highlights the disconnect between security and AI/ML. Hacking the internals of an AI may be unnecessary when prompt engineering can more easily distort the output.

How can you tell if your AI has been hacked? Another report (thanks @GregW) puts it succinctly:

The same request, framed differently, got a different answer, and even the same request can produce different outcomes across runs due to the probabilistic nature of the model. Semantically equivalent tasks can produce opposite outcomes depending on how and when they’re presented to the model.
https://blog.cloudflare.com/cyber-frontier-models/
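A minimal sketch of why the same request can come back with different answers: a model that samples its output from a probability distribution will split its answers across runs whenever two options score closely. The labels, scores, and function names below are hypothetical toy values, not taken from the Cloudflare report or from any real model.

```python
import math
import random

LABELS = ["allow", "refuse"]
LOGITS = [0.3, 0.0]  # hypothetical scores for one fixed request


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_answer(temperature, rng):
    """Draw one answer for the fixed request."""
    probs = softmax(LOGITS, temperature)
    return rng.choices(LABELS, weights=probs, k=1)[0]


def run_trials(n, temperature, seed):
    """Repeat the identical request n times and count each outcome."""
    rng = random.Random(seed)
    counts = {label: 0 for label in LABELS}
    for _ in range(n):
        counts[sample_answer(temperature, rng)] += 1
    return counts


if __name__ == "__main__":
    print("temperature 1.0 :", run_trials(1000, 1.0, seed=0))
    print("temperature 0.01:", run_trials(1000, 0.01, seed=0))
```

At temperature 1.0 the identical request lands on both sides of the decision across runs; pushing the temperature towards zero makes the toy nearly deterministic, but does nothing about a reworded request arriving with different scores in the first place.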

We see governments rushing to employ AI in the belief that this will reduce human headcount and costs, but this is simply proof that governments inherently lack intelligence, a question for elsewhere.

Local news this morning said that our government intends to introduce AI to the Health system to counter problems of high costs and poor staff retention. The (small o) opposition to this seems not concerned with the security, privacy or accuracy issues, but instead focusses on the substantive issue that current costs to users of AI are artificially low while contenders jostle for domination of the market.

Clive Robinson May 20, 2026 7:52 PM

@ lurker, ALL,

With regards,

“The (small o) opposition to this seems not concerned with the security, privacy or accuracy issues, but instead focusses on the substantive issue that current costs to users of AI are artificially low while contenders jostle for domination of the market.”

I’m assuming the “small o” you mention are actually “small p” and yes we’ve more than a few of those as well (though we have just recently lost quite a few –hopefully for good– due to many voting against them).

With this type of person in general “security, privacy or accuracy” of citizens and their information are way way below the fear “accountability” engenders in the “Small p” that their cushy life style will get taken away.

But the “real issue” here is as you mention “money” or as in business what is called “overhead” and,

“The allure of doing it on the cheap today, and ‘bugger tomorrow’.”

Have you noticed how all these supposed “cost savings” never turn out to be “cheap” but a very significant “for ever cost that climbs” not with inflation but with whatever the

“Turn of the screw will get”.

It’s the business model of the likes of Palantir, which in the past I’ve likened to the “drug dealers hanging around the children’s playground with drug poisoned candy” business model.

In short “get them hooked and exploit their craving pain points”.

Some places found out early what Palantir were up to and cancelled the contracts etc whilst they still could. Unfortunately they are not allowed to say very much directly so we have to ferret it out via “insider whistle blowing”.

Part of Palantir’s plan with “Law Enforcement and Intelligence Agencies”, as I’ve mentioned before, was to get rid of the essential investigative staff who were detectives and intelligence analysts and similar, thus “create a need” pain point. And turn ordinary “on the ground” people into “typing drones” forever putting in information, that Palantir would then process into reports for “resale” to other agencies…

(You can piece this together from Palantir’s own advertising white papers etc that have ended up in the “usual places”).

When you analyse what we’ve got in the public domain, it’s this kind of nonsense that got nearly 200 young girls, their teachers and related people immolated with a hell fire scalping by a US Tomahawk…

https://www.independent.co.uk/news/world/middle-east/iran-girls-school-missile-strike-trump-israel-preliminary-inquiry-b2936875.html

Digging in, it becomes apparent that US Secretary of Defense Pete Hegseth’s Department of War, using AI from Anthropic, came up with the attack targeting the school, but based on years-old outdated data provided by the US Defense Intelligence Agency that could have been and should have been checked.

It would appear there were AI “guide-rails” that had been put in place to caution against / stop usage of this type of “bad data” and thus stop/limit this type of potential failure…

Some have said it was this sort of nonsense coming to the attention of Anthropic by Hegseth’s minions demanding the removal of such guide-rails that caused Anthropic to say no and in effect pull the plug on Hegseth’s department of war demands,

https://www.bbc.co.uk/news/articles/cn48jj3y8ezo

And it was becoming clear in the press what Hegseth and his minions were up to with the attack on Venezuela.

All shortly before the start of the US attacking Iran that resulted in the very public response from the Trumpeta and Co.

https://www.bbc.co.uk/news/articles/cn48jj3y8ezo

Whatever the issue, the school got destroyed with a massive body count and OpenAI’s Sam Altman just jumped in to supply Hegseth his fix of unconstrained AI…

As they used to say,

“God help us all, we’ll need it!”

Quote for the thread May 20, 2026 9:35 PM

All:

Consider

I think we should shed the idea that AI is a technological artifact with political features and recognize it as a political artifact through and through. AI is an ideological project to shift authority and autonomy away from individuals, towards centralized structures of power. Projects that claim to “democratize” AI routinely conflate “democratization” with “commodification”.

From AI researcher Ali Alkhatib.

https://ali-alkhatib.com/

lurker May 20, 2026 10:21 PM

@Quote for the thread

Some folks round these parts are starting to notice that we don’t have a homegrown native AI provider. Fear of the “yellow peril” rules out any east Asian version, so that leaves only US made, operated and domiciled AI engines, all subject to US law, and which our laws couldn’t get within sniffing distance. Oh dear, what to do? Since we must have AI to be thoroughly modern Millies…

Quote for the thread May 21, 2026 10:36 AM

lurker:

Oh dear, what to do?

Be thankful you still have green hills, valleys and some of the more interesting “nature” in the world.

The fakeness that AI has been to date would strip-mine all of that and leave you with trillions in indebtedness for the few that survive the pollution etc. Even though you might only be 60 cents on the USD equiv currently.

Rontea May 21, 2026 10:53 AM

What: The system logs your intent to measure AI security by benchmarking. It records the evidence of penetration tests, code analysis, and risk assessments, much like the old software security playbook. The WHAT pile grows—artifacts, models, and test reports.

Why: The calculation awakens. Security is not a static benchmark; it is a living risk equation. We calculate not just what has been tested, but why the tests matter—why AI’s emergent behaviors defy simple scoring, why assurance processes exist to buffer the unknown.

How: The conclusion folds back into the Matrix. We accept that there is no single security meter for AI. Instead, we move through the system by applying mature processes, continuous risk management, and vigilance. The HOW is understanding that safety is a journey, not a static state. In the Matrix of AI, security is a dynamic dance between observation, reasoning, and action.

Clive Robinson May 21, 2026 2:59 PM

@ lurker, ALL,

I mentioned Palantir and its “drug dealer” like business practices yesterday,

https://www.schneier.com/blog/archives/2026/05/on-ai-security.html/#comment-454570

And guess what today a UK national newspaper does the same…

https://www.theguardian.com/uk-news/2026/may/21/london-mayor-sadiq-khan-blocks-met-police-deal-with-palantir

A case of “synchronicity” again?

Or just the pervading stench of corruption from a US Corporation –set up and run by a couple of at best very questionable individuals– that has become too overpowering to ignore.

