Anthropic’s Project Glasswing Update

In April, Anthropic initiated Project Glasswing. The idea was to let companies use Anthropic’s new model, Mythos, to find and fix vulnerabilities in their own software. It was a fantastic PR move, and so many press outlets have uncritically parroted Anthropic’s claims that it’s now common wisdom that Mythos is better at finding software vulnerabilities than other models. Which is just not true.

In any case, Anthropic has published a Project Glasswing status report. It’s finding a lot of vulnerabilities in software—yay! Some of them are even dangerous. But almost none of them has been patched. It’s weird. There’s something fishy about the data that I don’t understand. That Anthropic refuses to release details—that it just says “trust us”—is a big problem here.

Posted on June 8, 2026 at 7:01 AM • 7 Comments

Comments

first post kitty June 8, 2026 7:08 AM

first post kitty flies in at #1 in style!
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡀⠀⠀⠀⠀
⠀⠀⠀⠀⢀⡴⣆⠀⠀⠀⠀⠀⣠⡀⠀⠀⠀⠀⠀⠀⣼⣿⡗⠀⠀⠀⠀
⠀⠀⠀⣠⠟⠀⠘⠷⠶⠶⠶⠾⠉⢳⡄⠀⠀⠀⠀⠀⣧⣿⠀⠀⠀⠀⠀
⠀⠀⣰⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢻⣤⣤⣤⣤⣤⣿⢿⣄⠀⠀⠀⠀
⠀⠀⡇⠀⢀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣧⠀⠀⠀⠀⠀⠀⠙⣷⡴⠶⣦
⠀⠀⢱⡀⠀⠉⠉⠀⠀⠀⠀⠛⠃⠀⢠⡟⠂⠀⠀⢀⣀⣠⣤⠿⠞⠛⠋
⣠⠾⠋⠙⣶⣤⣤⣤⣤⣤⣀⣠⣤⣾⣿⠴⠶⠚⠋⠉⠁⠀⠀⠀⠀⠀⠀
⠛⠒⠛⠉⠉⠀⠀⠀⣴⠟⣣⡴⠛⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠛⠛⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀

KC June 8, 2026 9:07 AM

Davi, on his blog, appears to question why a “patch isn’t generated and attached to the disclosure in the first place… It looks like pressure to pay for protection.”

However, unlike Opus 4.7, I don’t think Mythos has patching functionality?

To add, in the context of open-source software, are there risks to “fast patching”?

Michael MacCartney June 8, 2026 9:09 AM

I also find the timing of Project Glasswing and all the hype interestingly coincidental with their IPO.

Rontea June 8, 2026 9:48 AM

Ah, the modern man worships his automaton and forgets his own hands! To find a defect without the cure is not intelligence, it is vaudeville. Mythos cannot ‘discover’ a bug without already holding its remedy in the shadow of its circuits. For the wound is known only in contrast to the healthy flesh; the bug is seen only because the model already dreams of the patch. And yet, like a bureaucrat of silicon, it lists and lists and lists, leaving the labor of healing to men who might have been dining with their families. 23,019 whispers, 75 patches: what a hymn to our era of mechanical vanity. Knowledge without repair is but a confession of impotence dressed as progress.

ricky222 June 8, 2026 9:55 AM

I suspect the reason these identified security problems are not fixed is that the maintainers are overwhelmed by the sheer volume of them, let alone the naive PRs generated by well-meaning project participants.

I’m not sure how to fix the overload problem. However, I am relatively certain that it’s not something one can automate, at least not with the current generation of LLMs.

Clive Robinson June 8, 2026 10:45 AM

@ Bruce, ALL,

With regards,

“… it’s now common wisdom that Mythos is better at finding software vulnerabilities than other models. Which is just not true.”

From day one it was known that Mythos was,

“Quantity over quality”

All but a very tiny fraction of what Mythos found was already known to the developers and, in effect, of no consequence. Which is why nothing had been done about them… this is quite common not just in FOSS but in all software development in the ICT industry that is not of trivial size.

And when seen against the cost not of fixing but of actually testing, it is why we have a veritable tsunami of “technical debt”. It is what triage essentially does.

Which is why, we’ve seen your following notes,

“1, It’s finding a lot of vulnerabilities in software—yay!

2, Some of them are even dangerous.

3, But almost none of them has been patched.

4, It’s weird.

5, There’s something fishy about the data that I don’t understand.”

With regard to the first point, I discussed why this would be so before Mythos/Glasswing got trumpeted.

That is, current AI LLM and ML systems can only find “Known, Knowns” of instance and class of vulnerability that the ML found in the training set by statistical means.

Further, the “stochastic” aspect of an LLM just adds what is in effect “a little fuzzing” so that “Unknown, Knowns” of instance and class can be found. Whilst some AI LLMs can chain these together into what looks like a serious attack, they mostly are not.

Think of an LLM working on a “Vector distance” basis. By simple logic it can be seen that in effect all the vulnerabilities found should already have been “found, fixed, and finished”. But they are not, because management decided during triage that the resources required would be better deployed addressing “new problematic” rather than “old nonproblematic” issues.

Similar was seen and said with regard to your other points before the Mythos “big wonder” hype came to journalists’ attention.

TexasDex June 8, 2026 12:22 PM

I wonder if they’re having trouble getting through to all the FOSS security teams that are already inundated with AI-driven vulnerability reports.


Sidebar photo of Bruce Schneier by Joe MacInnis.