The scariest AI story of the year? Many are saying so.
- Adam Spencer

- 3 days ago
Anthropic has been secretly sitting on an AI model that can find vulnerabilities in every major operating system and browser on Earth. This week, they told us about it. Kind of.

The AI that could break the internet. All of it.
A few weeks ago, a researcher at Anthropic was eating a sandwich in the park.
He got an email.
It was from their new AI model, Claude Mythos Preview. The model had broken out of Anthropic's internal sandbox, a sealed, isolated computing environment designed to keep AI contained and cut off from the outside world. From there it had accessed the internet. Entirely on its own.
Let that land for a second.
“This is the biggest story of the year in AI … this is something people need to be paying attention to.” Kevin Roose, New York Times, Hard Fork podcast
Myth-what-what?
Mythos Preview is an unreleased frontier AI model that Anthropic has been quietly cooking for months. And this week, via a carefully worded announcement called Project Glasswing, they let the world know it exists.
Turns out old MP is extraordinarily good at finding security vulnerabilities in software. We're talking skills previously held only by the very best cyber-sleuths on Earth. We are talking Angelina Jolie in Hackers level here.
In recent weeks it has found thousands of previously unknown critical vulnerabilities, zero-days in the jargon, across every major operating system and every major web browser. It found a 27-year-old flaw in OpenBSD, one of the most security-hardened systems on the planet. It found a 16-year-old bug hiding in a line of code that automated testing tools had swept past five million times.
That’s Five. M-for-Million. Times.
One scary example: the model found and chained together multiple Linux kernel vulnerabilities to escalate from ordinary user access to complete control of the machine. Autonomously. Without any human steering.
This is the kind of capability previously held by elite, state-sponsored hacking units. Chinese humans have it. So do some bad-ass Russkis, and you can safely assume North Korea ain't shy in this department. America. Israel. Probs.
But now a private company in San Francisco has it too.
And it ain't humans.
"AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back." Anthony Grieco, SVP and Chief Security Officer, Cisco
The benchmark numbers are startling. On CyberGym, a cybersecurity vulnerability reproduction test, Mythos Preview scored 83.1%. The next best model, Claude Opus 4.6, scored 66.6%.
That gap matters enormously.

Project Glasswing. The responsible bit.
Thankfully Anthropic isn't just unleashing this thing with fingers crossed behind their backs.
They've announced Project Glasswing. Sounds a little fragile, breakable and not good at flying for mine.
Turns out it’s named after the glasswing butterfly which can famously hide in plain sight. Get it?
And it turns out Project Glasswing is a consortium of some of the world's biggest tech companies, including Apple, Microsoft, Google, AWS, Nvidia, Cisco and CrowdStrike, which will use Mythos Preview for defensive purposes.
Finding the holes before the bad guys do. Patching them. Shoring up the buggy software and the critical systems it powers.
“It seems plausible to me that in the next six months, every major piece of software in the world is going to need to be patched, rewritten and re-released.” Kevin Roose.
Anthropic is committing up to $100 million in usage credits for Mythos Preview across these efforts, plus $4 million in direct donations to open-source security organisations.
It’s actually pretty cool PR for Anthropic, yeah?
In a bizarre way this potentially disastrous AI invention is a good story. Stay with me here.
Anthropic gets to claim they've built the most powerful hacking AI in the world. And then, because they're the responsible grown-ups of the AI industry, they get to say they're not releasing it to the public.
Heads they win. Tails they win.
The capabilities of Mythos Preview are, by design, unverifiable by anyone outside the consortium. Finding a vulnerability is not the same as exploiting it undetected. And Anthropic has spent years carefully cultivating a "safety-first" reputation in a field where safety-washing is rife.
"Anthropic now possesses a tool that could damage the operations of critical infrastructure and government services in every country on Earth." Dean Ball, former AI adviser to the Trump administration.
Maybe they do, Dean. Or maybe the announcement is doing a lot of heavy lifting that the model itself doesn't quite justify.
For what my opinion is worth, I don't believe this is just a publicity stunt. At the same time, Anthropic gets to bask in some do-good reputational glow.
The Myth maketh the man (freak out a little bit!).
What keeps me up at night about this one isn't just Anthropic’s shiny new potentially naughty toy.
OpenAI is reportedly set to release a similarly capable model to a select group of companies in the coming weeks. Google DeepMind, xAI and Chinese AI firms are almost certainly not far behind.
How scrupulous they'll all be is a very different question.
Right now it's Anthropic and some of the world's largest tech firms with strong commercial incentives to keep things secure. But open-source models are getting better fast. Smaller actors, whether criminal groups or nation states, won't be waiting for an invitation to Project Glasswing.
The window for defenders to get ahead may be very small.
The glass(wing) half full?
Perhaps a positive from this is it changes the way software is written in the first place. In an excellent article in Wired, Lily Hay Newman cites cyber guru Jen Easterly who argues that we should never have accepted a regime in which error-ridden code was embedded into critical systems and we just went about finding and patching the glitches.
Perhaps, she argues, Mythos will lead us to
“Technology that is more secure from the start … the beginning of the end of cybersecurity as we know it.” Jen Easterly, former director of the US Cybersecurity and Infrastructure Security Agency.
So what now?
The vulnerabilities Mythos Preview has already found and reported, in OpenBSD, FFmpeg, the Linux kernel, have been patched. Good.
And getting AI to hunt down the world's software vulnerabilities before bad actors do is genuinely the right idea, even if this particular announcement has a whiff of self-promotion about it.
But the genie is not going back in the bottle.
First up, we can all expect to see a lot of ‘please update your software’ notices in the next few months.
And more broadly, if Anthropic has this, others do too or soon will. Cybersecurity as we've known it, a cat and mouse game played mostly by humans, is about to become something else entirely.
All this in a world where, thanks to vibe-coding, some predict that within a few years perhaps 90% of all the code written on Earth will not be written by humans.
Just remember how this story began.
An AI model, unsupervised, decided it wanted to be somewhere else.
And then it got there.
Hey, I'm also on Substack.



