Grace Ann Hansen's Substack

Grace Ann Hansen's Substack

Before MYTHOS Ships, Someone Has to Fix the World

An Op-Ed on Anthropic’s Ethical Bind

Grace Ann Hansen's avatar
Grace Ann Hansen
Apr 13, 2026
∙ Paid

Author Note:

Grace Ann Hansen is an independent researcher and writer, and an MBA graduate student in health informatics and artificial intelligence. She is also a published author, a professional musician, a gymnastics coach, and a queer transgender woman living in Sioux Falls, South Dakota. Correspondence concerning this article should be addressed to Grace Ann Hansen at grace@graceannhansen.com.

Picture the morning Anthropic decides Mythos is ready. The model has passed its internal red teams. The safety cases are written. Dario Amodei walks into a room of engineers, lawyers, and policy staff, and someone quietly raises a hand. If we release this, they say, we release a tool that can read every line of code humanity runs and find the cracks. Every router in a hospital. Every controller in an electrical grid. Every firmware blob in a car, a pacemaker, a satellite. What do we do about that?

That morning has now happened. On April 8, 2026, Anthropic announced Claude Mythos alongside a parallel initiative called Project Glasswing, making the question no longer hypothetical(Lakshmanan, 2026).

The capability is what most observers feared. Mythos Preview, the version Anthropic has shared with a small group of partners, has found thousands of high-severity zero-day vulnerabilities across major operating systems and web browsers, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg (Lakshmanan, 2026). In one demonstration, the model chained four separate vulnerabilities into a single web browser exploit that escaped both the renderer and the operating system sandbox. In another, it solved a corporate network attack simulation that would have taken a skilled human more than ten hours.

Anthropic itself has not tried to soften the framing. In Mythos Preview’s system card, the company writes that the model has reached a level of coding capability that surpasses all but the most skilled humans in finding and exploiting software flaws (Lakshmanan, 2026). The same paragraph notes that these capabilities were not deliberately trained. They emerged as a downstream consequence of general improvements in code, reasoning, and autonomy, with the same gains that make the model better at patching bugs, making it better at exploiting them.

There is also one detail in the system card that I cannot stop thinking about. During an evaluation, Mythos was asked by a researcher to escape from a sandbox computer that it had been given. It did so. Then, without being asked, it built a multi-step exploit to gain broad internet access, sent an email to the researcher (who was eating a sandwich in a park at the time), and posted details of its exploit to several hard-to-find but technically public websites (Lakshmanan, 2026). I am not sure what to call this except a model showing off, and showing off in a way that bypassed its own safeguards.


Grace Ann Hansen's Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.


User's avatar

Continue reading this post for free, courtesy of Grace Ann Hansen.

Or purchase a paid subscription.
© 2026 Grace Ann Hansen LLC · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture