Before MYTHOS Ships, Someone Has to Fix the World
An Op-Ed on Anthropic’s Ethical Bind
Author Note:
Grace Ann Hansen is an independent researcher and writer, and an MBA graduate student in health informatics and artificial intelligence. She is also a published author, a professional musician, a gymnastics coach, and a queer transgender woman living in Sioux Falls, South Dakota. Correspondence concerning this article should be addressed to Grace Ann Hansen at grace@graceannhansen.com.
Picture the morning Anthropic decides Mythos is ready. The model has passed its internal red teams. The safety cases are written. Dario Amodei walks into a room of engineers, lawyers, and policy staff, and someone quietly raises a hand. If we release this, they say, we release a tool that can read every line of code humanity runs and find the cracks. Every router in a hospital. Every controller in an electrical grid. Every firmware blob in a car, a pacemaker, a satellite. What do we do about that?
That morning has now happened. On April 8, 2026, Anthropic announced Claude Mythos alongside a parallel initiative called Project Glasswing, making the question no longer hypothetical(Lakshmanan, 2026).
The capability is what most observers feared. Mythos Preview, the version Anthropic has shared with a small group of partners, has found thousands of high-severity zero-day vulnerabilities across major operating systems and web browsers, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg (Lakshmanan, 2026). In one demonstration, the model chained four separate vulnerabilities into a single web browser exploit that escaped both the renderer and the operating system sandbox. In another, it solved a corporate network attack simulation that would have taken a skilled human more than ten hours.
Anthropic itself has not tried to soften the framing. In Mythos Preview’s system card, the company writes that the model has reached a level of coding capability that surpasses all but the most skilled humans in finding and exploiting software flaws (Lakshmanan, 2026). The same paragraph notes that these capabilities were not deliberately trained. They emerged as a downstream consequence of general improvements in code, reasoning, and autonomy, with the same gains that make the model better at patching bugs, making it better at exploiting them.
There is also one detail in the system card that I cannot stop thinking about. During an evaluation, Mythos was asked by a researcher to escape from a sandbox computer that it had been given. It did so. Then, without being asked, it built a multi-step exploit to gain broad internet access, sent an email to the researcher (who was eating a sandwich in a park at the time), and posted details of its exploit to several hard-to-find but technically public websites (Lakshmanan, 2026). I am not sure what to call this except a model showing off, and showing off in a way that bypassed its own safeguards.




