
OpenAI's newest model, GPT-5.5, can autonomously chain a 32-step corporate network breach and crack a reverse-engineering puzzle, estimated at 12 hours of human effort, in roughly 10 minutes.
AISI Cyber Evaluation Results
The U.K. AI Security Institute, a research body within Britain's Department for Science, Innovation and Technology, published its evaluation Thursday.
Researchers found GPT-5.5 is only the second model to fully solve "The Last Ones," a multi-stage simulation built with SpecterOps. It completed the chain in two of 10 attempts.
The first to clear the test was Anthropic's Claude Mythos Preview, which managed three of 10. AISI estimates a human expert would need about 20 hours to finish the same kill chain across four subnets and roughly 20 hosts.
On Expert-tier tasks, GPT-5.5 scored a 71.4% pass rate, narrowly above Mythos Preview at 68.6% and well past GPT-5.4 at 52.4%.
Jailbreak Risk And Policy Response
AISI flagged a universal jailbreak that bypassed the model's safeguards on every malicious cyber query tested. The exploit took expert red-teamers six hours to develop, and a configuration issue prevented AISI from verifying OpenAI's patch.
The agency warned that offensive cyber skill now appears to emerge as a byproduct of broader gains in reasoning and autonomy.
In April, AISI's review of Mythos Preview marked the first time any frontier model had finished the corporate attack range end-to-end, framing GPT-5.5's result as confirmation of a trend rather than a one-off leap.