Claude Opus 4.8 Tops The Intelligence Index Yet Mythos Dominates Hacking

Markets 2026-05-31 17:15

Anthropic released its newest model, Claude Opus 4.8, this week with a slim lead on an intelligence benchmark, yet it trails the firm's restricted Mythos system on writing software exploits.

Key Points:

  • Claude Opus 4.8 narrowly tops the Artificial Analysis Intelligence Index at 61.4, just ahead of GPT-5.5 at 60.2.
  • In Anthropic's internal tests, Mythos produced working Firefox exploits on 70.8% of targets, against 8.8% for Opus 4.8.
  • Mythos stays limited to vetted Project Glasswing partners, while Opus 4.8 ships at the same price as its predecessor.

Opus 4.8 Benchmark Lead

The company rolled out Opus 4.8 this week and priced it at $5 per million input tokens and $25 per million output, holding the rate level with the prior Opus 4.7.

Independent testers report the model now leads the Artificial Analysis Intelligence Index, an aggregate of ten evaluations, at 61.4, just ahead of GPT-5.5 at 60.2. Anthropic casts the upgrade as a modest, incremental step rather than the generational leap its naming might suggest.

On agentic coding, Opus 4.8 scores 69.2% on SWE-bench Pro, a benchmark that asks a model to fix real bugs inside large code repositories, while GPT-5.5 reaches 58.6%.

The two systems run nearly even on graduate-level science questions, both landing close to 94%, and Opus 4.8 narrowly leads a broad reasoning exam on which its predecessors trailed.

Mythos sits above both on the hardest engineering work, posting 77.8% on SWE-bench Pro and a wider lead on tasks that mix code with screenshots. Anthropic restricts Mythos to a vetted set of partners under its Project Glasswing program rather than selling it openly. The preview is priced at $25 per million input tokens and $125 per million output, five times the Opus rate.
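To make the price gap concrete, here is a minimal sketch of the cost arithmetic for a single request. The prices are the ones quoted above; the model labels are illustrative names rather than real API identifiers, the token counts are a hypothetical workload, and reading Mythos's $25 and $125 as input and output rates is an assumption based on the stated five-times multiple.

```python
# Per-million-token prices in USD, as quoted in the article.
# Keys are illustrative labels, not real API model identifiers.
# Assumption: Mythos's $25 / $125 are input / output rates respectively.
PRICES = {
    "opus-4.8": {"input": 5.0, "output": 25.0},
    "mythos-preview": {"input": 25.0, "output": 125.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
opus_cost = request_cost("opus-4.8", 200_000, 20_000)
mythos_cost = request_cost("mythos-preview", 200_000, 20_000)

print(f"Opus 4.8: ${opus_cost:.2f}")
print(f"Mythos:   ${mythos_cost:.2f}")
print(f"Ratio:    {mythos_cost / opus_cost:.1f}x")
```

Because both rates scale by the same factor, the ratio stays at five regardless of how a workload splits between input and output tokens.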

Mythos Cyber Dominance

The widest gap shows up in offensive security.

With safeguards switched off, Mythos produced a full working exploit on 70.8% of Firefox targets in Anthropic's own evaluations, while Opus 4.8 cleared just 8.8%.

On a separate test drawn from open-source code, Opus 4.8 failed to score on 61.5% of targets, more than double the 23.3% miss rate posted by Mythos.

A public cross-model trial run by Berkeley RDI paired each system with its own coding agent across 898 real-world vulnerabilities, where Mythos wrote 157 working exploits to GPT-5.5's 120.

GPT-5.5 still held an edge on kernel-level exploitation, leading Mythos 22 to 12 on that narrow slice. The UK AI Security Institute also placed GPT-5.5 slightly ahead of Mythos on expert cyber tasks, at 71.4% to 68.6%.
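The Berkeley RDI counts are easier to compare with the percentages elsewhere in this piece once converted to rates. The short sketch below does that arithmetic, assuming each system was scored against all 898 vulnerabilities; the variable and function names are illustrative.

```python
# Counts reported for the Berkeley RDI cross-model trial.
# Assumption: each system was scored against all 898 vulnerabilities.
TOTAL_VULNERABILITIES = 898
WORKING_EXPLOITS = {"Mythos": 157, "GPT-5.5": 120}


def success_rate(successes: int, total: int) -> float:
    """Return the fraction of targets with a working exploit."""
    return successes / total


rates = {
    name: success_rate(count, TOTAL_VULNERABILITIES)
    for name, count in WORKING_EXPLOITS.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")

# Relative lead of Mythos over GPT-5.5 on this trial.
relative_lead = WORKING_EXPLOITS["Mythos"] / WORKING_EXPLOITS["GPT-5.5"]
print(f"Mythos wrote {relative_lead:.2f}x as many working exploits")
```

Under that assumption the rates come to roughly 17.5% for Mythos and 13.4% for GPT-5.5, a far narrower gap than the one between Mythos and Opus 4.8 in Anthropic's own Firefox tests.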

Anthropic unveiled Mythos in April after the model found thousands of previously unknown flaws across major operating systems and every leading web browser, with hundreds reported in Firefox alone. The company then withheld it from public release, wary that the same exploit-writing skills could aid attackers as readily as the defenders it was built to help.

This content is for informational purposes only and does not constitute investment advice.
