---
video_id: tSOjux2CMM0
title: "Too Dangerous to Release: Anthropic Seals Away \"Claude Mythos\" Is Born"
channel: "TBS CROSS DIG with Bloomberg"
published_at: "2026-04-09"
youtube_url: "https://www.youtube.com/watch?v=tSOjux2CMM0"
flecto_url: "https://flecto.zer0ai.dev/youtube/tSOjux2CMM0/"
language: "en"
---

# Too Dangerous to Release: Anthropic Seals Away "Claude Mythos" Is Born

## Key Takeaways

### 1. Claude Mythos Benchmark Performance Heralds a New Era
In a field where gains of 4-5% per release were the norm, Claude Mythos Preview posted leaps of 10-20% or more. It scored 93.9% on SWE-bench Verified, 94.5% on GPQA Diamond, and 64.7% on HLE (with tool use), breaking through the wall that had made simultaneous excellence in coding ability and general intelligence seem "impossible." Shota Imai says, "When I saw the combination of SWE-bench and HLE scores, I suspected it was an April Fools' joke."

### 2. Cybersecurity Capabilities That Cross the Line
Mythos autonomously discovered thousands of zero-day vulnerabilities across major operating systems and browsers. It even detected a vulnerability that had gone unpatched for 27 years in OpenBSD, considered one of the most secure operating systems. It achieved a perfect score of 1.00 on Cybench pass@1 and 0.83 on CyberGym, leading by a wide margin. Imai points out, "This is a turning point where AI can now do unlimited things that were simply too cost-ineffective for humans to pursue."

### 3. Anthropic's Explosive Growth and "Claudenomics"
Anthropic's annual revenue run rate surged from $9 billion at the end of 2025 to $30 billion by April 2026, more than tripling and overtaking OpenAI for the first time. In an internal Meta ranking dubbed "Claudenomics," 85,000 employees consumed 60 trillion tokens per month (an estimated $900 million/month). The era of "token maxing" as an engineer productivity metric has arrived.
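The figures above imply a few derived numbers that are easy to check. The sketch below is only arithmetic on the values quoted in this summary. The variable names are illustrative, and the per-employee averages assume usage is spread evenly across all 85,000 employees, which a ranking suggests it is not.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# All inputs are numbers stated in this summary; nothing here is measured.

run_rate_end_2025 = 9e9      # USD, annual revenue run rate at end of 2025
run_rate_apr_2026 = 30e9     # USD, annual revenue run rate by April 2026

growth_multiple = run_rate_apr_2026 / run_rate_end_2025

employees = 85_000
tokens_per_month = 60e12     # 60 trillion tokens
cost_per_month = 900e6       # USD, estimated

# Naive averages: assumes even usage across employees (hypothetical simplification).
tokens_per_employee = tokens_per_month / employees
cost_per_employee = cost_per_month / employees

print(f"Run-rate growth: {growth_multiple:.2f}x")
print(f"Tokens per employee per month: {tokens_per_employee / 1e6:,.0f} million")
print(f"Cost per employee per month: ${cost_per_employee:,.0f}")
```

Under these assumptions the run rate grew about 3.3x, and the average works out to roughly 706 million tokens and about $10,600 per employee per month.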

### 4. Meta Muse Spark Arrives and the Open-Source Pivot
"Muse Spark," the first model from Meta Superintelligence Labs (MSL), achieves performance comparable to Llama 4 with dramatically less compute. However, it marks a departure from Meta's traditional open-source approach, launching as a proprietary model. Led by Alexandr Wang (former Scale AI CEO, who founded his company at age 19), it was developed in approximately 9 months. While competitive in multimodal tasks, it still lags behind frontier models in coding benchmarks.

### 5. The Historical Significance of AI Safety and Release Decisions
Dario Amodei was at OpenAI in 2019 when it restricted the release of GPT-2, citing safety concerns, and he is listed among that paper's authors. The Mythos non-release therefore repeats a pattern: "the same person halting a public release once again." This time, however, the capability level poses a "genuine threat incomparable to GPT-2." The system card paradoxically describes Mythos as the "safest yet most dangerous model" -- explained through the metaphor of an expert mountaineer.

## Content Flow

### 00:00 - 04:29 | New Claude: "I Thought It Was April Fools'"
The show opens with Shota Imai's anecdote of checking when April Fools' Day falls in the United States after seeing the Claude Mythos announcement. When GPT-2 was released in 2019, OpenAI also restricted its release, citing safety concerns -- and Dario Amodei (now Anthropic CEO) is listed among the paper's authors. Imai's reaction, "So he's done it again," highlights the historical pattern behind the Mythos non-release.

### 04:29 - 09:30 | Mythos's Power by the Benchmarks
SWE-bench Verified 93.9%, SWE-bench Pro 77.8%, GPQA Diamond 94.5%. In a world where 4-5% improvements were the norm, this represents a 10-20%+ leap. Imai calls it "the first jump of this magnitude since GPT-4." The shock of Anthropic casually delivering this while everyone waited endlessly for GPT-5 was immense.

### 09:30 - 13:00 | The "Impossible" Balance of Coding and General Intelligence
A massive jump to 64.7% on HLE (with tool use). HLE last crossed 50% with Grok about a year ago, and Mythos now adds more than 10 percentage points on top of that. The simultaneous achievement on SWE-bench and HLE was the biggest reason Imai suspected April Fools'. In contrast, GPT-5.4 "quietly released" its HLE score -- which hadn't improved much.

### 13:00 - 17:00 | Cyber Skills That Surpass Nearly All Humans
A perfect 1.00 on Cybench pass@1 -- at this level the benchmark can no longer distinguish between models. A commanding lead at 0.83 on CyberGym. It discovered a 27-year-old vulnerability in OpenBSD, considered one of the most secure operating systems. A top university professor specializing in operating systems reacted with "No way." The system card itself states that "evaluation using real software is preferable" -- a testament to how far the performance has come.

### 17:00 - 23:25 | Withholding Public Release and Project Glasswing
Limited access to Mythos Preview is provided to over 50 companies, including AWS, Apple, Google, Microsoft, and NVIDIA, along with $100 million in usage credits. API pricing is $25 per million input tokens and $125 per million output tokens (5x Opus 4.6) -- a price level suggesting the model may exceed 5T parameters. The paradox of "the safest yet most dangerous model" is explained through the metaphor of an expert mountaineer.
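To make the pricing concrete, here is a minimal sketch of per-request cost at the quoted rates. The Opus 4.6 rates are not stated in this summary; they are inferred from the "5x" multiple and should be treated as an assumption, as should the example token counts and the `request_cost` helper itself.

```python
# Per-request cost at the quoted Mythos Preview rates (USD per million tokens).
MYTHOS_INPUT_PER_M = 25.0
MYTHOS_OUTPUT_PER_M = 125.0

# Assumption: Opus 4.6 rates inferred from the "5x" multiple, not quoted directly.
PRICE_MULTIPLE = 5
OPUS_INPUT_PER_M = MYTHOS_INPUT_PER_M / PRICE_MULTIPLE
OPUS_OUTPUT_PER_M = MYTHOS_OUTPUT_PER_M / PRICE_MULTIPLE


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Return the USD cost of one request given per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical request: 200k tokens of context in, 20k tokens out.
mythos_cost = request_cost(200_000, 20_000, MYTHOS_INPUT_PER_M, MYTHOS_OUTPUT_PER_M)
opus_cost = request_cost(200_000, 20_000, OPUS_INPUT_PER_M, OPUS_OUTPUT_PER_M)

print(f"Mythos Preview: ${mythos_cost:.2f}")
print(f"Opus 4.6 (inferred): ${opus_cost:.2f}")
```

For this hypothetical request the cost comes to $7.50 at the Mythos Preview rates versus $1.50 at the inferred Opus 4.6 rates.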

### 23:25 - 27:00 | Anthropic's Growth Pace Far Exceeds Projections
Annual revenue run rate went from $9 billion at the end of 2025 to $30 billion in just over 3 months. That already surpasses the projections in internal documents reported by The Information. Claude Code and the Quit GPT movement are driving the explosive growth. "They're releasing something every one or two days" -- a testament to their overwhelming productivity powered by their own Claude Code.

### 27:00 - 31:00 | Token Maxing -- AI Tokens as the New Currency
Meta's internal "Claudenomics" ranking: 85,000 employees competing, 60 trillion tokens per month, or approximately $900 million. If Mythos API pricing is 5x that of Opus 4.6, the same volume could cost $4.5 billion per month. Jensen Huang: "If a $500,000-a-year engineer isn't spending $250,000 on AI tokens, that should be a wake-up call." We are entering an era where a programmer's skill equals their spending power.
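The $4.5 billion figure follows from simple scaling. The sketch below reproduces it, assuming token volume stays constant at the higher price and treating the $900 million estimate as a blended rate across input and output tokens. Both are simplifications for illustration, not claims made on the show.

```python
# Reproduce the token-spend arithmetic quoted above.
tokens_per_month = 60e12        # 60 trillion tokens per month
monthly_cost = 900e6            # USD, estimated monthly cost at current pricing

# Implied blended price per million tokens (input and output mixed together).
blended_price_per_m = monthly_cost / (tokens_per_month / 1e6)

# Assumption: volume is unchanged if pricing rises 5x.
price_multiple = 5
scaled_monthly_cost = monthly_cost * price_multiple

# Jensen Huang's rule of thumb: token spend relative to salary.
salary = 500_000
token_budget = 250_000
spend_ratio = token_budget / salary

print(f"Blended price: ${blended_price_per_m:.2f} per million tokens")
print(f"Monthly cost at 5x pricing: ${scaled_monthly_cost / 1e9:.1f} billion")
print(f"Token budget as share of salary: {spend_ratio:.0%}")
```

The implied blended price is $15 per million tokens, and Huang's rule of thumb amounts to a token budget equal to half of salary.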

### 31:00 - 38:18 | Massive TPU Procurement, Amazon Partnership, and "Claude Code Got Dumber"
A contract for 3GW of Google TPUs. An all-fronts procurement strategy spanning GPU + TPU + Amazon Trainium. Quality degradation from overuse surfaced with complaints that "Claude Code got dumber." Reports of Claude becoming unavailable on OpenClaw as well. Multimodal performance remains competitive with Gemini. An IPO is under consideration (in talks with Goldman Sachs and JPMorgan). "The Anthropic Guillotine" -- threatening not just startups but major enterprises as well.

### 38:18 - 42:00 | "Muse Spark" -- First Model from Meta Superintelligence Labs
The first model from MSL, led by Alexandr Wang (former Scale AI CEO, who founded his company at age 19). Codenamed "Avocado," developed in approximately 9 months. Organized into four divisions: TBD Lab, FAIR, Products and Applied Research, and MSL Infra. Achieves performance comparable to Llama 4 with dramatically less compute, but marks a departure from the traditional open-source approach by launching as a proprietary model.

### 42:00 - 47:22 | Has Muse Spark Caught Up to the Frontier?
Competitive performance in multimodal tasks like CharXiv Reasoning and MMMU Pro. However, a gap remains in coding benchmarks such as SWE-Bench Verified. A measured assessment that "in practical terms, it's not rated all that highly." Still, the very fact that "Meta has caught up to the frontier model race" represents significant progress.