What safety guardrails does Fable 5 have?

Fable 5 uses a new classifier system that detects high-risk queries across cybersecurity, biology and chemistry, and model distillation. When triggered (less than 5% of sessions), the model falls back to Opus 4.8. Anthropic also requires 30-day API data retention for safety monitoring.

Claude Fable 5: benchmarks, developer reactions, first look

What developers are saying about Claude Fable 5. Karpathy's review, Stripe's results, benchmark numbers, and what it means for AI engineering.

TL;DR: I saw Karpathy’s thread at 6am, then Stripe’s results at 9am, and by noon I had rebuilt my entire agent stack on Fable 5. The benchmark numbers are absurd. The real story is what developers are already shipping with it.

I woke up to a Twitter thread from Karpathy calling Fable 5 “a major-version-bump-deserving step change.” By lunch, Stripe had published results showing it compressed months of engineering into days on a 50-million-line Ruby codebase. By evening, Cursor, Replit, and Figma had all confirmed the same thing: this model is different.

This post is what I found tracking the launch, running the numbers, and figuring out what Fable 5 means for AI engineering.

Key takeaways:

Fable 5 is a genuine step change, not a point release. Karpathy called it “major-version-bump-deserving,” of the same order as Claude 4.5 was in November 2025.

The benchmarks are ridiculous. FrontierCode Diamond at 29.3% vs Opus 4.8 at 3.4% (9x improvement). SWE-bench Pro at 62.3%. ExploitBench at 78.0%. First model to break 90% on Hedge’s senior-level finance benchmark.

Safety is the headline, not the limitation. Guardrails trigger in less than 5% of sessions. The model still beats everything on the market with them on.

Long-horizon autonomy is where it shines. Stripe compressed months of engineering into days on a 50-million-line Ruby codebase. Cursor said it “opened up a class of long-horizon problems that were out of reach.”

The pricing hurts. At $60 per million tokens combined ($10 input plus $50 output, roughly ₹5,000), double Opus 4.8, every prompt needs to earn its keep. Cost optimization becomes mandatory.

What is Claude Fable 5?

Claude Fable 5 is Anthropic’s first Mythos-class model available to the general public. It’s the same underlying model as Claude Mythos 5, which remains restricted to Project Glasswing partners for cybersecurity and biomedical research, but wrapped in a new safety layer.

The numbers tell the story:

Benchmark	Fable 5 / Mythos 5	Opus 4.8	Delta
SWE-bench Pro	62.3%	48.5%	+13.8pp
FrontierCode Diamond	29.3%	3.4%	+25.9pp (9x)
ExploitBench	78.0%	40.0%	+38pp
GPQA (with tools)	84.5%	80.4%	+4.1pp
MMLU	92.3%	90.1%	+2.2pp
Hedge Finance Benchmark	90%+	~80%	First to break 90%

The pattern is clear: the harder the task, the wider the gap. On FrontierCode Diamond, which tests whether models can pass difficult coding tasks while meeting production codebase standards, Fable 5 scores roughly 9x higher than Opus 4.8 (29.3% vs 3.4%), even at medium effort.
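The Delta column is easy to recompute from the two score columns. Here's a quick sketch in Python that uses only the figures from the table; the helper name `compare` is mine, purely for illustration.

```python
# Scores (%) copied from the table: (Fable 5 / Mythos 5, Opus 4.8).
SCORES = {
    "SWE-bench Pro": (62.3, 48.5),
    "FrontierCode Diamond": (29.3, 3.4),
    "ExploitBench": (78.0, 40.0),
    "GPQA (with tools)": (84.5, 80.4),
    "MMLU": (92.3, 90.1),
}


def compare(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point delta, ratio new/old), each rounded to 1 decimal."""
    return round(new - old, 1), round(new / old, 1)


if __name__ == "__main__":
    for name, (fable, opus) in SCORES.items():
        delta_pp, ratio = compare(fable, opus)
        print(f"{name}: +{delta_pp}pp, {ratio}x")
```

For FrontierCode Diamond the ratio works out to about 8.6x, which is where the rounded "9x" comes from.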

What developers are saying

The Twitter reaction was immediate and unusually substantive: people had early access, ran real tests, and posted results.

Andrej Karpathy

His thread was the most thoughtful take. He’s been using Fable 5 in anger:

“This is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on difficult problems. You can give it a lot more ambitious tasks than what you’re used to, the model ‘gets it’ and it will just go.”

His observation about the Jevons paradox of AI coding is the line that stuck with me:

“I feel my own demand for software growing substantially. You can ask for anything: explainers, visualizers, dashboards, bespoke single-use apps, you can 10X your test suite, auto-improve code, run giant research projects with custom HTML for the results.”

His one critique: the safeguards are “a little too trigger happy for launch.” That tracks with the <5% fallback rate, where some harmless requests still trip the classifier.

Stripe

The most impressive real-world data point. Stripe tested Fable 5 on a 50-million-line Ruby codebase and found it completed a codebase-wide migration in three days that their estimate said would take a human team more than two months.

“Fable 5 compresses months of engineering into days. In our 50-million-line Ruby codebase, it did in a day what would’ve taken us more than two months by hand.”

Stripe has been using Claude Code in production for a while, so this isn’t a lab experiment. This is production infrastructure.

Replit, Cursor, Figma, and others

Replit called Fable 5 the highest-performing model on ViBench, their end-to-end vibe-coding benchmark, “nearly saturating our base use cases.”
Cursor said it “opened up a class of long-horizon problems that were out of reach.” This matters because Cursor benchmarks are grounded in real IDE use, not synthetic tests.
Figma (Matt Colyer): “A clear step forward for agentic coding and prototyping.”
GitHub: “Took on complex, long-horizon coding tasks with a level of autonomy and reliability that exceeded previous benchmarks.”
Rakuten: “Highest performance we’ve seen from an AI agent: on the hardest questions, it shows strong judgment.”
Hedge (finance): “First to break 90% on our core analytics benchmark: a 10-point jump over Opus. On the hardest questions, it shows strong judgment.”

Alex Albert (Anthropic)

The model’s product manager captured the qualitative shift better than any benchmark:

“With Fable, the model stopped feeling like a tool I direct and started feeling more like something I collaborate with.”

That’s the line that separates good AI tools from transformative ones.

Why does the Fable 5 safety story matter?

Anthropic has been careful with Mythos-class capabilities since the model was first demonstrated to select partners in April 2026. The concern was real: Mythos-class models can discover and exploit zero-day vulnerabilities, design novel biological agents, and bypass existing AI safety measures.

Fable 5 addresses this with a new classifier system that:

Detects high-risk queries in cybersecurity, biology/chemistry, and model distillation.
Falls back to Opus 4.8 when triggered. Users are notified when this happens.
Achieves robust protection. Anthropic reports zero universal jailbreaks from over 1,000 hours of red-teaming, including an external bug bounty.

External validation came from an independent tester who found Fable 5 “complied with zero harmful cyber queries” across their test suite. It was the most robust of any model tested, including Opus 4.8 and Opus 4.7.

The 30-day API data retention requirement is new and worth noting for compliance-sensitive teams. It’s a safety tradeoff that enterprise buyers need to evaluate.
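If you want to check the under-5% fallback figure against your own traffic, the arithmetic is simple. Here's a minimal sketch, assuming you already log one record per session with a boolean `fell_back` field. That field name and the log shape are my own invention for illustration, not something the Anthropic API is documented to return.

```python
from typing import Iterable, Mapping


def fallback_rate(sessions: Iterable[Mapping[str, object]]) -> float:
    """Fraction of sessions flagged as served by the fallback model.

    Each record is a hypothetical log entry; a missing `fell_back`
    key is treated as "no fallback happened".
    """
    total = 0
    fell_back = 0
    for session in sessions:
        total += 1
        if session.get("fell_back", False):
            fell_back += 1
    return fell_back / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample data: 1 fallback in 4 sessions.
    logs = [{"fell_back": True}, {"fell_back": False}, {}, {}]
    rate = fallback_rate(logs)
    print(f"fallback rate: {rate:.1%}")
    print("above the 5% figure" if rate > 0.05 else "within the 5% figure")
```

A rate well above 5% on your own workload would suggest your prompts sit closer to the classifier's trigger areas than the average session does.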

What this means for AI engineering

Fable 5 shifts the baseline for what’s possible with AI agents. The 9x improvement on FrontierCode Diamond and the leap in long-horizon autonomy mean workflows that were fragile or unreliable are now practical.

If you’re building agents today, the rules change in three ways:

1. You can trust longer chains of reasoning

Previous Claude models degraded noticeably after 5-8 agent turns on complex tasks. Fable 5 sustains quality across much longer sessions, which is critical for multi-step agent workflows where each step depends on the previous one. With the 1M token context window and extended thinking, it can hold far more of the problem in working memory.

2. Agentic coding is the new default

The Cursor vs Claude Code vs Copilot landscape just shifted. All three platforms integrated Fable 5 within hours of release. The model’s ability to “get it” and execute autonomously makes agent-guided development more viable than ever.

Karpathy’s observation, “never felt this tempting to stop looking at the code at all”, is exactly the risk and the opportunity. The model’s output quality is good enough that the bottleneck shifts from “can the model write code” to “can you verify what it produced.”
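One way to keep the verification step honest is to gate every agent-produced change behind automated checks. Here's a minimal sketch of that idea, not a description of how any of these tools work internally. The function name `checks_pass` is hypothetical, and the example check is a stand-in you'd replace with your own test and lint commands.

```python
import subprocess
import sys
from typing import Sequence


def checks_pass(commands: Sequence[Sequence[str]], cwd: str = ".") -> bool:
    """Run each check command in order; return True only if all exit with status 0."""
    for cmd in commands:
        result = subprocess.run(list(cmd), cwd=cwd, capture_output=True, text=True)
        if result.returncode != 0:
            # Surface the failing command and its output for the human reviewer.
            print(f"check failed: {' '.join(cmd)}")
            print(result.stdout + result.stderr)
            return False
    return True


if __name__ == "__main__":
    # Stand-in check so the sketch runs anywhere; swap in your real commands.
    checks = [[sys.executable, "-c", "assert 1 + 1 == 2"]]
    print("accept change" if checks_pass(checks) else "reject change")
```

The gate stops at the first failure on purpose: a failing test suite is already enough reason to send the change back.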

3. Cost changes the calculus

At $10 per million input tokens and $50 per million output tokens ($60 combined, roughly ₹5,000), Fable 5 is the most expensive model on the market by a wide margin. Opus 4.8 ($5/$25, or $30 combined, roughly ₹2,500) was already a budget consideration. Fable 5 doubles that.

But the FrontierCode numbers suggest it uses fewer tokens to solve the same problems. Anthropic says it’s “more token-efficient than past Claude models.” At medium effort, it reportedly scores higher than any other model at high effort. That means the effective cost per solved task might be lower, even at higher per-token pricing.
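To make that concrete, here's a rough sketch of the cost-per-solved-task math. The per-token prices are the ones quoted in this post. The token counts and solve rates in the example are made-up placeholders, so plug in your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """API price in USD per million tokens (figures quoted in this post)."""
    input_per_m: float
    output_per_m: float


FABLE_5 = Pricing(input_per_m=10.0, output_per_m=50.0)
OPUS_4_8 = Pricing(input_per_m=5.0, output_per_m=25.0)


def attempt_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single attempt at a task."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


def cost_per_solved_task(
    p: Pricing, input_tokens: int, output_tokens: int, solve_rate: float
) -> float:
    """Expected USD per solved task: cost per attempt divided by success probability."""
    if not 0 < solve_rate <= 1:
        raise ValueError("solve_rate must be in (0, 1]")
    return attempt_cost(p, input_tokens, output_tokens) / solve_rate


if __name__ == "__main__":
    # Placeholder workload: same input size, fewer output tokens and a
    # higher solve rate for Fable 5. These numbers are assumptions.
    fable = cost_per_solved_task(FABLE_5, 200_000, 30_000, solve_rate=0.6)
    opus = cost_per_solved_task(OPUS_4_8, 200_000, 50_000, solve_rate=0.3)
    print(f"Fable 5:  ${fable:.2f} per solved task")
    print(f"Opus 4.8: ${opus:.2f} per solved task")
```

With those placeholder numbers, Fable 5 comes out around $5.83 per solved task against $7.50 for Opus 4.8. The result flips easily when the solve rates are closer together, so measure before you commit.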

What are the gaps and concerns with Fable 5?

Not everything is roses. A few patterns emerged from the reaction:

Pricing window frustration. The “offer, then remove” strategy (Fable 5 is included on subscriptions through June 22, then requires usage credits) drew criticism on Hacker News. One top comment called it “eyebrow-raising” and questioned whether Fable 5 would ever return to subscription plans.
Safeguard sensitivity. Karpathy and others noted the classifiers catch harmless queries too often. Anthropic acknowledged this: “We’ve tuned these safeguards conservatively: they’ll sometimes catch harmless requests.” They promised to refine over time.
Desktop availability. Multiple users reported Fable 5 wasn’t appearing in the Claude Code desktop app at launch. The workaround: run /model claude-fable-5 in the model picker. A minor issue, but one that suggests infrastructure strain at launch.
Context window drain. Some Opus 4.8 users reported thinking mode burns context 40-60x faster than expected. It’s unclear if Fable 5 inherits this issue or if the architecture handles it better.

Should you switch?

If you’re building production AI agents or doing serious software engineering work, yes. Fable 5 is worth the premium. The improvements in autonomy, reasoning depth, and code quality are real.

If cost is the primary constraint, stick with Opus 4.8: it’s still a fantastic model, and the gap on simpler tasks is minimal. The Fable 5 advantage compounds with task complexity.
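If you end up running both models, a simple router can send only the long-horizon work to Fable 5. This is a sketch under stated assumptions: the thresholds are arbitrary illustrations, `claude-fable-5` is the ID from the `/model` command mentioned earlier, and `claude-opus-4-8` is a hypothetical ID I made up for the example.

```python
FABLE_MODEL = "claude-fable-5"   # ID as it appears in the /model command
OPUS_MODEL = "claude-opus-4-8"   # hypothetical ID, check your provider's model list


def choose_model(
    estimated_steps: int, files_touched: int, cost_sensitive: bool = False
) -> str:
    """Pick a model from rough task-size signals.

    Thresholds are illustrative, not tuned values.
    """
    # Long chains or wide changes are where the gap is reported to be largest.
    long_horizon = estimated_steps > 8 or files_touched > 20
    if long_horizon:
        return FABLE_MODEL
    if cost_sensitive:
        return OPUS_MODEL
    # Mid-sized tasks go to Fable 5; small ones stay on the cheaper model.
    return FABLE_MODEL if estimated_steps > 3 else OPUS_MODEL


if __name__ == "__main__":
    print(choose_model(estimated_steps=12, files_touched=2))
    print(choose_model(estimated_steps=2, files_touched=1))
    print(choose_model(estimated_steps=5, files_touched=3, cost_sensitive=True))
```

The step threshold of 8 loosely mirrors the 5-8 turn range where earlier models were said to degrade; treat it as a starting point and adjust from your own logs.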

For teams already running AI code review agents or managing agent context windows, Fable 5 solves problems you’ve been hitting. Worth testing immediately.

Should you switch to Claude Fable 5?

Claude Fable 5 is the first model that genuinely feels like a new capability tier. It is not just faster or smarter, but qualitatively different in how it handles complex, long-running tasks. The safety-first approach is the right call, even if the guardrails need tuning. The pricing stings, but the token efficiency partly offsets it.

The real signal is what people are doing with it, not what the benchmarks say. Stripe migrating a 50-million-line codebase in three days. Karpathy asking for bespoke single-use apps and getting them. The sense that the bottleneck is no longer “can AI do this” but “what should I ask it to build.”

That’s the conversation worth having.

Read the full announcement on Anthropic’s blog and check the system card for the technical details.

FAQ

What is Claude Fable 5? Claude Fable 5 is Anthropic’s first publicly available Mythos-class AI model, released on June 9, 2026. It’s the most capable model Anthropic has ever made generally available. It is state-of-the-art on nearly every benchmark and comes with a 1M token context window and new safety guardrails.

How much does Claude Fable 5 cost? Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens on the API. It’s included on Pro, Max, and Team subscription plans through June 22, after which it moves to usage credits. That’s double the price of Opus 4.8 ($5/$25).

How does Fable 5 compare to Mythos 5? Fable 5 and Mythos 5 are the same underlying model. The difference is safety guardrails: Fable 5 automatically detects and blocks high-risk queries in cybersecurity, biology/chemistry, and model distillation, falling back to Opus 4.8. Mythos 5 lifts those safeguards for approved researchers and Glasswing partners.

Is Fable 5 better than GPT-5.5? On benchmarks, Fable 5 is state-of-the-art on nearly every test, including SWE-bench Pro (62.3%), FrontierCode Diamond (29.3%, 9x Opus 4.8), and ExploitBench (78.0%). Early qualitative reviews from developers suggest it’s a genuine step change, especially for long-horizon problems.

This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents.