Anthropic Launches Claude Sonnet 5: Near-Opus Performance at a Mid-Tier Price
Anthropic launched Claude Sonnet 5, its most agentic Sonnet model, with performance approaching Opus 4.8 at a lower price, available on all plans. A look at its capabilities, pricing, and caveats.
Amid a fierce race over "agentic" models, Anthropic launched its new model Claude Sonnet 5, describing it as "the most agentic Sonnet model yet." The core idea is simple and enticing: performance approaching the expensive Opus models, but at a mid-tier price. The model was announced on June 30, 2026, and became immediately available across all plans — even the free ones — and in Claude Code, the Claude Platform, and its API.
A Core Promise: Near-Opus Capability at a Lower Price
Anthropic says Sonnet 5 narrows the gap with the Opus models while lowering the price. It improves significantly on its predecessor, Sonnet 4.6, in reasoning, tool use, coding, and knowledge work, and approaches Opus 4.8 in certain areas. On one agentic-coding benchmark, Sonnet 5 scored about 63.2%, versus 69.2% for Opus 4.8 and 58.1% for Sonnet 4.6; on a knowledge-work benchmark it even slightly outperformed Opus 4.8. But Anthropic itself acknowledges that Opus 4.8 remains the more accurate choice for high-stakes tasks.
It "Finishes Tasks" Instead of Stopping Halfway
What caught early-access partners' attention was not the numbers, but the behavior. According to their testimonials, Sonnet 5 completes complex tasks that previous models would "stop short of," and reviews its output without being asked. A Zapier engineer described a two-part task — updating Salesforce account tiers and sending an announcement — that "used to stall halfway" but now completes end to end. This is precisely the reliability gap that kept many enterprises hesitant to move agents from pilots to production: a model that gets 80% through a task then stops creates more problems than it solves.
Pricing: The Heart of the Announcement
Pricing may be the most important part of the announcement. Sonnet 5 launches at an introductory price of $2 per million input tokens and $10 per million output tokens through August 31, 2026, then rises to $3 and $15 respectively. That remains cheaper than Opus 4.8 ($5 and $25), as well as GPT-5.5 and Gemini 3.1 Pro. But there is a buried detail worth attention: Sonnet 5 uses a new tokenizer that maps the same text to more tokens (roughly 30% more on average). Anthropic says the introductory price is calibrated to make the transition "roughly cost-neutral," but high-volume enterprises will need to benchmark their specific cases carefully before assuming their bills stay flat, especially once the introductory pricing ends on August 31.
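To make the tokenizer effect concrete, here is a rough back-of-the-envelope sketch in Python. The prices and the 30% figure come from the announcement as reported above; everything else is an assumption for illustration: the workload size is hypothetical, the 30% inflation is applied equally to input and output tokens, and the comparison baseline of $3 and $15 per million tokens is a placeholder for whatever a team pays today (the predecessor's price is not stated here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per million tokens."""
    input_per_mtok: float
    output_per_mtok: float


# Figures quoted in the article.
SONNET_5_INTRO = Pricing(2.0, 10.0)     # through August 31, 2026
SONNET_5_STANDARD = Pricing(3.0, 15.0)  # after the introductory period
TOKEN_INFLATION = 1.30                  # "roughly 30%" on average; varies by workload

# ASSUMPTION: placeholder for the price a team pays today; not from the article.
ASSUMED_CURRENT = Pricing(3.0, 15.0)


def monthly_cost(pricing, input_mtok, output_mtok, inflation=1.0):
    """Estimate cost for a workload given in millions of old-tokenizer tokens.

    `inflation` scales both token counts to approximate the new tokenizer
    (applying it to output as well as input is an assumption).
    """
    base = input_mtok * pricing.input_per_mtok + output_mtok * pricing.output_per_mtok
    return inflation * base


# Hypothetical workload: 100M input and 20M output tokens per month.
INPUT_MTOK, OUTPUT_MTOK = 100, 20

today = monthly_cost(ASSUMED_CURRENT, INPUT_MTOK, OUTPUT_MTOK)
intro = monthly_cost(SONNET_5_INTRO, INPUT_MTOK, OUTPUT_MTOK, TOKEN_INFLATION)
standard = monthly_cost(SONNET_5_STANDARD, INPUT_MTOK, OUTPUT_MTOK, TOKEN_INFLATION)

print(f"assumed current bill:   ${today:,.2f}")
print(f"Sonnet 5, introductory: ${intro:,.2f} ({intro / today - 1:+.0%})")
print(f"Sonnet 5, standard:     ${standard:,.2f} ({standard / today - 1:+.0%})")
```

For this hypothetical workload, the sketch gives about $520 a month at the introductory price and about $780 at the standard price, against $600 at the assumed current price. The real ratio depends on how a given team's text actually tokenizes, so measured token counts should replace the 1.30 factor wherever they are available.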
A Counter-Perspective: Independent Caveats
The picture is not entirely promotional. Beyond Anthropic's official numbers, developer Theo Browne reported that in his test Sonnet 5 scored about 37% on a coding task while consuming about six thousand dollars in testing costs, among the highest observed for a model. Other analysts warned that "newer model = better" is too narrow a reading: Sonnet 5 does raise the baseline, but it still needs task-level validation against real codebases and tools before teams rely on it in production. This balance matters: the real value shows in specific workloads, not in every use.
Safety and the Wider Context
On safety, Anthropic says Sonnet 5 shows a lower rate of "undesirable behaviors" than Sonnet 4.6, with less hallucination and sycophancy, and more resistance to "prompt injection" attacks. It launched with cyber safeguards enabled by default, but less strict than Fable 5's, because the company judged its cyber risk to be low. The launch comes within a broader industry shift from competing on numbers to competing on "what enterprises can actually afford to deploy," which serves Anthropic's narrative as it heads toward an anticipated IPO. It mirrors a similar path among competitors: OpenAI's GPT-5.6 and Google's Gemini 3.5 Flash, both pitched as a shift toward agency.
For the developer, Sonnet 5 represents a practical option for running agents at a lower cost, especially in multi-step tasks requiring long context, multiple tools, and reduced human-correction loops. But the balanced advice remains: benchmark on your own workload before migrating, and mind the impact of the new tokenizer on your actual cost after the introductory offer ends.