Update to Claude Opus 4.6.
By Jakub Antkiewicz
2026-03-24T08:52:41Z
Anthropic has released Claude Opus 4.6, the latest update to its most advanced large language model, targeting significant improvements in agentic and coding capabilities. The new model distinguishes itself with enhanced planning, debugging, and the ability to operate reliably within large codebases. Notably, Opus 4.6 is the first Opus-class model to feature a 1 million token context window, currently in beta, enabling it to process and maintain coherence over substantially larger amounts of information. The company asserts the model has achieved state-of-the-art performance on several industry evaluations, positioning it as a direct competitor to other frontier models for complex reasoning and autonomous task execution in professional environments.
According to Anthropic's announcement, Opus 4.6 leads on benchmarks such as Terminal-Bench 2.0 for agentic coding and outperforms OpenAI’s GPT-5.2 by roughly 144 Elo points on GDPval-AA, an evaluation of knowledge work in finance, legal, and other domains. A key technical advancement is its long-context retrieval: the model scored 76% on the 8-needle, 1 million token variant of MRCR v2, a demanding 'needle-in-a-haystack' test, compared with 18.5% for Claude Sonnet 4.5. For developers, the model introduces new controls such as 'adaptive thinking' and an 'effort' parameter to manage the trade-off between reasoning depth, latency, and cost. Claude Opus 4.6 is now available through the claude.ai interface and the API, with pricing held constant at $5 per million input tokens and $25 per million output tokens.
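To put two of these figures in perspective, here is a minimal Python sketch. It rests on two assumptions that are not in the announcement as summarized here: that GDPval-AA ratings follow the conventional Elo logistic scale (400 points per factor of ten in odds), and that the quoted per-token prices apply as flat rates with no tiers or discounts. The function names `elo_expected_score` and `estimate_cost_usd` and the token counts are purely illustrative.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated side under the standard Elo formula.

    Assumes the conventional 400-point logistic scale; ties count as half a win.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float = 5.0,    # USD, figure quoted in the article
    output_rate_per_million: float = 25.0,  # USD, figure quoted in the article
) -> float:
    """Flat-rate cost estimate; ignores any pricing tiers or discounts."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_million
    output_cost = output_tokens / 1_000_000 * output_rate_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # A 144-point Elo gap, read as a head-to-head expected score.
    print(f"{elo_expected_score(144):.3f}")  # prints 0.696

    # Hypothetical request: 200,000 input tokens and 10,000 output tokens.
    print(f"{estimate_cost_usd(200_000, 10_000):.2f}")  # prints 1.25
```

Under those assumptions, a 144-point gap corresponds to the higher-rated model being preferred in roughly 70% of pairwise comparisons, and the hypothetical request above would cost about $1.25, with the smaller output share still contributing a fifth of the total because output tokens are priced five times higher.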
The release of Opus 4.6 signals a deepening focus on making AI agents practical for enterprise-scale problems that require long-term autonomy. Early access partners report that the model can handle complex, multi-step workflows with less human supervision, from managing software repositories to conducting cybersecurity investigations. Its reported proficiency in navigating multi-million-line codebases and performing consistently in legal and financial analysis suggests a push toward specialized, high-value corporate applications. This move intensifies competition in the AI market, shifting the emphasis from generalized conversational ability to the reliable execution of long-horizon, economically valuable tasks.
Anthropic is steering the conversation from raw capability benchmarks to operational reliability. By focusing on long-context coherence, agentic endurance, and granular cost-performance controls, the company is positioning Claude Opus 4.6 not just as a powerful model, but as a dependable platform for building enterprise-grade autonomous systems.