Claude Opus 4.6 Launches With Smarter Coding, Research, and Office Skills



As rumored earlier, Anthropic has released Claude Opus 4.6. The new model improves on coding, research, and everyday office tasks, and it is the first Opus-class model to offer a 1M-token context window, currently in beta. According to Anthropic, Opus 4.6 plans more carefully, sustains agentic tasks for longer, and works more reliably across larger codebases.

Claude Opus 4.6 Excels in Coding, Knowledge Work, and Long-Context Reasoning

Anthropic says the model is also better at reviewing code, catching mistakes, and debugging autonomously. Beyond coding, Opus 4.6 can run financial analyses, process spreadsheets, manage documents, and create presentations. Within Cowork, where Claude multitasks independently, Opus 4.6 applies all of these abilities on your behalf.

In terms of performance, the model stands out across several benchmarks. It posts the highest score on the agentic coding evaluation Terminal-Bench 2.0, tops Humanity’s Last Exam, and outperforms GPT-5.2 and its own predecessor on GDPval-AA, a test of economically valuable tasks in finance, legal, and other sectors. Its BrowseComp results also show a stronger ability to locate hard-to-find information online.

Opus 4.6 also improves long-context reasoning. It can track information across hundreds of thousands of tokens, pick up subtle details, and suffer less from “context rot” in long sessions. On MRCR v2, Opus 4.6 achieved 76% accuracy compared with just 18.5% for Sonnet 4.5, roughly a fourfold improvement in long-context performance.

Comparison chart
Image credit: Anthropic

Safety, Developer Controls, and Office Integration

Anthropic says these gains do not come at the cost of safety. Opus 4.6 maintains low rates of misaligned behavior and over-refusal, and it ships with strong safeguards, including six new cybersecurity probes. Anthropic is also using the model to detect and patch vulnerabilities in open-source software, in support of defensive cybersecurity work.

Developers also get new controls in the API. Adaptive thinking lets Claude decide when deeper reasoning is useful, effort settings trade off intelligence against speed, and context compaction lets longer-running tasks complete without hitting context limits. Outputs can now reach 128K tokens, and premium pricing applies to prompts above 200K tokens. US-only inference is also available.
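To make the idea of context compaction more concrete, here is a minimal sketch of the general technique: once a conversation grows past a token budget, older turns are replaced with a short summary while the most recent turns are kept verbatim. This is not Anthropic's implementation and does not call the Claude API. The names `Message`, `estimate_tokens`, `compact_history`, and `naive_summary` are hypothetical, the four-characters-per-token estimate is a rough assumption, and in a real system the summary would come from a model rather than a placeholder string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Rough assumption: about four characters per token.
    return max(1, len(text) // 4)


def naive_summary(older: List[Message]) -> str:
    # Placeholder summarizer (hypothetical); a real system would ask a model.
    return f"[summary of {len(older)} earlier messages]"


def compact_history(
    messages: List[Message],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Replace older messages with one summary when over the token budget."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")

    total = sum(estimate_tokens(m.text) for m in messages)
    # Nothing to do if we are within budget or there is nothing old to fold.
    if total <= budget or len(messages) <= keep_recent:
        return list(messages)

    older = messages[:-keep_recent]
    recent = messages[-keep_recent:]
    summary = Message(role="user", text=summarize(older))
    return [summary] + list(recent)


if __name__ == "__main__":
    history = [Message("user", "x" * 400) for _ in range(10)]
    compacted = compact_history(history, budget=500, keep_recent=3,
                                summarize=naive_summary)
    print(len(history), "->", len(compacted), "messages")
```

The trade-off in any scheme like this is that detail in the summarized turns is lost, so the number of recent turns kept verbatim and the quality of the summary matter more than the exact budget.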

Office integration expands as well. Claude in Excel can now handle long, multi-step tasks, with improved performance and better reasoning over structured data. Claude in PowerPoint is now available as a research preview, and users can convert Excel outputs into on-brand slides automatically. In Claude Code, agent teams can now run tasks in parallel, coordinating autonomously on large, read-heavy workloads.

Claude Opus 4.6 is available today via claude.ai, the API, and all major cloud platforms, giving developers and enterprise teams access to a more capable model for complex tasks.
