
Claude 1M Context Goes GA: Opus & Sonnet 4.6 Get Full Window at Standard Pricing

Anthropic removes the long-context pricing premium. Opus 4.6 and Sonnet 4.6 now support 1 million tokens at standard rates — no beta headers, no rate limit penalties.

Rai Ansar
Updated Mar 16, 2026
5 min read

Anthropic released 1 million token context windows for Claude Opus 4.6 and Sonnet 4.6 on March 13, 2026. Both models now process up to 1M tokens at standard pricing with no premium multiplier or rate limit penalties.

What changed for Claude 1M context GA on March 13, 2026?

Anthropic removed the 2x input price multiplier for Sonnet 4's long-context requests and added 1M context to Opus 4.6. Both models now process 1M tokens at standard rates with no premium pricing.

Sonnet 4's 1M context previously carried a 2x input price multiplier for prompts exceeding 200K tokens during public beta. Opus 4.6 lacked 1M context entirely.

Both flagship models now offer the full 1M window at their standard rates:

Model         Input Price   Output Price   Context Window
Opus 4.6      $5/MTok       $25/MTok       1,000,000 tokens
Sonnet 4.6    $3/MTok       $15/MTok       1,000,000 tokens

Users pay the same per-token rate whether their request uses 9,000 tokens or 900,000 tokens. This eliminates cost penalties for processing large codebases, legal documents, or extended agent sessions.
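Flat pricing makes request costs a straight linear function of token counts. A minimal sketch using the rates from the table above (the function name and structure are illustrative, not part of any SDK):

```python
# Standard per-million-token rates from the pricing table above.
# Flat pricing: no surcharge for prompts above 200K tokens.
PRICES_PER_MTOK = {
    "claude-opus-4-6": {"input": 5.00, "output": 25.00},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at standard rates."""
    rates = PRICES_PER_MTOK[model]
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000

# A 900K-token prompt with a 4K-token reply on Sonnet 4.6:
# 900_000 * $3/MTok = $2.70 input, 4_000 * $15/MTok = $0.06 output
cost = request_cost("claude-sonnet-4-6", 900_000, 4_000)  # 2.76
```

Because the rate is constant, a 900K-token request costs exactly 100x a 9K-token request, nothing more.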

How do Claude Opus and Sonnet perform on long-context benchmarks?

Opus 4.6 scores 78.3% on MRCR v2 at 1M tokens and Sonnet 4.6 scores 68.4% on GraphWalks BFS at 1M tokens. Both achieve the highest scores in their respective model classes.

Anthropic expanded the context window while maintaining accuracy across long sequences. MRCR (Multi-Round Coreference Resolution) tests whether models track entities and relationships across massive contexts. This capability proves essential when processing entire codebases or 500-page contracts.

GraphWalks BFS measures a model's ability to navigate complex information structures within long contexts. Sonnet 4.6's 68.4% score demonstrates reliable performance on structured data analysis tasks.

What does 1 million tokens represent in practical terms?

1 million tokens equals approximately 750,000 words of text, more than 75,000 lines of code, or 600 images or PDF pages in a single request. The media capacity is a 6x increase over the previous 100-page limit.

The token capacity translates to specific use cases:

  • 10 full-length novels worth of text content

  • Complete codebases with cross-file dependency tracking

  • Entire contract sets or research paper collections

  • Thousands of pages of documentation

The media capacity increase from 100 to 600 pages per request significantly impacts enterprise document analysis workflows.
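The article's figures imply rough conversion ratios (about 1.33 tokens per word, 13.3 per code line, 1,667 per PDF page). A back-of-the-envelope sketch using those ratios; the function and its headroom logic are illustrative, and real token counts vary by content:

```python
# Rough capacity heuristics derived from the article's figures:
# 1M tokens ~ 750K words ~ 75K code lines ~ 600 PDF pages.
TOKENS_PER_WORD = 1_000_000 / 750_000       # ~1.33
TOKENS_PER_CODE_LINE = 1_000_000 / 75_000   # ~13.3
TOKENS_PER_PDF_PAGE = 1_000_000 / 600       # ~1,667

def fits_in_context(words: int = 0, code_lines: int = 0,
                    pdf_pages: int = 0, budget: int = 1_000_000) -> bool:
    """Estimate whether mixed content fits in the 1M-token window."""
    estimated = (words * TOKENS_PER_WORD
                 + code_lines * TOKENS_PER_CODE_LINE
                 + pdf_pages * TOKENS_PER_PDF_PAGE)
    return estimated <= budget

fits_in_context(words=600_000)   # True: ~800K estimated tokens
fits_in_context(pdf_pages=700)   # False: exceeds the 600-page capacity
```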

How does 1M context improve Claude Code performance?

Claude Code users experience 15% fewer compaction events with 1M context. Users can search, re-search, and refactor entire codebases without losing conversation history or model context.

Claude Code previously consumed 100K+ tokens searching codebases, triggering compaction processes that summarized earlier conversations. This compression caused the model to lose nuance from earlier searches and forget edge cases discovered minutes earlier.

The 1M context window allows continuous exploration of entire codebases without compaction interruptions. Users load complete repositories, explore thoroughly, and receive comprehensive fixes while the model maintains full awareness of all discovered patterns and edge cases.

Where can developers access Claude 1M context today?

1M context is available on Claude Platform API, Microsoft Azure Foundry, Google Cloud Vertex AI, and Claude Code for Max/Team/Enterprise plans. No beta headers or special model versions are required.

Platform availability includes:

  • Claude Platform API with standard endpoints

  • Microsoft Azure Foundry integration

  • Google Cloud Vertex AI deployment

  • Claude Code with automatic enablement on paid plans

Full throughput rate limits apply at every context length. Developers experience no reduced rates for long-context requests compared to shorter inputs.
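Since no beta header is required, a long-context request is just a standard Messages API call. A sketch of the request shape (built as a plain dict so it reads without the SDK; actually sending it would use the `anthropic` Python package and an API key, and the contract prompt is a made-up example):

```python
# Sketch: a long-context request uses the ordinary Messages API
# payload. No "anthropic-beta" header or special model variant
# is attached anywhere.
long_document = "..."  # up to roughly 1M tokens of source text

request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 4_096,
    "messages": [
        {
            "role": "user",
            "content": (
                "Summarize the key obligations in this contract:\n\n"
                + long_document
            ),
        }
    ],
}
# With the SDK this would be sent as: client.messages.create(**request)
```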

How does Claude's pricing compare to other AI providers?

Claude offers the only frontier model family with 1M context across flagship models at flat pricing. GPT-5 provides 256K tokens maximum while Gemini 2.5 Pro uses tiered pricing for longer contexts.

Provider            Max Context    Long-Context Premium
Claude Opus 4.6     1M tokens      None (standard pricing)
Claude Sonnet 4.6   1M tokens      None (standard pricing)
GPT-5               256K tokens    N/A
Gemini 2.5 Pro      1M tokens      Tiered pricing applies

Gemini 2.5 Pro matches Claude's 1M token window but implements tiered pricing structures that increase costs for longer contexts. Claude maintains consistent per-token rates regardless of context length.
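The difference between the two schemes is easy to see in code. A sketch comparing flat pricing against a tiered scheme; the $2/$4 tiered rates are hypothetical stand-ins, not Gemini's actual prices, and the 200K threshold mirrors the tier boundary style described above:

```python
# Flat pricing: one rate at every context length.
def flat_input_cost(tokens: int, rate_per_mtok: float) -> float:
    return tokens * rate_per_mtok / 1_000_000

# Tiered pricing: once the prompt crosses the threshold, the
# higher rate typically applies to the entire prompt.
def tiered_input_cost(tokens: int, base_rate: float, long_rate: float,
                      threshold: int = 200_000) -> float:
    rate = long_rate if tokens > threshold else base_rate
    return tokens * rate / 1_000_000

# Sonnet 4.6 at a flat $3/MTok vs a hypothetical $2/$4 tiered scheme:
flat_input_cost(900_000, 3.0)         # 2.70 regardless of length
tiered_input_cost(900_000, 2.0, 4.0)  # 3.60 once past 200K tokens
tiered_input_cost(100_000, 2.0, 4.0)  # 0.20 while under the threshold
```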

What cost optimization features work with 1M context?

Prompt Caching reduces costs for repeated queries on the same large context, while Batch Processing provides 50% cost savings for non-time-sensitive workloads. Combined usage delivers maximum cost efficiency.

Prompt Caching stores frequently accessed large contexts like codebases. The first request loads the full context while subsequent requests reuse the cached version, reducing both latency and costs.

Batch Processing mode processes non-urgent workloads at 50% of standard pricing. Users can combine caching with batching to process massive document sets at significantly reduced costs.
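Combining the two looks like this in practice: mark the large shared context with a `cache_control` block (the Messages API's prompt-caching field), then wrap the non-urgent queries for the Message Batches API. A sketch with illustrative question strings; sending either structure requires the `anthropic` SDK and an API key:

```python
# Sketch: Prompt Caching + Batch Processing on one large context.
codebase_text = "..."  # large shared context, reused across requests

def cached_query(question: str) -> dict:
    """Build request params that cache the big context block."""
    return {
        "model": "claude-opus-4-6",
        "max_tokens": 2_048,
        "system": [
            {
                "type": "text",
                "text": codebase_text,
                # First request writes the cache; later requests
                # reuse it, cutting latency and input cost.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }

# Batch the non-urgent questions at 50% of standard pricing:
batch_requests = [
    {"custom_id": f"q-{i}", "params": cached_query(q)}
    for i, q in enumerate(["Where is auth handled?", "List all TODOs."])
]
# With the SDK: client.messages.batches.create(requests=batch_requests)
```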

How should developers adapt their AI architectures for 1M context?

Developers can eliminate document chunking, build longer-running agents, and simplify RAG architectures. Datasets under 750K words may not require RAG systems when full context loading becomes feasible.

The expanded context window enables architectural simplifications:

  1. Process entire repositories, contract sets, or research collections in single requests

  2. Maintain full execution traces in agents without summarization losses

  3. Load complete datasets directly instead of building complex retrieval pipelines

When 600 PDF pages or 75,000 lines of code fit in a single request at standard pricing, traditional chunking and retrieval strategies become unnecessary for moderate-sized datasets.
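The architectural decision reduces to a simple routing rule. A sketch of the decision the section implies, with a hypothetical headroom reserve for the model's reply:

```python
# Sketch: route small corpora to direct loading, large ones to RAG.
CONTEXT_WINDOW = 1_000_000
OUTPUT_HEADROOM = 50_000  # illustrative reserve for the reply

def needs_rag(corpus_tokens: int) -> bool:
    """True if the corpus cannot fit in one request with headroom."""
    return corpus_tokens > CONTEXT_WINDOW - OUTPUT_HEADROOM

needs_rag(700_000)    # False: load the whole corpus directly
needs_rag(3_000_000)  # True: chunked retrieval still required
```

The same check also decides when an agent can keep its full execution trace in context instead of summarizing.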

How do developers start using Claude 1M context?

API users send requests up to 1M tokens without special headers. Claude Code users update to the latest version for automatic enablement. Cloud users access standard endpoints on Azure Foundry and Vertex AI.

Implementation steps:

  1. API users: Send requests up to 1M tokens using standard endpoints

  2. Claude Code users: Update to latest version on Max/Team/Enterprise plans

  3. Cloud users: Access via Azure Foundry and Vertex AI standard model endpoints

The model IDs are claude-opus-4-6 and claude-sonnet-4-6. No new model versions or special variants are required for 1M context access.

