Claude Sonnet 4 Review: Anthropic’s New Standard
Anthropic’s Sonnet 4 is here. It’s the new default for Claude Pro users.
We’ve been testing it intensively.
What’s New
Improved Reasoning
Sonnet 4 handles complex reasoning better than its predecessor. Multi-step problems, nuanced analysis, and logical chains all improved.
Better Instruction Following
More precise adherence to detailed instructions. Less “creative interpretation” when you want exact compliance.
Enhanced Coding
Significant improvements in:
- Code generation accuracy
- Debugging explanations
- Architecture discussions
- Test writing
Faster Responses
Noticeably quicker than Sonnet 3.5, approaching Haiku speeds for many tasks.
Larger Effective Context
The context window is still 200K tokens, but Sonnet 4 uses it better and maintains coherence over longer conversations.
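To get a feel for what a 200K-token window holds, here is a rough sketch. It assumes about four characters per token, which is a common rule of thumb for English prose and not Claude's actual tokenizer. The function names and the 4,000-token reply budget are our own illustrative choices.

```python
# Rough check of whether a text fits in a context window.
# Assumption: ~4 characters per token (a rule of thumb for English prose,
# not the model's real tokenizer).
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW_TOKENS = 200_000  # figure quoted in this review


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_reply: int = 4_000) -> bool:
    """True if the estimated prompt plus a reply budget fits the window."""
    return estimate_tokens(text) + reserved_for_reply <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    manuscript = "word " * 100_000  # 500,000 characters
    print(estimate_tokens(manuscript))
    print(fits_in_context(manuscript))
```

Treat the result as a ballpark only. Code, tables, and non-English text usually tokenize less efficiently than plain English, so leave yourself some headroom.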
Testing Results
Writing
Test: Complex blog post with specific requirements
Result: Excellent. Followed format, maintained voice, hit key points.
Compared to GPT-4o: Slightly more natural prose, similar overall quality.
Analysis
Test: Business case analysis with multiple factors
Result: Nuanced, balanced, identified considerations we hadn’t prompted.
Compared to GPT-4o: More thorough, less definitive (which is appropriate).
Coding
Test: Build a feature with specific requirements
Result: Clean, well-documented code. Good architecture choices.
Compared to GPT-4o: Similar quality. GPT-4o’s code interpreter is still an advantage.
Reasoning
Test: Logic puzzles and multi-step problems
Result: Significant improvement. Gets more problems right.
Compared to GPT-4o: Comparable. Both very capable.
What We Love
1. Writing Quality
Still the best at natural, thoughtful prose. Less “AI voice.”
2. Nuanced Responses
Handles ambiguity well. Acknowledges uncertainty appropriately.
3. Speed Improvements
Faster responses make conversations more natural.
4. Instruction Precision
Does what you ask more reliably.
5. Consistency
Less variation in output quality.
What’s Still Missing
No Code Execution
GPT-4o can run code. Claude still can’t.
No Image Generation
DALL-E integration gives ChatGPT an edge for visual creators.
No Real-Time Information
Still no native browsing. Perplexity fills this gap.
Ecosystem Size
Fewer integrations than ChatGPT.
Pricing
Same as before:
- Free: Access with limits
- Pro ($20/month): Higher limits, priority access
- API: Per-token pricing
No price increase with capability increase. That’s good.
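If you are weighing the API against Pro, per-token pricing is simple arithmetic. The sketch below shows the calculation. The rates in it are placeholders we picked for illustration, not quoted prices, so check Anthropic's pricing page for current numbers. The `Rates` class and `call_cost` function are hypothetical names.

```python
# Estimate the cost of one API call under per-token pricing.
# The rates used in the example are placeholders, not quoted prices.
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    input_per_million: float   # USD per 1M input tokens (hypothetical)
    output_per_million: float  # USD per 1M output tokens (hypothetical)


def call_cost(input_tokens: int, output_tokens: int, rates: Rates) -> float:
    """Return the USD cost of a single request."""
    return (
        input_tokens / 1_000_000 * rates.input_per_million
        + output_tokens / 1_000_000 * rates.output_per_million
    )


if __name__ == "__main__":
    example = Rates(input_per_million=3.0, output_per_million=15.0)
    cost = call_cost(input_tokens=2_000, output_tokens=500, rates=example)
    print(f"${cost:.4f} per call")
    print(f"{20 / cost:.0f} calls for the price of one month of Pro")
```

Input and output tokens are billed at different rates, so long replies cost more than long prompts of the same size.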
Who Should Use Sonnet 4
Writers
If writing quality matters, Sonnet 4 is the best general-purpose choice.
Analysts
For nuanced analysis without false confidence.
Professional Users
Improved instruction following makes it more dependable for work tasks.
Developers
Competitive with GPT-4o for most coding tasks.
Sonnet 4 vs Competition
| Aspect | Sonnet 4 | GPT-4o | Gemini Pro |
|---|---|---|---|
| Writing | Best | Great | Good |
| Coding | Great | Best | Good |
| Speed | Fast | Fast | Fastest |
| Context (tokens) | 200K | 128K | 1M |
| Ecosystem | Smaller | Largest | Google-focused |
| Price | Similar | Similar | Similar |
Our Verdict
Sonnet 4 is an excellent evolution. It isn’t revolutionary, but Anthropic is iterating effectively.
Use Sonnet 4 for:
- Writing of all kinds
- Analysis requiring nuance
- Professional communication
- Coding (competitive option)
Consider alternatives if you need:
- Code execution (GPT-4o)
- Image generation (GPT-4o)
- Massive context (Gemini)
Rating: 9/10
Claude remains our first choice for writing. Sonnet 4 cements that position.
Anthropic continues to refine what they do best. Sonnet 4 is the result.
Frequently Asked Questions
Is Sonnet 4 better than GPT-4o?
For writing and nuanced analysis, yes. For coding with real-time execution and broad integrations, GPT-4o has advantages. They’re both excellent for different uses.
Is Claude Pro worth it?
If you use Claude daily for work, yes. The higher limits and priority access are worth $20/month. For occasional use, the free tier is sufficient.