Anthropic has released Claude 3.5 Sonnet, the first model in its new 3.5 generation, claiming it surpasses both GPT-4o and its own Claude 3 Opus in intelligence while delivering twice the speed at one-fifth the cost.
Performance Breakthrough
The company reports that Claude 3.5 Sonnet achieves state-of-the-art results across multiple benchmarks, setting new standards for what a mid-tier priced model can accomplish:
- Graduate-level reasoning (GPQA): 59.4%, surpassing Claude 3 Opus and GPT-4o
- Coding proficiency (HumanEval): 92.0%, a significant leap over previous models
- Math problem-solving (MATH): 71.1%, demonstrating improved reasoning
- Multilingual math (MGSM): 91.6%, showing strong cross-language capabilities
“We’ve made substantial improvements in reasoning, knowledge, and coding ability,” said Dario Amodei, CEO of Anthropic. “Claude 3.5 Sonnet represents what we believe AI assistants should be: highly capable, fast, and safe.”
Vision and Artifact Capabilities
Alongside raw intelligence improvements, Claude 3.5 Sonnet introduces enhanced vision capabilities. The model can analyze charts, graphs, and complex images with improved accuracy, making it particularly valuable for data analysis and document processing tasks.
Anthropic also unveiled “Artifacts,” a new feature in Claude.ai that allows the model to generate interactive content including code, documents, and visualizations in a dedicated window alongside the conversation.
Speed and Pricing
Claude 3.5 Sonnet operates at roughly twice the speed of Claude 3 Opus while maintaining lower pricing:
- Input: $3 per million tokens
- Output: $15 per million tokens
This positions it as a compelling option for enterprise applications requiring both high performance and cost efficiency.
Safety Measures
True to Anthropic’s mission, Claude 3.5 Sonnet includes updated safety features. The company reports rigorous red-teaming and evaluation processes, with the model designed to be helpful while avoiding harmful outputs. Internal testing indicates improvements in refusing inappropriate requests while remaining useful for legitimate tasks.
Developer Access
Claude 3.5 Sonnet is available immediately through Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI. The company has also updated its free Claude.ai interface to use the new model as the default.
Industry Response
Early adopters have praised the model’s coding abilities in particular, with several developers reporting that it generates more accurate and maintainable code than previous options. The combination of speed, intelligence, and competitive pricing has led some analysts to describe it as the new default choice for AI-assisted development.
Future Releases
Anthropic indicated that Claude 3.5 Opus, expected to offer even greater capabilities, will follow later in 2024. The company continues to pursue its research agenda around AI safety and beneficial AI development.