The Rise of Claude 3.5 Sonnet: A New Era in Large Language Models
The AI community is abuzz over the release of Anthropic’s Claude 3.5 Sonnet, a large language model (LLM) that has taken the performance crown from OpenAI’s GPT-4. The new model has not only outperformed its predecessors but has also shown its strength across a variety of tasks, leaving many AI influencers and power users in awe.
Advancing Coding Skills and Product Creation
One of the most impressive aspects of Claude 3.5 Sonnet is how easily it produces complex code and working products. Enterprise AI influencer and expert Allie K. Miller demonstrated this by asking the model to create a playable game based on a screenshot, which it did in under half a minute. The model worked out what the screenshot showed and returned a fully functional game.
“This is wild.” - Allie K. Miller
Similarly, the X account @TestingCatalog News showed the newly launched “Artifacts” playground running real, working web forms that Claude 3.5 Sonnet built.
Recreating Imagery from the Past
Claude 3.5 Sonnet’s capabilities don’t stop at coding. It has also recreated imagery from the past, including a 3D scene from the 1995 movie Hackers.
Putting Pressure on OpenAI
The release of Claude 3.5 Sonnet puts pressure on OpenAI to keep making the case that its models are the right choice. With Anthropic’s model available at similar pricing, OpenAI now has to deliver.
“Hey, @OpenAI. You sleep through AGI. While you make promises all the time… and announce without delivering… the competition manages to deliver without making big announcements beforehand!” - @kimmonismus
Still Not Human Level
Despite the lofty praise on X, others noted that Claude 3.5 Sonnet still struggles with some basic cognitive tasks that humans perform with relative ease, such as playing tic-tac-toe.
However, even with these minor issues, Claude 3.5 Sonnet appears to be a tremendous leap for Anthropic and for LLMs generally. It also shows that the performance gains of individual AI model makers are not slowing down at current levels of available compute.