Beyond OpenAI: Exploring the World of Large Language Model APIs
The rise of AI-powered features in B2B applications has been nothing short of phenomenal. At CommandBar, we’re no exception, relying on large language model (LLM) APIs to build embedded user assistance agents for 20 million end users. After 11 months of using OpenAI’s APIs, we’re now venturing out to explore alternative models, driven by the need to lower costs, improve performance, and gain more control over data privacy.
7 Large Language Models Not From OpenAI
- Claude
- Llama
- Mistral
- Cohere
- BERT
- ERNIE
- Gemini
The dominance of OpenAI’s models in the B2B world is undeniable, with a study by venture capital firm a16z revealing that enterprises predominantly use OpenAI’s models to power their AI features and apps. However, this dominance may not last: companies are now testing multiple models to find OpenAI alternatives that can help them save costs, gain more control, and avoid vendor lock-in.
More Than Just OpenAI: Exploring Alternative LLM APIs
OpenAI is not the only provider of LLM APIs. Anthropic offers its Claude models via an API, as does Mistral for its proprietary models. Google’s Gemini is also accessible via API. Through hosting providers like Replicate and Hugging Face, you can access a variety of open-source LLMs, including Mistral’s Mixtral models and Meta’s Llama models.
Which AI API Should You Use?
When choosing an AI API, there are three core metrics to consider: price, quality, and speed. The ideal API will depend on your specific use case and requirements. For instance, if you’re building an AI writing tool, you may prioritize quality and speed over price. On the other hand, if you’re using an LLM to automatically rename downloaded files, you may be more concerned with price and speed.
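One way to make this trade-off concrete is to score each candidate API against the three metrics with weights that reflect your use case. The sketch below is illustrative only: the model names, prices, latencies, quality scores, and weights are made-up assumptions, not benchmark data.

```python
# Minimal sketch: weighted scoring of candidate APIs on price, quality,
# and speed. All numbers here are hypothetical placeholders.

def score_api(metrics: dict, weights: dict) -> float:
    """Weighted score: higher is better. Price and latency are inverted
    so that cheaper and faster APIs score higher."""
    return (
        weights["quality"] * metrics["quality"]
        + weights["price"] * (1.0 / metrics["price_per_1k_tokens"])
        + weights["speed"] * (1.0 / metrics["latency_s"])
    )

candidates = {
    "model_a": {"quality": 0.9, "price_per_1k_tokens": 0.03, "latency_s": 2.0},
    "model_b": {"quality": 0.7, "price_per_1k_tokens": 0.002, "latency_s": 0.8},
}

# An AI writing tool might weight quality heavily...
writing_weights = {"quality": 10.0, "price": 0.001, "speed": 0.5}
# ...while a file-renaming utility cares mostly about price and speed.
renaming_weights = {"quality": 1.0, "price": 0.05, "speed": 2.0}

best_for_writing = max(candidates, key=lambda m: score_api(candidates[m], writing_weights))
best_for_renaming = max(candidates, key=lambda m: score_api(candidates[m], renaming_weights))
```

With these example numbers, the two use cases pick different models, which is the point: the "best" API is a function of your weights, not a fixed ranking.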
Diversifying Your AI Infrastructure
Relying on a single API to run your entire product or feature can be risky. Vendor downtime or shutdowns can impact your customer experience, making it essential to diversify your AI infrastructure. By using multiple APIs, you can mitigate vendor risk and ensure business continuity.
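In practice, the simplest form of diversification is a failover chain: wrap each provider in a callable and fall back to the next one on failure. The sketch below uses hypothetical stand-in functions rather than real SDK calls, but the pattern applies to any provider client.

```python
# Minimal failover sketch: each provider is a (name, callable) pair that
# takes a prompt and either returns text or raises. The providers below
# are hypothetical stand-ins for real API clients.

def complete_with_fallback(providers, prompt):
    """Try each provider in priority order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # in production, catch provider-specific errors
            errors.append((name, repr(exc)))
    raise RuntimeError(f"All providers failed: {errors}")

def flaky_primary(prompt):
    raise TimeoutError("primary provider is down")

def stable_backup(prompt):
    return f"answer to: {prompt}"

providers = [("primary", flaky_primary), ("backup", stable_backup)]
used, text = complete_with_fallback(providers, "rename this file")
```

Because the primary raises, the request transparently lands on the backup, and the customer-facing feature keeps working through a vendor outage.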
Orchestration Across Multiple Models
As models specialize, it’s likely that a single AI application will get the best performance by orchestrating across multiple models. For example, in our chat product, we can route queries to models based on each one’s current latency, ensuring users get the fastest responses possible. We’ve also explored orchestrating to trade off cost and quality.
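A latency-aware router can be as simple as tracking a moving average of recent response times per model and sending each new query to the currently fastest one. This is a sketch loosely inspired by the approach described above, not CommandBar's actual implementation; the model names and latency samples are invented.

```python
# Minimal sketch: route each query to the model with the lowest recent
# average latency. Sample data below is hypothetical.
from collections import defaultdict, deque

class LatencyRouter:
    """Track a sliding window of latency samples per model and pick
    the model with the lowest recent average."""

    def __init__(self, models, window=10):
        self.models = models
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, model, latency_s):
        self.samples[model].append(latency_s)

    def pick(self):
        def avg(model):
            s = self.samples[model]
            return sum(s) / len(s) if s else 0.0  # unmeasured models get tried first
        return min(self.models, key=avg)

router = LatencyRouter(["model_a", "model_b"])
for t in (1.2, 1.4, 1.1):
    router.record("model_a", t)
for t in (0.6, 0.7):
    router.record("model_b", t)
fastest = router.pick()
```

The same structure extends to cost/quality trade-offs: swap the latency average for any score that blends the metrics you care about per query.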
In conclusion, the world of LLM APIs is vast and diverse, with multiple models and providers vying for attention. By exploring alternative models and diversifying your AI infrastructure, you can avoid vendor lock-in, reduce costs, and improve performance. The future of AI applications lies in orchestration across multiple models, and it’s essential to stay ahead of the curve.