Smaller Models: The Key to Unlocking AI at the Edge

Smaller AI models could be the key to unlocking AI at the edge, enabling applications to move out of the data center and into edge locations.

The large language models (LLMs) that have been dominating the AI landscape may be about to get some competition from smaller, more agile models. These smaller models could be the perfect fit for the space, power, and compute constraints at the edge, enabling AI applications to move out of the data center and into edge locations.

“We’re seeing some of these models are shrinking in size pretty dramatically,” said Francis Chow, VP and GM for Edge and In-vehicle Operating Systems at Red Hat.

Microsoft and Google are already working on smaller, more efficient models. Microsoft has reportedly formed a new team to create a generative AI model that requires less compute power than OpenAI’s ChatGPT, while Google has unveiled its line of Gemma models, scaled-down versions of its Gemini technology that can run on a laptop.

Training at the Edge

According to Chow, these smaller models could enable training to move from data centers to edge locations. This shift is made possible not only by shrinking model sizes but also by the fact that some edge applications don’t require a full LLM to function. Keeping data at the edge, closer to where it’s generated, can also let these applications leverage more specific data than a general-purpose LLM is trained on.

The Future of Edge AI

Spending on edge compute infrastructure is expected to hit $232 billion this year and nearly $350 billion by 2027, according to IDC. This growth is driven by the increasing demand for AI applications at the edge.

“Edge computing will play a pivotal role in the deployment of AI applications,” said Dave McCarthy, research VP at IDC.

Applying AI

Enterprises are still working out what they can do with AI. The key is to develop a solid strategy and identify which solutions are mature enough to generate the right return on investment. Another key task is building an informed approach to AI governance, especially in light of efforts to regulate the technology.

“In general terms, if something has to be decided in real-time and it doesn’t take a lot of heavy analytics, that’s best run at the edge,” said Chow.

Applications of Edge AI

The applications of edge AI will vary by vertical. Financial companies might use AI for executing trades faster and more intelligently, while retail might use it for loss prevention or real-time promotions based on customer purchases.
