The Rise of Agentic Systems: Understanding the Capabilities of AI Agents
AI agents have become a pivotal topic in artificial intelligence, particularly in work on large language models (LLMs). As the field evolves, understanding and leveraging the spectrum of agentic capabilities will be crucial for developing efficient and robust LLM applications.
Defining an AI Agent
According to the LangChain Blog, definitions of what constitutes an ‘agent’ vary widely, which often leads to confusion and debate among developers and researchers. The blog notes that, under a sufficiently broad definition, even a simple system in which an LLM routes between different paths can be considered an agent.
Andrew Ng, a prominent figure in AI, suggests that instead of debating which systems qualify as true agents, it is more productive to view agent capabilities on a spectrum. This perspective aligns with how autonomous vehicles are categorized by their levels of autonomy.
The Spectrum of Agentic Behavior
The LangChain Blog elaborates on ‘agentic’ behavior, presenting it as a measure of how much an LLM determines a system’s actions. It describes several levels, from least to most agentic:
- Router: Systems that use an LLM to route inputs into specific workflows.
- State Machine: Systems that include multiple routing steps and can loop until a task is complete.
- Autonomous Agent: Highly agentic systems that build and remember tools for future steps, akin to the implementation seen in the Voyager paper.
This technical gradation helps developers design and describe LLM systems more effectively.
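The three levels above can be sketched in a few lines of Python. This is a minimal illustration, not LangChain's implementation: `fake_llm`, `WORKFLOWS`, `route`, `run_state_machine`, and `SkillLibrary` are hypothetical names, and the keyword-matching function merely stands in for a real model call.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns a label from keywords."""
    text = prompt.lower()
    if "done" in text:
        return "finish"
    if "refund" in text:
        return "billing"
    return "general"


# Router: the model picks one workflow from a fixed set, exactly once.
WORKFLOWS: dict[str, Callable[[str], str]] = {
    "billing": lambda q: f"billing workflow handled: {q}",
    "general": lambda q: f"general workflow handled: {q}",
}


def route(query: str, llm: Callable[[str], str] = fake_llm) -> str:
    choice = llm(query)
    handler = WORKFLOWS.get(choice, WORKFLOWS["general"])
    return handler(query)


# State machine: the model is consulted at every step, and the system
# loops until the model says "finish" or the step budget runs out.
def run_state_machine(
    task: str, llm: Callable[[str], str] = fake_llm, max_steps: int = 5
) -> list[str]:
    decisions: list[str] = []
    state = task
    for step in range(max_steps):
        decision = llm(state)
        decisions.append(decision)
        if decision == "finish":
            break
        # Placeholder for real work: after two steps the state reports "done".
        state = f"{task} (step {step + 1})" + (" done" if step >= 1 else "")
    return decisions


# Autonomous agent (only the "remember tools" part): skills created during
# one step are stored so that later steps can reuse them.
class SkillLibrary:
    def __init__(self) -> None:
        self._skills: dict[str, Callable[..., object]] = {}

    def learn(self, name: str, skill: Callable[..., object]) -> None:
        self._skills[name] = skill

    def use(self, name: str, *args: object) -> object:
        return self._skills[name](*args)


if __name__ == "__main__":
    print(route("I want a refund"))
    print(run_state_machine("summarize report"))
```

The point of the sketch is how much control the model has at each level: one decision in the router, a decision per loop iteration in the state machine, and decisions that extend the system's own toolbox in the autonomous case.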
The Importance of Agentic Systems
Understanding the level of agentic behavior in a system can significantly influence the development process. More agentic systems require robust orchestration frameworks, durable execution environments, and comprehensive evaluation and monitoring tools. The LangChain Blog emphasizes that as systems become more agentic, they also become more complex and harder to manage, which calls for specialized tools and infrastructure.
For instance, highly agentic systems benefit from frameworks that support branching logic and cycles, which speeds up development. They also require monitoring tools that let developers observe and modify the agent’s state or instructions in real time, so that the system stays on track.
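One way to picture "observe and modify the agent’s state" is a loop that calls a checkpoint hook before every step. The sketch below is an assumption-laden toy, not the API of any monitoring product: `AgentState`, `run_with_checkpoints`, `act`, and `redirect` are hypothetical names, and the `act` function only formats a string where a real agent would call a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentState:
    instructions: str
    step: int = 0
    log: list[str] = field(default_factory=list)


def run_with_checkpoints(
    state: AgentState,
    act: Callable[[AgentState], str],
    on_checkpoint: Optional[Callable[[AgentState], None]] = None,
    max_steps: int = 3,
) -> AgentState:
    """Run a fixed number of steps, exposing the state before each one."""
    for _ in range(max_steps):
        if on_checkpoint is not None:
            # The observer may read the state or mutate it in place.
            on_checkpoint(state)
        state.log.append(act(state))
        state.step += 1
    return state


def act(state: AgentState) -> str:
    # Placeholder for a model call that follows the current instructions.
    return f"step {state.step}: {state.instructions}"


def redirect(state: AgentState) -> None:
    # Hypothetical intervention: change the instructions before step 1 runs.
    if state.step == 1:
        state.instructions = "answer in one sentence"


if __name__ == "__main__":
    final = run_with_checkpoints(AgentState("answer in detail"), act, redirect)
    for line in final.log:
        print(line)
```

Because the hook runs before each step, a change made at a checkpoint takes effect in that same step and all later ones.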
New Tooling for Agentic Systems
The increasing complexity and capabilities of agentic systems have driven the need for new tools and infrastructure. LangChain has developed LangGraph for agent orchestration and LangSmith for testing and observability of LLM applications. These tools are designed to support the unique requirements of highly agentic systems.
CriticGPT: A New Era in AI Error Detection
OpenAI researchers have introduced CriticGPT, a new artificial intelligence model designed to detect and critique errors in ChatGPT-generated code. The model aims to improve the alignment of AI systems with human expectations by supporting reinforcement learning from human feedback (RLHF), a technique for improving the accuracy of LLM output.
CriticGPT serves as an assistant for the human trainers who review ChatGPT-generated code. Built on the GPT-4 family of models, it analyzes code and highlights potential errors, helping reviewers catch bugs that might otherwise go unnoticed.
The development of CriticGPT involved training the model on numerous inputs containing intentional errors. Human trainers modified the code written by ChatGPT, introduced errors, and provided feedback as if they had discovered the errors themselves. This process allowed the model to learn to identify and critique different types of coding errors.
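The shape of one such training example can be sketched as a record that pairs the original code, the version with an inserted bug, and the trainer's critique. This is an illustrative guess at the data layout, not OpenAI's actual format: `TamperedSample` and its fields are hypothetical, and the standard-library `difflib.ndiff` is used only to show which lines the trainer changed.

```python
import difflib
from dataclasses import dataclass


@dataclass
class TamperedSample:
    original: str   # code as the model wrote it
    tampered: str   # same code after a trainer inserted a bug
    critique: str   # feedback written as if the bug had just been found

    def changed_lines(self) -> list[str]:
        """Return the removed ("- ") and added ("+ ") lines of the edit."""
        diff = difflib.ndiff(
            self.original.splitlines(), self.tampered.splitlines()
        )
        return [line for line in diff if line.startswith(("- ", "+ "))]


sample = TamperedSample(
    original="def mean(xs):\n    return sum(xs) / len(xs)",
    tampered="def mean(xs):\n    return sum(xs) / (len(xs) - 1)",
    critique="The divisor should be len(xs); subtracting 1 skews the mean.",
)

if __name__ == "__main__":
    for line in sample.changed_lines():
        print(line)
```

Keeping the original alongside the tampered version makes it possible to check whether a critique actually points at the inserted bug.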
Project Indus: A Breakthrough in Indigenous LLMs
Tech Mahindra has launched Project Indus, an indigenous large language model (LLM) that can converse in Indic languages and dialects, including Hindi. The LLM will be implemented using a Generative Artificial Intelligence (GenAI)-in-a-box framework to simplify the deployment of advanced AI models for enterprises.
The aim is to create multiple tailored use cases and let customers build applications for customer support, customer experience, and content creation across industries including healthcare, rural education, banking and finance, agriculture, and telecom.