LLM Agents, Part 3 - Multi-Agent LLM Products: A Design Pattern Perspective
In this article, we will explore how established software design principles can be applied to the emerging trend of multi-agent large language model (LLM) systems.
Why write this?
I see lots of “multi-agent” frameworks out there and I, personally, think most of them are nonsense. They are nonsense because they paint a rosier picture than what it really takes to build extremely complex intelligent software systems. For example, they claim that if you get a few LLMs to talk to each other in natural language, you have a software system that robustly solves your complex business problems. Or that if you throw a large crew of LLMs at a problem, they can reliably do sales, marketing, and operations for your business. I think the creators of these frameworks (and perhaps those who get excited about them) either have never written serious software or are just interested in the academic exercise of “what if” rather than building anything that can actually go into production.
Starting from principles is such an important thing to do when proposing and building a new complex framework, and I’m utterly surprised by how unimportant it seems in many of the proposed frameworks. Hopefully, I have convinced you that going back to the basics of RL is important in thinking through agentic workflows. In this article, my attempt is to convince you that going back to software design principles is the way to go about creating multi-agent systems.
Software Architecture and ML
We will examine how traditional software design patterns, such as Domain-Driven Design (DDD), Service-Oriented Architecture (SOA), and microservices architecture, contribute to the development of multi-agent LLM systems.
Traditional design patterns provide a robust framework for software development. By integrating machine learning (ML) into these patterns, we can introduce a new dimension to software architecture. ML enables probabilistic routing between software components, replacing pre-programmed deterministic routing. This integration not only enhances the functionality of individual components but also introduces new capabilities. Both LLMs and specialized ML models, and often a combination of the two, can be utilized to achieve these improvements.
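To make the contrast concrete, here is a minimal Python sketch of deterministic versus probabilistic routing. Everything in it is hypothetical: the component names, the keyword lists, and `score_routes`, which only stands in for whatever ML model or LLM would produce route probabilities in a real system.

```python
import random
from typing import Callable, Dict


# Hypothetical handlers standing in for software components.
def billing_component(msg: str) -> str:
    return f"billing handled: {msg}"


def support_component(msg: str) -> str:
    return f"support handled: {msg}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "billing": billing_component,
    "support": support_component,
}


def deterministic_route(msg: str) -> str:
    """Pre-programmed routing: a fixed keyword rule picks the component."""
    return "billing" if "invoice" in msg.lower() else "support"


def score_routes(msg: str) -> Dict[str, float]:
    """Stand-in for a model that returns one probability per route.

    A real system would call a classifier or an LLM here. This toy version
    counts keyword hits (plus a baseline of 1) and normalizes to sum to 1.
    """
    keywords = {
        "billing": ("invoice", "refund", "charge"),
        "support": ("error", "login", "help"),
    }
    text = msg.lower()
    raw = {
        route: 1.0 + sum(word in text for word in words)
        for route, words in keywords.items()
    }
    total = sum(raw.values())
    return {route: value / total for route, value in raw.items()}


def probabilistic_route(msg: str, rng: random.Random) -> str:
    """Sample a route in proportion to the model's probabilities."""
    scores = score_routes(msg)
    routes = list(scores)
    weights = [scores[route] for route in routes]
    return rng.choices(routes, weights=weights, k=1)[0]
```

The deterministic router always gives the same answer for the same message. The probabilistic one can pick different components for the same input, which is where both the flexibility and the debugging difficulty come from. Passing a seeded `random.Random` keeps the behavior reproducible in tests.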
The incorporation of LLMs into software systems brings a broad range of benefits, making them more dynamic and flexible. These systems can exhibit a diverse set of behaviors without the need for explicit programming, which offers a significant advantage. However, this flexibility comes at a cost: such systems can be harder to predict, maintain, and debug reliably.
Communication methods between components, previously services and more recently agents, remain consistent with traditional approaches, using REST, GraphQL, JSON, and DSLs. However, the introduction of natural language as an interface adds a new layer of complexity, with its own set of advantages and challenges. These hybrid systems, combining predetermined and probabilistic behavior, may become the new standard in software development.
In the following sections, we will delve deeper into the concepts of DDD, SOA, and microservices architecture. We will explain how DDD focuses on modeling software based on real-world domains, with limited and well-defined data sharing between domains. We will also explore the benefits and challenges of this new approach, drawing parallels to successful microservices implementations to illustrate suitable use cases.
Domain-Driven Design: The Foundation
DDD emphasizes modeling software around the core domain of a business. It advocates for a common language (what DDD calls a ubiquitous language) shared by developers and domain experts, so that conversations and code use the same terms. DDD breaks down the domain into bounded contexts: areas with well-defined and segregated responsibilities, and often minimal dependency on other areas.
Bounded contexts ensure that complexity is manageable by focusing on specific aspects of the domain. This focus also promotes better communication and understanding between developers and domain experts. By breaking down the domain into bounded contexts, we lay the groundwork for introducing agents with specialized capabilities, each responsible for a specific bounded context within the larger multi-agent LLM system. Just as bounded contexts promote modularity and focus within the domain, agents with bounded responsibility will do the same.
Example: E-commerce Platform
Consider an e-commerce platform. DDD could be used to define several bounded contexts:
Customer Management: Handles customer accounts, profiles, and preferences.
Product Catalog: Manages product information, categories, and pricing.
Order Processing: Processes orders, manages inventory levels, and handles payments.
Content Management: Creates and manages product descriptions, promotions, and other content.
Note that each of these is borrowed from the business domain of commerce, both to facilitate better communication between stakeholders and developers and to tap into the robust nature of tried and true business workflows. Each bounded context has its own data entities, business rules, and common language. This modular approach allows developers to focus on specific areas of functionality without getting overwhelmed by the complexity of the entire system.
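As a rough illustration of “its own data entities”, the sketch below models the same real-world product differently in two of the contexts above. The class and field names (`CatalogProduct`, `OrderLine`, `to_order_line`) are assumptions made up for this example, not part of any framework.

```python
from dataclasses import dataclass


# Product Catalog context: cares about descriptive and pricing data.
@dataclass(frozen=True)
class CatalogProduct:
    sku: str
    name: str
    category: str
    price_cents: int


# Order Processing context: cares only about what was bought and at what price.
@dataclass(frozen=True)
class OrderLine:
    sku: str  # reference by identifier, not a shared object
    quantity: int
    unit_price_cents: int

    def total_cents(self) -> int:
        return self.quantity * self.unit_price_cents


def to_order_line(product: CatalogProduct, quantity: int) -> OrderLine:
    """Translate at the boundary: copy only the fields Order Processing needs."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return OrderLine(
        sku=product.sku,
        quantity=quantity,
        unit_price_cents=product.price_cents,
    )
```

The explicit translation function is the only place where the two contexts touch, so a change to catalog categories, for instance, never reaches order processing.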
From Bounded Contexts to Services
SOA takes the concept of bounded contexts from DDD and maps them to services. Each service encapsulates a specific domain functionality and exposes a well-defined interface. This promotes loose coupling, allowing services to evolve independently without impacting others.
Microservices architecture takes SOA a step further by creating even smaller, more focused services. Unlike SOA, in microservice architectures the services focus as narrowly as possible, often only on a single function. This approach offers greater agility, scalability, and resilience. Each microservice owns its data and logic, promoting independent development and deployment.
Example: E-commerce Platform - Microservices Breakdown
The platform would be composed of independent, loosely coupled microservices:
Customer Service: This service manages customer accounts, profiles, login credentials, and preferences. It would expose APIs for user registration, login, profile management, and wishlist functionalities.
Product Service: This service handles product information, including descriptions, categories, images, pricing, and availability. It would provide APIs for product search, filtering, retrieving product details, and managing inventory levels.
Recommender Service: This service handles proactive product recommendations and integrates with the Product Service for data retrieval.
Search Service: This service handles product search functionality and integrates with the Product Service for data retrieval.
Order Service: This service oversees the order processing flow. It would handle actions like adding items to the cart, managing shopping carts, initiating checkout, processing payments, and managing order fulfillment. The order service would interact with both the Product Service and the Payment Service.
Payment Service: This service handles secure payment processing, integrating with various payment gateways. It would expose APIs for initiating payments, handling authorization, and receiving transaction confirmations.
Content Management Service: This service focuses on creating and managing website content, including product descriptions, promotions, blog posts, and other informational pages. It would provide APIs for content creation, editing, and publishing.
Each microservice would expose well-defined APIs for other services to interact with. For example:
When a customer adds an item to the cart, the frontend would send an API request to the Cart functionality within the Order Service.
The Order Service might then interact with the Product Service to retrieve product details and confirm availability.
Upon checkout, the Order Service would communicate with the Payment Service to initiate the payment process.
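The interaction described in these three steps can be sketched in Python as follows. This is a simplified, in-process stand-in: in a real deployment each call would be an HTTP request to a separate service, and all class and method names here (`ProductAPI`, `PaymentAPI`, `OrderService`, and the in-memory fakes) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol, Tuple


class ProductAPI(Protocol):
    def get_price_cents(self, sku: str) -> int: ...
    def is_available(self, sku: str, quantity: int) -> bool: ...


class PaymentAPI(Protocol):
    def charge(self, customer_id: str, amount_cents: int) -> bool: ...


@dataclass
class InMemoryProductService:
    prices: Dict[str, int]
    stock: Dict[str, int]

    def get_price_cents(self, sku: str) -> int:
        return self.prices[sku]

    def is_available(self, sku: str, quantity: int) -> bool:
        return self.stock.get(sku, 0) >= quantity


@dataclass
class RecordingPaymentService:
    charges: List[Tuple[str, int]] = field(default_factory=list)

    def charge(self, customer_id: str, amount_cents: int) -> bool:
        self.charges.append((customer_id, amount_cents))
        return True  # this fake approves every payment


@dataclass
class OrderService:
    products: ProductAPI
    payments: PaymentAPI
    carts: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def add_to_cart(self, customer_id: str, sku: str, quantity: int) -> None:
        """Confirm availability with the Product Service, then update the cart."""
        cart = self.carts.setdefault(customer_id, {})
        wanted = cart.get(sku, 0) + quantity
        if not self.products.is_available(sku, wanted):
            raise ValueError(f"{sku} is not available in quantity {wanted}")
        cart[sku] = wanted

    def checkout(self, customer_id: str) -> int:
        """Price the cart, ask the Payment Service to charge, return the total."""
        cart = self.carts.get(customer_id, {})
        if not cart:
            raise ValueError("cart is empty")
        total = sum(
            self.products.get_price_cents(sku) * qty for sku, qty in cart.items()
        )
        if not self.payments.charge(customer_id, total):
            raise RuntimeError("payment declined")
        self.carts[customer_id] = {}
        return total
```

The Order Service depends only on the two narrow interfaces, so either collaborator can be replaced (or faked in a test) without touching the order logic. Inventory decrement and fulfillment are left out to keep the sketch short.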
You might notice that some of the things that you’ve imagined as “multi-agent” systems could be achieved simply by a well designed software system.
The challenge with a system like this is that, although it might include some narrow scope AI components, it is ultimately passive and fairly rigid in what it can do. Combining the reliability of software written within those design principles with the flexibility of emergent capabilities offered via LLMs can be a winning formula.
Multi-Agent LLMs: A New Design Pattern
Multi-agent LLMs borrow heavily from the principles discussed above. Just like microservices, they consist of multiple, specialized agents (in addition to services), each focusing on a specific aspect of the task. These agents collaborate and leverage services to achieve a common goal, similar to how services interact through APIs.
Beyond Microservices: Active Agents vs. Passive Data Handlers
Microservices excel in building modular, scalable software systems. However, they primarily function as passive data handlers, responding to requests and manipulating data. Multi-agent LLMs, on the other hand, take a leap forward by introducing “active” components inside these services, effectively allowing them to “make decisions” in scenarios without being deterministically programmed to do so. These agents can:
Continuously monitor the situation, analyze data, and identify potential issues or opportunities.
Take initiative and perform actions without explicit instructions. This can involve initiating communication with other agents, retrieving information, or even triggering predefined workflows.
Collaborate and negotiate with each other to achieve a common goal. This allows for dynamic decision-making and adaptation to unforeseen circumstances.
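A minimal Python sketch of such an active component is shown below, under heavy assumptions: `threshold_policy` is a hand-written stand-in for an LLM-backed policy, and the observation keys, thresholds, and workflow names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Observation = Dict[str, float]
Policy = Callable[[Observation], Optional[str]]


def threshold_policy(obs: Observation) -> Optional[str]:
    """Stand-in for an LLM-backed policy: returns an action name or None."""
    if obs.get("congestion", 0.0) > 0.8:
        return "reroute_deliveries"
    if obs.get("rain_mm", 0.0) > 20.0:
        return "notify_dispatch"
    return None


@dataclass
class ActiveAgent:
    policy: Policy
    workflows: Dict[str, Callable[[], str]]  # predefined workflows it may trigger
    log: List[str] = field(default_factory=list)

    def step(self, obs: Observation) -> Optional[str]:
        """Observe, decide, and act without waiting for an explicit request."""
        action = self.policy(obs)
        if action is None:
            return None
        if action not in self.workflows:
            # Never execute something outside the predefined set.
            self.log.append(f"rejected unknown action: {action}")
            return None
        self.log.append(self.workflows[action]())
        return action
```

The agent decides on its own when to act, but it can only trigger workflows from a predefined set. That guard is one simple way to combine a probabilistic policy with deterministic software around it.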
This shift from passive data handling to active agents unlocks new possibilities:
Complex Task Automation: Multi-agent LLMs can automate complex tasks that require reasoning, planning, and collaboration across different domains. Imagine a system with a service constantly monitoring traffic patterns, augmented with an agent analyzing weather data, and a second one making decisions about rerouting deliveries to avoid congestion – a scenario beyond the pre-determined nature of microservices.
Emergent Behavior: LLMs themselves show emergent properties; they can classify text, extract entities, and more, although they are explicitly trained only to predict the next most likely token. When LLM-powered agents, fine-tuned to strengthen any of these properties or augmented with tools that give them specialized capabilities, interact with each other, non-trivial collective behavior might appear. This semantic flexibility is less controllable by nature, but combined with the reliability of JSON-based communication between software services, agentic or otherwise, it could result in systems that behave in ways human operators would expect, for example by adapting and responding to situations that were not explicitly programmed.
Continuous Improvement: The modular, and to some extent redundant, nature of multi-agent systems means that improving a single component is not the only way to improve the overall system. For example, an agent that is fine-tuned to do task decomposition effectively can help other agents do that task well by providing examples in a few-shot setup. In a better-designed system, each component inside an agent, potentially a small LM or a non-LM model, can have a feedback loop through which it is continuously retrained and improved. This could include models that are involved in the policies of individual agents or of the overall system.
Contracts, Languages, and Communication
Microservices architectures thrive on clear and well-defined communication. This communication relies on predefined API contracts, essentially agreements that dictate how services interact with each other. These contracts act like sheet music for an orchestra, ensuring each microservice plays its part seamlessly.
REST APIs and JSON are the cornerstones of these contracts. REST (Representational State Transfer) is an architectural style for requesting and receiving data between services over HTTP. JSON (JavaScript Object Notation) acts as the “language” for transmitting data, offering a lightweight and human-readable format for exchanging information.
Agentic systems use these existing mechanisms and add two more communication types:
Domain-Specific Languages (DSLs): These are custom languages tailored to a specific domain or purpose. Imagine a trading agent responsible for capital market transactions using a combination of statistics, machine learning, and business logic rules. Communicating this information in natural language is too complex and error-prone, and JSON is too limited. A DSL, for example a set of pseudocode snippets describing the logic of the rules, can be the most efficient communication contract between the controller agent and the executor agent. DSLs offer more expressiveness and efficiency than generic JSON data, but require specialized knowledge to understand and implement.
Natural Language (NL): This is the most human-like form of communication. Agents could potentially communicate and share information using natural language processing (NLP) techniques. However, natural language is inherently ambiguous and prone to misinterpretations. While offering the most flexibility, NL communication is also the least robust and requires advanced NLP capabilities to manage effectively.
Even in the realm of multi-agent systems, the established approach of API calls and JSON data exchange remains the most reliable and robust communication method. It provides a clear and well-defined path for information exchange. DSLs offer a middle ground, balancing expressiveness with control. Finally, natural language communication, while offering the most flexibility, comes with the greatest risk of misunderstandings and requires significant development effort to implement effectively. In all likelihood, a product that you design would tap into all these different communication channels between services and therefore agents to achieve the best balance between performance and control.
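To show what the first two channels could look like in code, here is a small Python sketch of a strict JSON contract and a toy DSL. Both are invented for illustration: the `RerouteRequest` fields and the `IF <metric> <op> <number> THEN <BUY|SELL> <quantity>` grammar are assumptions, not an existing standard.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RerouteRequest:
    """Hypothetical JSON contract between two agents."""
    delivery_id: str
    reason: str
    max_delay_minutes: int


def parse_reroute_request(payload: str) -> RerouteRequest:
    """Validate a JSON message against the contract and reject anything else."""
    data = json.loads(payload)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    expected = {"delivery_id": str, "reason": str, "max_delay_minutes": int}
    if set(data) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    for name, kind in expected.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(data[name], bool) or not isinstance(data[name], kind):
            raise ValueError(f"field {name!r} must be {kind.__name__}")
    return RerouteRequest(**data)


# Toy DSL for trading rules, e.g. "IF price < 100 THEN BUY 10".
RULE = re.compile(r"^IF (\w+) (<|>) (\d+(?:\.\d+)?) THEN (BUY|SELL) (\d+)$")


@dataclass(frozen=True)
class TradingRule:
    metric: str
    op: str
    threshold: float
    action: str
    quantity: int

    def decide(self, observation: dict) -> Optional[Tuple[str, int]]:
        """Return (action, quantity) if the rule fires, otherwise None."""
        value = observation[self.metric]
        fired = value < self.threshold if self.op == "<" else value > self.threshold
        return (self.action, self.quantity) if fired else None


def parse_rule(text: str) -> TradingRule:
    match = RULE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid rule: {text!r}")
    metric, op, threshold, action, quantity = match.groups()
    return TradingRule(metric, op, float(threshold), action, int(quantity))
```

In both cases a message that does not match the contract is rejected before it reaches any business logic. A natural language channel has no equivalent of that rejection step, which is a large part of why it is the least robust of the three.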
Shared Benefits and Challenges
Both multi-agent LLMs and microservices architectures offer several advantages:
Modularity: Break down complex tasks into smaller, manageable units.
Scalability: Scale individual agents or services independently based on needs.
Resilience: If the system is designed right, the failure of one agent or service doesn't cripple the entire system. The adaptability of agent policies adds some redundancy on top of that.
Independent Deployment: Deploy and update individual agents/services without affecting others.
However, both approaches also come with challenges:
Increased Complexity: Managing interactions and dependencies between agents/services requires careful planning.
Testing and Debugging: Debugging issues that span multiple agents/services can be intricate. Also, the probabilistic nature of agents can make systems built with them considerably harder to debug.
Distributed System Management: Distributing resources and ensuring consistent behavior across agents/services adds complexity.
Therefore, multi-agent systems, unlike what vendors tell you, are not a silver bullet, and choosing to solve a business problem with them comes down to a careful pros/cons analysis.
How to design multi-agent architectures?
Map the workflow humans execute to achieve a particular objective including people, processes, and tools involved.
Draw context boundaries around parts of the process that are self-contained.
You want each of these areas to have minimal data dependency on another one (if they share / exchange a lot of data they might have to be merged).
You want each of these areas to have minimal functional dependencies on another one (if most of the time a change in one requires a change in another one, they should be merged into one context).
Decide if each of these contexts is a software service or if it needs to become “agentic”, and mark it as such (“Payment Processing Service”, “Data Analyzer Agent”).
Note that each agent might contain microservices (“PDF Parser microservice”, “info retrieval microservice”)
Add any other services necessary that your architecture doesn’t explicitly contain (“User Management Service”, “Shared Memory Service”)
Determine how data flows through the system (“PDF goes from File Upload service to parsing service”, “JSON containing rewritten query and search filters goes from query analyzer service to info retrieval service”).
Revise the system modularity (context boundaries) to minimize data movement.
Determine and document communication protocols between different services and agents.
In most interfaces the protocol should be REST based and the data load should be JSON for robustness purposes.
If that fails to meet your requirements, try to use DSLs. Only if that also fails, use natural (or formal) language.
It is ok to use natural language for most of your interfaces in a quick and dirty prototype implementation, but you should remind yourself that, in all likelihood, it will not meet the reliability threshold for user facing production deployment.
Determine how you would unit test each microservice.
If conceptualizing a unit test for a microservice is too complex, it might be a sign of a need to break it into smaller pieces.
The unit test for a service would be all component unit tests passing. The unit test for an agent might be a less trivial heuristic on how all component unit tests behaved. In principle, an agent should be able to recover from some failing components; for example, it can choose a document search action instead if the Google search action is failing.
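The difference between the two kinds of test can be sketched in Python as follows. This is one possible heuristic, not the only one: the agent passes when every capability it needs is covered by at least one working component. The action names and the `run_with_fallback`, `service_passes`, and `agent_passes` helpers are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Action = Callable[[str], List[str]]


def failing_web_search(query: str) -> List[str]:
    """Simulates an external search action that is currently down."""
    raise ConnectionError("web search is unavailable")


def document_search(query: str) -> List[str]:
    return [f"internal document about {query}"]


def run_with_fallback(
    query: str, actions: Sequence[Tuple[str, Action]]
) -> Tuple[str, List[str]]:
    """Try each action in order and return the first one that succeeds."""
    errors = []
    for name, action in actions:
        try:
            return name, action(query)
        except Exception as exc:  # a failing component should not stop the agent
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all actions failed: " + "; ".join(errors))


def service_passes(component_passed: Dict[str, bool]) -> bool:
    """A service passes only if every component test passes."""
    return all(component_passed.values())


def agent_passes(
    component_passed: Dict[str, bool], capabilities: Dict[str, List[str]]
) -> bool:
    """An agent passes if each capability has at least one passing component."""
    return all(
        any(component_passed.get(component, False) for component in components)
        for components in capabilities.values()
    )
```

With the web search component failing, `service_passes` reports a failure, while `agent_passes` still succeeds as long as document search covers the search capability.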
Congratulations! You designed your first multi-agent architecture.