Architectural Trade-offs in LLM-Based Agentic Systems: Complexity, Cost, and Performance in Multi-Turn Scenarios

GRANDI, ANDREA
2024/2025

Abstract

The adoption of systems based on Large Language Models (LLMs) is transforming the artificial intelligence landscape, with growing interest in agentic architectures capable of autonomously interacting with complex environments. However, designing effective multi-agent systems raises fundamental questions about optimal architectures, communication protocols, and evaluation methodologies. This thesis compares single-agent and multi-agent architectures in multi-turn scenarios, analyzing the trade-off between architectural complexity, computational cost, and performance. It also examines the integration of emerging protocols such as the Model Context Protocol (MCP) and Agent-to-Agent (A2A) for implementing distributed systems, comparing them with traditional monolithic approaches. The analysis uses MABench, a benchmarking framework derived from $\tau$-bench, which enables systematic evaluation of different agentic strategies (single-agent, supervisor-based, swarm, and decentralized) in simulated user-agent interaction environments. The evaluation metrics, implemented through the DeepEval framework, include task completion, tool correctness, step efficiency, communication quality, and verification thoroughness. Experimental results show that more complex architectures do not necessarily guarantee superior performance, and that the optimal architecture depends strongly on the application domain and operational constraints. In particular, distributed systems based on MCP and A2A perform comparably to their local counterparts, which opens prospects for scalable enterprise deployments. This work contributes reproducible evaluation techniques and practical guidelines for selecting agentic architectures and evaluation criteria in enterprise contexts.
2024
Multi-Agent Systems
Agentic AI
MCP/A2A
Benchmark
Enterprise
Files in this item:
File: Grandi.Andrea.pdf
Access: Restricted
Size: 5.13 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14251/5404