Architectural Trade-offs in LLM-Based Agentic Systems: Complexity, Cost, and Performance in Multi-Turn Scenarios

The adoption of Large Language Model (LLM)-based systems is transforming the artificial intelligence landscape, with growing interest in agentic architectures capable of autonomously interacting with complex environments. However, the design of effective multi-agent systems raises fundamental questions about optimal architectures, communication protocols, and evaluation methodologies. This thesis compares single-agent and multi-agent architectures in multi-turn scenarios, analyzing the trade-off between architectural complexity, computational cost, and performance. It also examines the integration of emerging protocols such as Model Context Protocol (MCP) and Agent-to-Agent (A2A) for implementing distributed systems, comparing them with traditional monolithic approaches. The analysis uses MABench, a benchmarking framework derived from $\tau$-bench, which enables systematic evaluation of different agentic strategies (single-agent, supervisor-based, swarm, and decentralized) in simulated user-agent interaction environments. The evaluation metrics, implemented through the DeepEval framework, are task completion, tool correctness, step efficiency, communication quality, and verification thoroughness. Experimental results show that more complex architectures do not necessarily guarantee superior performance, and that the optimal architecture depends strongly on the application domain and operational constraints. Specifically, distributed systems based on MCP and A2A perform comparably to their local counterparts, opening prospects for scalable enterprise deployments. This work contributes reproducible evaluation techniques and practical guidelines for selecting agentic architectures and evaluation criteria in enterprise contexts.

GRANDI, ANDREA

2024/2025

Faculty/Department	Dipartimento di Ingegneria "Enzo Ferrari"
Degree programme	Artificial intelligence engineering
Academic year	2024
Keywords	Multi-Agent Systems, Agentic AI, MCP/A2A, Benchmark, Enterprise
Supervisor	CALDERARA, SIMONE
Co-examiner	FIORINI, COSIMO
Appears in types	Lauree Magistrali (Master's degree theses)

Files in this item:

File	Access	Size	Format
Grandi.Andrea.pdf	Restricted access	5.13 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14251/5404