SPECTRE: A GENERAL-PURPOSE AGENTIC KRAG ARCHITECTURE FOR DUE DILIGENCE AND DOCUMENTARY ASSESSMENT

ROTEGLIA, STEFANO
2025/2026

Abstract

The exponential growth of unstructured data in high-stakes domains such as finance, legal compliance, and procurement has created a critical need for automated systems capable of rigorous document auditing. Although Large Language Models (LLMs) offer powerful summarization capabilities, traditional Retrieval-Augmented Generation (RAG) architectures often lack the precision required for Documentary Assessment (verifying internal compliance against requirements) and Due Diligence Analysis (validating claims against external reality). This thesis presents SPECTRE, a general-purpose platform for automated auditing based on an Agentic KRAG (Knowledge-Retrieval Augmented Generation) architecture. Unlike standard RAG systems, SPECTRE takes a hybrid neuro-symbolic approach that orchestrates autonomous agents across two distinct classes of tasks: (1) Gap Analysis & Documentary Assessment, which verifies whether a document set meets specific technical or functional requirements; and (2) Factual Verification, which cross-references internal claims against a Knowledge Graph populated from external web sources. To address the lack of explainability in generative models, the framework also incorporates a dedicated Explainable AI (XAI) module. This component computes a deterministic Reliability Score and enforces a strict Citational Audit Trail, shifting the paradigm from probabilistic trust to verifiable algorithmic reliability. Although the architecture is designed to be domain-agnostic, this work validates its efficacy by focusing in depth on Venture Capital (VC) Analysis. By modeling the ecosystem of startups, founders, and market metrics, SPECTRE demonstrates the ability to automate the investment screening process with high precision. Experimental results indicate that this approach significantly reduces hallucinations compared to vector-only baselines, offering a robust and scalable framework for automated auditing across data-intensive industries.
2025
Keywords: Agentic AI, RAG, Graph, Venture Capital, Document Analysis
Files in this item:
File: TESI_MAGISTRALE_UPLOAD.pdf
Access: open access
Size: 6.14 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14251/4636