SPECTRE: A GENERAL-PURPOSE AGENTIC KRAG ARCHITECTURE FOR DUE DILIGENCE AND DOCUMENTARY ASSESSMENT
ROTEGLIA, STEFANO
2025/2026
Abstract
The exponential growth of unstructured data in high-stakes domains, such as finance, legal compliance, and procurement, has created a critical need for automated systems capable of rigorous document auditing. While Large Language Models (LLMs) offer powerful summarization capabilities, traditional Retrieval-Augmented Generation (RAG) architectures often fail to provide the precision required for both Documentary Assessment (verifying internal compliance against requirements) and Due Diligence Analysis (validating claims against external reality). This thesis presents SPECTRE, a general-purpose platform designed for automated auditing, based on an Agentic KRAG (Knowledge-Retrieval Augmented Generation) architecture. Unlike standard RAG systems, SPECTRE employs a hybrid neuro-symbolic approach that orchestrates autonomous agents to perform two distinct classes of tasks: (1) Gap Analysis & Documentary Assessment, which verifies whether a document set meets specific technical or functional requirements; and (2) Factual Verification, which cross-references internal claims with a Knowledge Graph populated from external web sources. Furthermore, to address the lack of explainability in generative models, the framework incorporates a dedicated XAI (Explainable AI) module. This component computes a deterministic Reliability Score and enforces a strict Citational Audit Trail, shifting the paradigm from probabilistic trust to verifiable algorithmic reliability. While the architecture is designed to be domain-agnostic, this work validates its efficacy through a comprehensive focus on Venture Capital (VC) Analysis. By modeling the complex ecosystem of startups, founders, and market metrics, SPECTRE demonstrates the ability to automate the investment screening process with high precision. 
Experimental results indicate that the system's approach significantly reduces hallucinations compared to vector-only baselines, offering a robust and scalable framework for automated auditing across data-intensive industries.

| File | Access | Size | Format |
|---|---|---|---|
| TESI_MAGISTRALE_UPLOAD.pdf | Open access | 6.14 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14251/4636