Multimodal grasp generation for prosthetic hands using vision and language
FINI, GIADA
2024/2025
Abstract
Hands are a crucial part of the human body because they are involved in a multitude of everyday tasks, from the most basic to the most complex. This makes the loss of a limb a deeply debilitating and life-altering experience, one that deprives an individual of one of their most important ways of interacting with the world. Hence the need to research solutions capable of restoring the previous level of freedom through a prosthetic counterpart, in a way that lets the subject feel in control while remaining as intuitive and simple as possible. The idea behind this thesis is to achieve this by leveraging two innate instruments that people rely on to express their needs: vision and speech. From a technical point of view, gaze information together with a brief textual description is used to identify the user's intent to grasp a specific object in a specific way. A pipeline containing a generative model is then employed to obtain the corresponding pose that the prosthetic hand must assume to perform the task. Finally, the resulting pose is provided to a simulated hand for validation purposes.
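The pipeline described above can be sketched at a very high level: a gaze fixation selects the target object in the scene, a short text command is mapped to a grasp type, and a generative model produces the corresponding hand pose. The sketch below is purely illustrative and is not the thesis' actual implementation: the object representation, the keyword-based "language understanding", the two-grasp taxonomy, and the lookup standing in for the generative model are all hypothetical placeholders.

```python
# Illustrative sketch of a gaze-and-language grasp pipeline.
# Every name, grasp type, and pose vector here is a placeholder assumption.
from dataclasses import dataclass

@dataclass
class DetectedObject:
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in image coordinates

def object_at_gaze(gaze, objects):
    """Return the detected object whose bounding box contains the gaze fixation."""
    x, y = gaze
    for obj in objects:
        x0, y0, x1, y1 = obj.box
        if x0 <= x <= x1 and y0 <= y <= y1:
            return obj
    return None

def grasp_type_from_text(command):
    """Naive keyword spotting as a stand-in for real language understanding."""
    command = command.lower()
    if any(word in command for word in ("pinch", "small", "precise")):
        return "precision"
    return "power"

def generate_pose(grasp_type):
    """Placeholder for the generative model: returns a toy joint-angle vector."""
    poses = {
        "precision": [0.2, 0.7, 0.7, 0.1, 0.1],  # thumb and index flexed
        "power":     [0.8, 0.8, 0.8, 0.8, 0.8],  # all fingers closed
    }
    return poses[grasp_type]

# Toy scene: two detected objects and a gaze fixation on the mug.
scene = [DetectedObject("mug", (100, 120, 180, 220)),
         DetectedObject("pen", (300, 150, 420, 170))]
target = object_at_gaze((150, 160), scene)
pose = generate_pose(grasp_type_from_text("pick up the mug firmly"))
print(target.label, pose)
```

In a real system each stub would be replaced by its learned counterpart (an eye tracker plus object detector, a language model, and the generative pose model), with the resulting joint vector sent to the simulated hand for validation.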
| File | Size | Format |
|---|---|---|
| Fini.Giada.pdf (under embargo until 02/12/2026) | 2.57 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14251/4161