Research on Multi-Modal Large Language Models and their application for the verification and validation of identity documents

dc.contributor.advisorTFE: Forcén Carvalho, Juan Ignacio
dc.contributor.affiliation: Escuela Técnica Superior de Ingeniería Industrial, Informática y de Telecomunicación [es_ES]
dc.contributor.affiliation: Industria, Informatika eta Telekomunikazio Ingeniaritzako Goi Mailako Eskola Teknikoa [eu]
dc.contributor.author: Otazu Redín, Judit
dc.date.accessioned: 2024-10-07T15:47:42Z
dc.date.issued: 2024
dc.date.updated: 2024-10-07T12:37:21Z
dc.description.abstract: In response to the exponential growth and increasing adoption of Multi-Modal Large Language Models, this project aims to explore their application in a critical field: the verification and validation of identity documents. These models, which effectively integrate image, text, video, and audio processing, are proposed as potential improvements over traditional systems specialized in specific tasks. The research will compare the effectiveness of MM-LLMs against dedicated models, including both commercial and open-source solutions, in key tasks such as classification, image quality, fraud detection, OCR (Optical Character Recognition) and Entity Mapping. Additionally, the explainability of these multimodal models will be analyzed, offering a transparent alternative to the opacity of the 'black box' typically associated with artificial intelligence. The study also recognizes and addresses the challenges that arise from the substantial hardware demands and potential latency issues inherent in these advanced systems. [en]
dc.description.degree: Máster Universitario en Ingeniería Informática por la Universidad Pública de Navarra [es_ES]
dc.description.degree: Nafarroako Unibertsitate Publikoko Unibertsitate Masterra Informatika Ingeniaritzan [eu]
dc.embargo.inicio: 2024-10-07
dc.embargo.lift: 2029-10-01
dc.embargo.terms: 2029-10-01
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: https://academica-e.unavarra.es/handle/2454/52031
dc.language.iso: eng
dc.rights.accessRights: info:eu-repo/semantics/embargoedAccess
dc.subject: Large Language Model [en]
dc.subject: Multi-Modal Large Language Model [en]
dc.subject: Natural Language Processing [en]
dc.subject: Computer Vision [en]
dc.subject: Transformers [en]
dc.subject: Embeddings [en]
dc.subject: CNN [en]
dc.subject: OCR (Optical Character Recognition) [en]
dc.subject: Document Authenticity [en]
dc.subject: Anti-spoofing [en]
dc.subject: Prompting [en]
dc.title: Research on Multi-Modal Large Language Models and their application for the verification and validation of identity documents [en]
dc.type: info:eu-repo/semantics/masterThesis
dspace.entity.type: Publication
relation.isAdvisorTFEOfPublication: 2980928f-fd1b-4d86-858b-f8c28e18b365
relation.isAdvisorTFEOfPublication.latestForDiscovery: 2980928f-fd1b-4d86-858b-f8c28e18b365

Files

Original bundle
Name: TFM_FINAL.pdf
Size: 38.53 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission