ESTUDIO Y EVALUACIÓN DE LAS ESTRATEGIAS DE EXTRACCIÓN DE CONOCIMIENTO EN MODELOS LLM PERSONALIZADOS

Valverde Gómez, Daniel

dc.contributor.advisor	Contreras Bárcena, David	es-ES
dc.contributor.author	Valverde Gómez, Daniel	es-ES
dc.contributor.other	Universidad Pontificia Comillas, Escuela Técnica Superior de Ingeniería (ICAI)	es_ES
dc.date.accessioned	2023-10-26T11:20:39Z
dc.date.available	2023-10-26T11:20:39Z
dc.date.issued	2024	es_ES
dc.identifier.uri	http://hdl.handle.net/11531/84261	es_ES
dc.description	Grado en Ingeniería en Tecnologías de Telecomunicación	es_ES
dc.description.abstract	El auge del desarrollo de las tecnologías de inteligencia artificial (IA) ha impulsado numerosas innovaciones en el campo del procesamiento del lenguaje natural. Sin embargo, los altos costes computacionales asociados con los modelos de lenguaje de gran tamaño (LLM) representan un desafío considerable para su adopción a gran escala. Para abordar este problema, los modelos de código abierto emergen como una solución viable, ofreciendo capacidades similares a las de un modelo de IA comercial a un coste mucho menor. Este proyecto se centra en el estudio de diferentes modelos de código abierto, analizando el rendimiento y las especificaciones de cada uno de ellos para seleccionar aquel que ofrezca los mejores resultados. Los modelos estudiados son Falcon, Llama, Mistral, MPT y Qwen, y el seleccionado es Llama3-8B. La selección del modelo se basa en sus benchmarks, la valoración de la comunidad y sus especificaciones. Con el modelo seleccionado se llevan a cabo dos pruebas de concepto (PoC) para probar sus capacidades. En la primera se realiza un ajuste fino (fine-tuning) de los parámetros del modelo en un conjunto de datos de preguntas y respuestas reales entre pacientes y doctores para dotar al modelo de la personalidad de un doctor y modificar su comportamiento. La segunda prueba de concepto explora la generación aumentada por recuperación (RAG), que permite combinar los conocimientos del modelo con una base de datos vectorial para mejorar sus respuestas aumentando el contexto. Se estudian tres técnicas de recuperación de documentos, denominadas naïve, parent document y multiquery, para evaluar su rendimiento en la tarea de recuperación de información de reportes médicos y elegir la técnica más adecuada para llevar a cabo dicha tarea.	es-ES
dc.description.abstract	Advances in artificial intelligence (AI) technology have driven numerous innovations in the field of natural language processing (NLP). However, the high computational costs associated with large language models (LLMs) pose a significant challenge for their large-scale adoption. To address this issue, open-source models emerge as a viable solution, offering capabilities similar to those of a commercial AI model at a much lower cost. This project focuses on studying different state-of-the-art open-source models, analyzing the performance and specifications of each to select the one that offers the best results. The models studied are Falcon, Llama, Mistral, MPT and Qwen, with Llama3-8B being the selected model. The selection is based on the models’ benchmarks, community ratings and specifications. The selected model is then used to carry out two proofs of concept (PoC) to test its capabilities. The first one involves fine-tuning the model’s parameters on a dataset of real questions and answers between patients and doctors to endow the model with a doctor’s personality and modify its behavior. The second proof of concept explores Retrieval-Augmented Generation (RAG), which combines the model’s knowledge with a vector database to enhance its responses by increasing the context. Three document retrieval techniques, named naïve, parent document, and multiquery, are analyzed to evaluate their performance in the task of retrieving information from medical reports and to select the most suitable technique for this task.	en-GB
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	es-ES	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	es_ES
dc.subject.other	KTT (GITT)	es_ES
dc.title	ESTUDIO Y EVALUACIÓN DE LAS ESTRATEGIAS DE EXTRACCIÓN DE CONOCIMIENTO EN MODELOS LLM PERSONALIZADOS	es_ES
dc.type	info:eu-repo/semantics/bachelorThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.keywords	IA, LLM, RAG, Código abierto, NLP, Llama, ajuste fino	es-ES
dc.keywords	AI, LLM, RAG, Open-Source, NLP, Llama, fine-tuning	en-GB

Files in this item

Name: TFG-Valverde Gomez, Daniel.pdf
Size: 6.180Mb
Format: PDF
Description: Bachelor's thesis (Trabajo Fin de Grado)

Name: Anexo I - Valverde Gomez, ...
Size: 139.3Kb
Format: PDF
Description: Authorization

This item appears in the following Collection(s)

KTT-Trabajos Fin de Grado

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States