Interfaz para la evaluación de los modelos LLM

González Rodríguez, Daniel

dc.contributor.advisor	Contreras Bárcena, David	es-ES
dc.contributor.author	González Rodríguez, Daniel	es-ES
dc.contributor.other	Universidad Pontificia Comillas, Escuela Técnica Superior de Ingeniería (ICAI)	es_ES
dc.date.accessioned	2024-08-20T07:09:57Z
dc.date.available	2024-08-20T07:09:57Z
dc.date.issued	2024	es_ES
dc.identifier.uri	http://hdl.handle.net/11531/92012
dc.description	Máster Universitario en Ingeniería Industrial	es_ES
dc.description.abstract	Este proyecto consiste en la creación de una interfaz para la evaluación de los Large Language Models (LLMs), que permite determinar la frecuencia de aparición de alucinaciones en las respuestas generadas por los mismos. La interfaz ha sido programada en Python, utilizando el entorno de desarrollo PyCharm, e integrando la plataforma Ollama, que simplifica la instalación e interacción con diferentes LLMs. La interfaz ofrece diferentes metodologías de evaluación de la eficacia de los modelos. Algunas de las metodologías desarrolladas se basan en la comparación con datasets, como el cálculo de los índices denominados correctness y adherencia al contexto. Otras metodologías son autosuficientes, como el cálculo de la consistencia, la utilización del diálogo entre agentes, o la programación de un retriever que facilita la evaluación de la Retrieval-Augmented Generation (RAG) del modelo.	es-ES
dc.description.abstract	This project presents an interface for evaluating Large Language Models (LLMs) that measures how frequently their generated responses contain hallucinations. The interface is written in Python, was developed in the PyCharm development environment, and integrates the Ollama platform, which simplifies installing and interacting with different LLMs. The interface offers several methodologies for evaluating the effectiveness of the models. Some of them rely on comparison with datasets, such as computing the indices known as correctness and context adherence. Others are self-sufficient, such as computing consistency, using a dialogue between agents, or programming a retriever that facilitates the evaluation of the model's Retrieval-Augmented Generation (RAG).	en-GB
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	en-GB	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	es_ES
dc.subject.other	H62-electronica (MII-N)	es_ES
dc.title	Interfaz para la evaluación de los modelos LLM	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.keywords	Large Language Models (LLM); alucinaciones; Ollama; Python; Retrieval-Augmented Generation (RAG); diálogo entre agentes; adherencia al contexto	es-ES
dc.keywords	Large Language Models (LLM); hallucinations; Ollama; Python; Retrieval-Augmented Generation (RAG); dialogue between agents; context adherence	en-GB

Files in this item

Name: TFM-Gonzalez Rodriguez, Daniel.pdf
Size: 1.550Mb
Format: PDF
Description: Master's Thesis (Trabajo Fin de Máster)

Name: AnexoI_TFM_DanielGonzalezRodri ...
Size: 30.50Kb
Format: PDF
Description: Authorization

This item appears in the following Collection(s)

H62-Trabajos Fin de Máster

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States