Internal Knowledge Chatbot for a Work Team (Chatbot de Conocimiento Interno de un Equipo de Trabajo)

Jiménez Carmona, José Antonio

Please use this identifier to cite or link to this item: http://hdl.handle.net/11531/98044

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Buero Viana, Juan Antonio	es-ES
dc.contributor.author	Jiménez Carmona, José Antonio	es-ES
dc.contributor.other	Universidad Pontificia Comillas, Escuela Técnica Superior de Ingeniería (ICAI)	es_ES
dc.date.accessioned	2025-03-13T19:59:33Z	-
dc.date.available	2025-03-13T19:59:33Z	-
dc.date.issued	2025	es_ES
dc.identifier.uri	http://hdl.handle.net/11531/98044	-
dc.description	Máster Universitario en Big Data	es_ES
dc.description.abstract	Team Copilot is a chatbot application that helps the members of a team do their work by managing a set of PDF documents and answering questions about them. The application is written in Python and exposes a FastAPI-based API with endpoints for authentication, uploading documents and asking questions. A LangGraph-based agent manages the chats, using a remote embedding model from Voyage AI and a remote LLM from Anthropic. The application uses the PyMuPDF and PyTesseract Python libraries to extract the text of the PDF documents: PyMuPDF extracts plain text and images, and PyTesseract extracts text from those images through OCR (Optical Character Recognition). The extracted text of each document is stored in a PostgreSQL database configured as a vector database with the PgVector PostgreSQL extension.	en-GB
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	es-ES	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	es_ES
dc.subject.other	H0Z	es_ES
dc.title	Chatbot de Conocimiento Interno de un Equipo de Trabajo	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.keywords	Chatbot, Document, PDF, API, FastAPI, LLM	en-GB
Appears in collections:	TFG, TFM (temporales)
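
As a rough illustration of the retrieval step implied by the abstract (text chunks stored with embeddings, then matched against a question), the following pure-Python sketch ranks stored chunks by cosine distance, the same measure pgvector offers through its `<=>` operator. The record does not state the thesis's schema, chunking strategy, or distance metric, so the names `cosine_distance`, `top_k_chunks`, and `STORE`, and the three-dimensional toy embeddings, are all assumptions made for illustration; a real deployment would obtain embeddings from the remote model and run the ranking inside PostgreSQL.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, store, k=2):
    """Return the texts of the k chunks closest to the query embedding.

    `store` is a list of (text, embedding) pairs; this stands in for a
    table with a vector column (hypothetical layout, not from the thesis).
    """
    ranked = sorted(
        store, key=lambda item: cosine_distance(query_embedding, item[1])
    )
    return [text for text, _ in ranked[:k]]


# Toy data: hand-written embeddings, not produced by any real model.
STORE = [
    ("Vacation policy", [0.9, 0.1, 0.0]),
    ("Deployment checklist", [0.1, 0.9, 0.1]),
    ("Expense reports", [0.0, 0.2, 0.9]),
]

if __name__ == "__main__":
    question_embedding = [1.0, 0.0, 0.0]  # pretend embedding of a question
    print(top_k_chunks(question_embedding, STORE, k=2))
```

The chunks returned by such a lookup would then typically be passed to the LLM as context for answering the question; how the thesis's agent does this is not described in this record.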

Files in this item:

File	Description	Size	Format
TFM - Jimenez Carmona, Jose Antonio.pdf	Master's thesis	6.93 MB	Adobe PDF
declaracion_autoria_firmada.pdf	Authorization	71.26 kB	Adobe PDF
