Speech Analytics

España Carrera, Alberto

dc.contributor.advisor	Fernandez Gallardo, Antonio	es-ES
dc.contributor.author	España Carrera, Alberto	es-ES
dc.contributor.other	Universidad Pontificia Comillas, Escuela Técnica Superior de Ingeniería (ICAI)	es_ES
dc.date.accessioned	2025-02-04T15:44:04Z
dc.date.available	2025-02-04T15:44:04Z
dc.date.issued	2025	es_ES
dc.identifier.uri	http://hdl.handle.net/11531/97238
dc.description	Master's Degree in Big Data (Máster Universitario en Big Data)	es_ES
dc.description.abstract	El proyecto tiene como objetivo analizar cómo los clientes de Telefónica España se comunican con la compañía a través del canal telefónico 1004, utilizando técnicas avanzadas de deep learning y procesamiento de lenguaje natural (NLP). El proceso parte de convertir grabaciones de audio en texto mediante distintas soluciones de transcripción automatizada. El proyecto se organiza en tres fases: obtención de audios mediante herramientas de web scraping, transcripción de los audios a texto y, finalmente, desarrollo y evaluación de modelos de machine learning para distintos casos de uso. El enfoque principal es identificar clientes insatisfechos con el fin de reducir la tasa de abandono o churn. Inicialmente se implementa un modelo con fastText y, posteriormente, se exploran arquitecturas más complejas que combinan redes convolucionales con LSTM y GRU, utilizando la biblioteca Keras. También se desarrolla un modelo basado en BERT, y se analizan alternativas que mejoren su rendimiento en español. En la fase inicial, se automatizó la descarga de audios y se realizaron transcripciones manuales para evaluar cuatro servicios de transcripción, aunque la elección definitiva quedó fuera del alcance del equipo. Se comprobó que el uso de GPUs mejora significativamente el rendimiento de los modelos, aunque se diseñó también una versión optimizada para CPUs. Como conclusión, se recomienda continuar integrando técnicas de deep learning, aunque estas requieren un elevado coste computacional. Además, se sugiere ampliar el volumen de datos disponibles para optimizar los resultados, dado que en este tipo de proyectos la cantidad de datos es clave para mejorar la eficacia de los modelos.	es-ES
dc.description.abstract	This project analyzes how Telefónica España's customers communicate with the company through its 1004 telephone channel, using deep learning and natural language processing (NLP) techniques. A key step is converting call recordings into text, for which several automated transcription solutions were evaluated. The project is structured in three phases: obtaining call audio with web scraping tools, transcribing that audio into text, and developing and assessing machine learning models for various use cases. The main focus is identifying dissatisfied customers in order to reduce churn. A first model was built with fastText, followed by more complex architectures that combine convolutional neural networks with LSTM and GRU layers, implemented with the Keras library. A BERT-based model was also developed, and alternatives that might improve its performance on Spanish text were evaluated. In the initial phase, bulk downloading of audio recordings was automated, and a set of calls was transcribed manually to compare four transcription services. One of them performed significantly worse than the others, but it proved difficult to determine objectively which of the remaining three was best. The final choice of transcription tool was, in any case, outside the project team's responsibilities. Regarding model performance, GPUs provided significant benefits. Because GPU availability is limited, a version optimized for CPUs was also designed, although the most advanced techniques still depend on graphics processing units. For future phases, the project recommends continuing to adopt deep learning techniques while acknowledging their high computational cost. It also highlights the importance of increasing the volume of downloaded and transcribed calls, since larger datasets are crucial for improving model accuracy in deep learning projects.	en-GB
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	es-ES	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	es_ES
dc.subject.other	H0Z	es_ES
dc.title	Speech Analytics	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/restrictedAccess	es_ES
dc.keywords	Inteligencia Artificial, Scraping, speech-to-text, deep learning, procesamiento del lenguaje natural, GPUs, machine learning.	es-ES
dc.keywords	Artificial Intelligence, Scraping, speech-to-text, deep learning, natural language processing (NLP), GPUs, machine learning.	en-GB

Files in this item

Name: TFM_Espana_Carrera_Alberto.pdf
Size: 1.673Mb
Format: PDF
Description: Master's thesis (Trabajo Fin de Máster)

Name: AnexoI.pdf
Size: 130.4Kb
Format: PDF
Description: Authorization

This item appears in the following collection(s)

TFG, TFM (temporales)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States