Frontiers of Large Language Models: Empowering Decision Optimization, Scene Understanding, and Summarization Through Advanced Computational Approaches [Doctoral thesis]

de Curtò i Díaz, Joaquim

View/Open

2024211232755319_Frontiers_of_Large_Language_Model.pdf (2.928Mb)

Date

2023-12-21

Author

de Curtò i Díaz, Joaquim

Status

info:eu-repo/semantics/publishedVersion

Abstract

This research explores the transformative impact of Large Language Models (LLMs) on the field of Artificial Intelligence (AI), highlighting their capacity for understanding and for autonomous, complex decision making. It is divided into four segments:

Scene Understanding with Unmanned Aerial Vehicles (UAVs): The thesis analyzes how LLMs can improve the semantic understanding of scenes captured by UAVs, using Visual Language Models and state-of-the-art object detection systems. An efficient practical implementation is proposed, and its impact is evaluated in sectors such as film and advertising.

Decision Making under Uncertainty: The use of LLMs to inform strategies in dynamic environments is investigated, focusing on the Multi-Armed Bandits (MAB) problem. The LLM-informed strategies are shown to be adaptive and to perform competitively.

Evaluation of Generative Adversarial Networks (GANs): The quality of GAN-generated images is studied using the Signature Transform as a measure of similarity between image distributions. A comprehensive analysis examines GAN convergence and goodness of fit.

Automatic Video Summarization: A new benchmark for automatic video summarization is introduced, using LLMs and the Signature Transform. An approach based on harmonic components captured by the Signature Transform is proposed and evaluated for its accuracy and for its correlation with the human notion of a good summary.

Overall, this work highlights the potential of LLMs to address complex tasks in domains such as decision optimization, scene understanding, and automatic video summarization, establishing new frontiers in the application of this technology and pointing out directions for future research.
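For readers unfamiliar with the Multi-Armed Bandits setting named in the abstract, the following is a minimal sketch of a classical epsilon-greedy baseline on Bernoulli arms. It is not the LLM-informed strategy studied in the thesis; the function name `run_epsilon_greedy`, the arm probabilities, and all parameter values are illustrative assumptions introduced here.

```python
import random


def run_epsilon_greedy(arm_probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli arms (illustrative baseline only).

    arm_probs: hypothetical success probability of each arm.
    Returns (pull counts per arm, estimated value per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated value.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Bernoulli reward drawn from the chosen arm.
        reward = 1 if rng.random() < arm_probs[arm] else 0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    counts, values, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated values:", [round(v, 3) for v in values])
    print("total reward:", total)
```

A baseline of this kind is the sort of fixed exploration rule against which an adaptive strategy could be compared; how the thesis actually couples an LLM to the bandit loop is described in the full text, not here.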

URI

http://hdl.handle.net/11531/88113

Activity Type

Monograph

Keywords

Large Language Models (LLMs); Artificial Intelligence (AI); Scene Understanding; Decision Making

Collections

Articles

Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivatives Spain (Reconocimiento-NoComercial-SinObraDerivada España)