Integration and application of reinforcement learning techniques to the IRB120 robot in the MuJoCo virtual environment

Dong, Lixiang

dc.contributor.advisor	Güitta López, Lucía	es-ES
dc.contributor.advisor	López López, Álvaro Jesús	es-ES
dc.contributor.author	Dong, Lixiang	es-ES
dc.contributor.other	Universidad Pontificia Comillas, Escuela Técnica Superior de Ingeniería (ICAI)	es_ES
dc.date.accessioned	2020-06-10T14:17:33Z
dc.date.available	2020-06-10T14:17:33Z
dc.date.issued	2020	es_ES
dc.identifier.uri	http://hdl.handle.net/11531/46855
dc.description	Máster Universitario en Ingeniería Industrial + Máster en Industria Conectada/ Master in Smart Industry	es_ES
dc.description.abstract	El aprendizaje por refuerzo se considera el tercer paradigma del aprendizaje automático, junto con el aprendizaje supervisado y el aprendizaje no supervisado. Es una clase de algoritmos en el campo del aprendizaje automático que permite a un agente aprender cómo comportarse en un entorno donde la única realimentación consta de una señal de recompensa escalar, la cual indica cómo de bien lo está haciendo en el momento inmediato. El objetivo del agente consiste en ejecutar acciones que maximicen la recompensa a largo plazo, o retorno. Si bien las técnicas de aprendizaje por refuerzo están siendo impulsadas por diversos grupos de investigación en varios ámbitos, sobre todo en los juegos de Atari y la robótica, la complejidad del movimiento de los brazos robóticos puede parecer a priori un hándicap para aplicar este proceso de aprendizaje, que requiere numerosos episodios para que el agente explore y aprenda mediante prueba y error. Sin embargo, mediante el entrenamiento en entornos simulados y su posterior transferencia al mundo real se evitan los riesgos asociados a movimientos del robot que puedan resultar en posiciones singulares o en daños al medio, y se favorece un aprendizaje más rápido, ya que se infieren los parámetros desde el modelo virtual y no se está limitado por restricciones físicas. En esta tesis se implementará el algoritmo de aprendizaje por refuerzo A3C con un modelo MuJoCo del brazo robótico IRB120 para realizar la tarea de alcanzar un objetivo en su área de trabajo.	es-ES
dc.description.abstract	Reinforcement learning is considered the third paradigm of machine learning, alongside supervised and unsupervised learning. It is a class of machine learning algorithms that allows an agent to learn how to behave in an environment where the only feedback is a scalar reward signal. The objective of the agent is to execute actions that maximize the long-term reward, or return. Although reinforcement learning techniques are being advanced by many research groups in many fields, notably Atari games and robotics, the movement complexity of robotic arms can appear to be an obstacle to applying learning techniques that require numerous episodes for the agent to explore and learn by trial and error. Nonetheless, by training in a simulated environment and later transferring the result to the real world, the risks associated with the movement of the physical robot can be avoided. Using a simulated environment also increases the learning speed, since training is not limited by physical constraints and the parameters can be inferred from the virtual model. In this thesis, the reinforcement learning algorithm A3C will be implemented using a MuJoCo model of the IRB120 robot manipulator to carry out a target-reaching task.	en-GB
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	en-GB	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	es_ES
dc.subject.other	H62-electronica (MII-N)	es_ES
dc.title	Integración y aplicación de técnicas de aprendizaje por refuerzo al robot IRB120 en el entorno virtual de MuJoCo	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/closedAccess	es_ES
dc.keywords	Aprendizaje por refuerzo, redes neuronales artificiales, aprendizaje automático, A3C, simulación, MuJoCo	es-ES
dc.keywords	Reinforcement learning, artificial neural networks, machine learning, A3C, simulation, MuJoCo	en-GB

Files in this item

Name: TFM-Dong, Lixiang.pdf
Size: 2.509Mb
Format: PDF
Description: Master's thesis

Name: AnexoI.pdf
Size: 69.65Kb
Format: PDF
Description: Authorization

This item appears in the following collection(s)

H62-Trabajos Fin de Máster

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States