CS Forum: Tuomas Haarnoja (UC Berkeley) on Acquiring Diverse Robot Skills via Deep Reinforcement Learning


CS Forum is a seminar series arranged by the CS department and is open to everyone free of charge.

07.06.2018 / 14:15 - 15:00

 


 

Speaker: Tuomas Haarnoja
Affiliation: UC Berkeley, United States

Host: Professor Jaakko Lehtinen
Time: 14:15 (coffee at 14:00)
Venue: T5, CS building, Konemiehentie 2, Espoo

 

 

Acquiring Diverse Robot Skills via Deep Reinforcement Learning

Abstract:

The intersection of expressive, general-purpose function approximators, such as neural networks, with general-purpose model-free reinforcement learning (RL) algorithms holds the promise of automating a wide range of robotic behaviors: reinforcement learning provides the formalism for reasoning about sequential decision making, while large neural networks can process high-dimensional and noisy observations to provide a general representation for any behavior with minimal manual engineering. However, applying model-free RL algorithms with multilayer neural networks (i.e., deep RL) to real-world robotic control problems has proven to be very difficult in practice: the sample complexity of model-free methods tends to be quite high, and training tends to yield high-variance results. In this talk, I will discuss how the maximum entropy principle can enable deep RL for real-world robotic applications. First, by representing policies as expressive energy-based models, maximum entropy RL leads to effective, multi-modal exploration that can reduce sample complexity. Second, maximum entropy policies promote reusability through compositionality: existing policies can be combined to create new compound policies without extra interaction with the environment. Third, policies expressed via invertible transformations lead naturally to policy hierarchies that can be used to solve sparse-reward tasks. I will demonstrate these properties in both simulated and real-world robotic tasks.
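To make the maximum entropy idea concrete: instead of acting greedily with respect to Q-values, a maximum entropy policy is a Boltzmann distribution over them, and the value function becomes a smoothed ("soft") maximum. The sketch below illustrates this standard formulation with a toy Q-table; the numbers and the temperature value are hypothetical and chosen only for illustration, not taken from the talk.

```python
import math

# Toy Q-values for a single state with three actions (hypothetical numbers).
alpha = 0.5            # temperature: weight on the entropy term
q_values = [1.0, 0.8, -0.2]

# Soft value: V(s) = alpha * log sum_a exp(Q(s, a) / alpha).
# This log-sum-exp smooths the hard max used in standard RL.
soft_v = alpha * math.log(sum(math.exp(q / alpha) for q in q_values))

# Maximum entropy policy: pi(a | s) = exp((Q(s, a) - V(s)) / alpha),
# i.e. a Boltzmann distribution over actions rather than a greedy argmax.
policy = [math.exp((q - soft_v) / alpha) for q in q_values]

# The policy is a proper distribution that keeps probability mass on
# sub-optimal actions (multi-modal exploration), and the soft value
# upper-bounds the hard max of the Q-values.
assert abs(sum(policy) - 1.0) < 1e-9
assert soft_v >= max(q_values)
```

As the temperature `alpha` approaches zero, the soft value approaches the hard max and the policy collapses to the greedy one, recovering standard RL as a limiting case.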

Bio:

Tuomas Haarnoja is a PhD candidate in the Berkeley Artificial Intelligence Research Lab (BAIR) at UC Berkeley, advised by Prof. Pieter Abbeel and Prof. Sergey Levine. His research focuses on extending deep reinforcement learning to provide flexible, effective robot control that can handle the diversity and variability of the real world. During his PhD, Tuomas has interned at Google Brain, where he developed model-free algorithms for robotic applications requiring high sample efficiency.

Before joining BAIR, Tuomas received a master's degree in Space Robotics and Automation from Luleå University of Technology, Sweden, and Aalto University, Finland, and worked as a research scientist at VTT Technical Research Centre of Finland.