Asynchronous Reinforcement Learning for Real-Time Control of Physical Robots
Abstract
An oft-ignored challenge of real-world reinforcement learning is that, unlike standard simulated environments, the real world does not pause while agents make learning updates. Because standard simulated environments do not address this real-time aspect of learning, most available implementations of deep reinforcement learning algorithms process environment interactions and learning updates sequentially. Consequently, when such implementations are deployed in the real world, they may neither act responsively nor learn efficiently. Asynchronous learning has been proposed to solve this issue, but no systematic comparison between sequential and asynchronous reinforcement learning has been conducted using real-world environments. In this thesis, we set up two vision-based tasks with a robotic arm, implement an asynchronous learning system that extends a previous architecture, and compare sequential and asynchronous reinforcement learning across different action cycle times, sensory data dimensions, and mini-batch sizes. Our experiments show that when the time cost of learning updates increases, the action cycle time in the sequential implementation can grow excessively long, while the asynchronous implementation can always maintain a fixed and appropriate action cycle time. Consequently, when learning updates are expensive, the performance of sequential learning diminishes, and asynchronous learning outperforms it by a substantial margin. Our system learns in real time to reach and track visual targets from pixels within two hours of experience, and it does so directly on real robots, learning completely from scratch.
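The effect of update cost on action cycle time can be illustrated with a minimal Python sketch. It is a simplified timing model, not the system implemented in the thesis. It assumes that a sequential agent performs one learning update per action step in the same loop, and that an asynchronous agent runs its updates in a separate process that does not compete with the acting loop for compute. The function names and all timing values are hypothetical.

```python
def sequential_cycle_time(target_s, act_s, update_s):
    """Acting and learning share one loop, so each cycle must fit both.

    Assumption: exactly one learning update per action step.
    """
    return max(target_s, act_s + update_s)


def asynchronous_cycle_time(target_s, act_s):
    """Learning runs elsewhere, so only the acting cost bounds the cycle.

    Assumption: the learner does not slow down the acting loop.
    """
    return max(target_s, act_s)


if __name__ == "__main__":
    # Hypothetical values: 40 ms target cycle, 5 ms to compute an action.
    target_s, act_s = 0.040, 0.005
    for update_s in (0.010, 0.100, 0.500):  # hypothetical update costs
        seq = sequential_cycle_time(target_s, act_s, update_s)
        asy = asynchronous_cycle_time(target_s, act_s)
        print(f"update={update_s:.3f}s  sequential={seq:.3f}s  asynchronous={asy:.3f}s")
```

Under these assumptions, the sequential cycle time tracks the update cost once acting plus updating exceeds the target, whereas the asynchronous cycle time stays at the target regardless of update cost.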
