Uncertainty Methods in Active Reinforcement Learning
Abstract
Some real-world deployments of deep reinforcement learning (RL) may require a human in the loop. Whether to ask for help, obtain new demonstrations and data, or handle out-of-distribution states, many methods rely on uncertainty estimates from a neural network to decide when to solicit a human's assistance. In existing work, it is common to use the variance of an ensemble of models as a proxy for the agent's uncertainty about taking an action; however, there has been little investigation comparing the efficacy of alternative methods. This thesis compares three methods for uncertainty estimation in the action-advising framework: bootstrapped ensembles, Monte Carlo dropout, and variance networks. Additionally, the methods are assessed on whether they produce "calibrated" uncertainty estimates. Variance networks are proposed as advantageous in the action-advising setting due to their advice efficiency and their ability to capture uncertainty about the environment dynamics.
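To illustrate one of the compared techniques, the following is a minimal sketch of Monte Carlo dropout for estimating action-value uncertainty. The network shape, weights, and sample count are illustrative assumptions, not the architecture used in the thesis: the key idea is simply that dropout is kept active at decision time, and the variance of Q-values across stochastic forward passes serves as the uncertainty signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer Q-network; sizes and weights are illustrative assumptions.
W1 = rng.normal(size=(4, 16))   # 4 state features -> 16 hidden units
W2 = rng.normal(size=(16, 2))   # 16 hidden units -> 2 action values

def mc_dropout_forward(x, p=0.5):
    """One stochastic forward pass with dropout kept ON at test time."""
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) > p       # Bernoulli dropout mask
    h = h * mask / (1.0 - p)             # inverted-dropout scaling
    return h @ W2

def q_uncertainty(x, T=100):
    """Mean and per-action variance of Q-values over T stochastic passes."""
    samples = np.stack([mc_dropout_forward(x) for _ in range(T)])
    return samples.mean(axis=0), samples.var(axis=0)

state = rng.normal(size=4)
q_mean, q_var = q_uncertainty(state)
# An action-advising agent could request help when q_var exceeds a threshold.
```

In an action-advising loop, the agent would compare `q_var` against a threshold and ask the advisor for an action when the variance is high; bootstrapped ensembles replace the T dropout passes with forward passes through separately trained ensemble members.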
