Monte Carlo Tree Search and Model Uncertainty
Abstract
Monte Carlo Tree Search (MCTS) is a popular tree search framework for choosing actions in decision-making problems. MCTS is traditionally applied to problems in which a perfect simulation model is available. When the model is imperfect, however, the performance of MCTS degrades sharply. In this work, we introduce the Uncertainty Adapted MCTS (UA-MCTS) framework, an adaptation of MCTS to model uncertainty. We define model uncertainty as the difference between the actual environment and the imperfect model. In UA-MCTS, we modify each of the four steps of MCTS (selection, expansion, simulation, and backpropagation) so that they take this uncertainty into account. Although we provide a method to learn the uncertainty of the model, UA-MCTS is not restricted to our specific learning method. In the Reinforcement Learning (RL) domain, we propose the DQ-MCTS framework. DQ-MCTS uses the values learned by DQN, a state-of-the-art model-free RL method, to improve MCTS performance. Because DQN is model-free, errors in the model do not affect the learned values. DQ-MCTS uses these values to initialize newly added nodes in the expansion step and to evaluate the final states reached in the simulation step. We experimentally evaluate UA-MCTS and DQ-MCTS on the deterministic domains from the MinAtar test suite. Our results demonstrate that UA-MCTS strongly improves MCTS in the presence of model error, and that DQ-MCTS can perform better than MCTS but not better than DQN.
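The two places where DQ-MCTS uses DQN values, as described above, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the names `Node`, `expand`, `simulate`, `model_step`, `reward_fn`, and `q_fn` are hypothetical, and the abstract does not say which Q-value seeds a new node or whether rollouts are truncated. The sketch assumes the new child is seeded with Q(s, a) of its parent, counted as one pseudo-visit, and that a random rollout of fixed depth is evaluated with the maximum Q-value at its last state.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence


@dataclass
class Node:
    state: int
    parent: Optional["Node"] = None
    children: Dict[int, "Node"] = field(default_factory=dict)
    visits: int = 0
    value: float = 0.0  # running mean of returns observed through this node


def expand(node: Node, action: int,
           model_step: Callable[[int, int], int],
           q_fn: Callable[[int], Sequence[float]]) -> Node:
    """Expansion step: add a child and seed its value from learned Q-values.

    Assumption: the seed is Q(parent state, action), counted as one pseudo-visit.
    """
    child = Node(state=model_step(node.state, action), parent=node)
    child.value = q_fn(node.state)[action]
    child.visits = 1
    node.children[action] = child
    return child


def simulate(state: int,
             model_step: Callable[[int, int], int],
             reward_fn: Callable[[int, int], float],
             q_fn: Callable[[int], Sequence[float]],
             n_actions: int, depth: int, gamma: float,
             rng: random.Random) -> float:
    """Simulation step: random rollout of `depth` steps in the (possibly
    imperfect) model, then evaluate the last state with max_a Q(s, a).

    Assumption: fixed-depth truncation and a uniformly random rollout policy.
    """
    ret, discount = 0.0, 1.0
    for _ in range(depth):
        action = rng.randrange(n_actions)
        ret += discount * reward_fn(state, action)
        state = model_step(state, action)
        discount *= gamma
    return ret + discount * max(q_fn(state))
```

Because `q_fn` comes from a model-free learner, the seed value and the rollout's final evaluation do not depend on the accuracy of `model_step`; only the intermediate rollout transitions and rewards do.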
