Policy Selection for Transfer Learning in the Building Control Domain
Abstract
The application of reinforcement learning (RL) to the optimal control of building systems has gained traction in recent years because it can reduce building energy consumption and improve human comfort without requiring knowledge of the building model. However, existing RL solutions for building control face challenges such as slow convergence and suboptimal (or unsafe) actions during training, which may lead to high energy use or excessive discomfort. Additionally, transferring RL policies to different buildings remains a hurdle.
Offline policy selection is an emerging area of RL that aims to efficiently select the best policies from a policy library for a downstream task. Previous works have shown that diversity-induced RL helps generate policies that generalize to unseen environments, often surpassing baselines even without retraining. This thesis explores various techniques for selecting a policy from a library of diverse policies to control the heating, ventilation, and air conditioning (HVAC) system of a commercial building. Its main contribution is an offline policy selection algorithm that can effectively identify the most suitable policy for transfer to an unseen building environment. The thesis also investigates how the offline dataset used for evaluation affects the results, providing insight into the efficacy of the proposed evaluation technique.
The outcomes of this research have important implications for energy conservation in the building sector. By enabling the adoption of RL-based control strategies that overcome the limitations of traditional approaches, this work could contribute to substantial reductions in energy consumption and carbon emissions. The proposed framework helps building operators achieve energy-efficient control of building systems while minimizing occupant discomfort and facilitating the transfer of policies across different buildings.
