Structural Credit Assignment in Neural Networks using Reinforcement Learning
Abstract
Structural credit assignment in neural networks is a long-standing problem, with a variety of alternatives to backpropagation proposed to allow for local training of nodes. One of the early strategies was to treat each node as an agent and use a reinforcement learning method called REINFORCE to update each node locally with only a global reward signal. In this work, we revisit this approach and investigate whether we can leverage other reinforcement learning methods to improve learning. We first formalize training a neural network as a finite-horizon reinforcement learning problem and discuss how this formulation facilitates using ideas from reinforcement learning such as off-policy learning, exploration, and planning. We then show that the standard REINFORCE approach can learn but is suboptimal due to on-policy training: each agent learns to output an activation under suboptimal action selection from the other agents. We show that we can overcome this suboptimality with an off-policy approach, and that it is particularly effective with discretized actions. We provide several additional experiments, highlighting the utility of exploration, robustness to correlated samples when learning online, and a study of the policy parameterization of each agent.
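To make the node-as-agent idea concrete, the sketch below illustrates one possible form of the per-node REINFORCE update described above: each hidden unit samples a Bernoulli activation from its local policy, and every unit updates its own weights using only the global scalar reward. The single-hidden-layer architecture, the squared-error reward, and all variable names here are illustrative assumptions, not the thesis's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network: n_hid Bernoulli "agents" feeding a linear readout.
n_in, n_hid = 4, 8
W_in = rng.normal(scale=0.1, size=(n_hid, n_in))  # each row: one agent's policy weights
w_out = rng.normal(scale=0.1, size=n_hid)         # fixed readout (assumed for illustration)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reinforce_step(x, target):
    # Each hidden node is an agent with a Bernoulli policy over its activation.
    p = sigmoid(W_in @ x)                        # firing probabilities
    h = (rng.random(n_hid) < p).astype(float)    # sampled discrete activations
    y = w_out @ h                                # network output
    r = -(y - target) ** 2                       # global scalar reward, broadcast to all agents
    # Local REINFORCE update: grad of log Bernoulli(h; p) w.r.t. logits is (h - p),
    # so each agent's weight update is reward * (h - p) * its own input x.
    W_in += lr * r * np.outer(h - p, x)
    return r
```

Under this kind of on-policy scheme, each agent's update is evaluated while every other agent is still acting suboptimally, which is the source of the suboptimality the abstract attributes to on-policy training.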
