On the Application of Continuous Deterministic Reinforcement Learning in Neural Architecture Search

Institution

http://id.loc.gov/authorities/names/n79058482

Degree Level

Master's

Degree

Master of Science

Department

Department of Electrical and Computer Engineering

Specialization

Computer Engineering

Supervisor / Co-Supervisor and Their Department(s)

Citation for Previous Publication

Link to Related Item

Abstract

Architecture evaluation is a major bottleneck of Neural Architecture Search (NAS). Recent trends have seen a shift in favor of weight-sharing networks capable of superimposing all possible candidate architectures in a search space. Nevertheless, this technique is not beyond reproach and has drawn significant criticism. Chief among these concerns is whether weight-sharing supernets can accurately represent the characteristics of a single discrete architecture when they are purposefully designed to mimic the behaviour of many.

As the cost of NAS evaluation has decreased, the complexity of search algorithms has grown. In this thesis, we explore the application of Reinforcement Learning (RL) in the problem space of weight-sharing NAS. Specifically, we focus on the usage of deterministic agents operating in a continuous action space. First, analogous to gradient-based optimization, we train both the supernet and agent simultaneously and interface them accordingly. Our agent consists of an actor-critic framework, where the actor generates architectures guided by the critic's value estimates. Rewards are calculated to encourage the selection and further improvement of high-performance architectures.
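The deterministic actor-critic loop described above can be sketched in miniature. This is an illustrative toy only: the dimensions, the linear actor, and the closed-form reward standing in for a learned critic and supernet validation signal are all assumptions for the sketch, not the thesis's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the thesis's actual state/action encodings differ.
STATE_DIM, ACTION_DIM = 4, 3

# Linear actor: maps the search state to a continuous architecture vector.
W_actor = rng.normal(scale=0.1, size=(ACTION_DIM, STATE_DIM))

# Toy stand-in for the supernet's feedback: reward peaks at a fixed
# "best architecture" TARGET. In the thesis, the reward would come from
# validating the generated architecture inside the weight-sharing supernet.
TARGET = np.array([0.5, -0.2, 0.3])

def actor(state):
    return np.tanh(W_actor @ state)

def reward(action):
    return -np.sum((action - TARGET) ** 2)

def critic_action_grad(action):
    # Gradient of the reward w.r.t. the action. Known in closed form here;
    # in an actor-critic setup it comes from a learned critic Q(s, a).
    return -2.0 * (action - TARGET)

# Deterministic policy gradient: push the actor's weights along the
# critic's action gradient, chained through the tanh nonlinearity.
lr = 0.02
state = rng.normal(size=STATE_DIM)
for _ in range(2000):
    a = actor(state)
    dq_da = critic_action_grad(a)
    W_actor += lr * np.outer(dq_da * (1.0 - a ** 2), state)

print(round(reward(actor(state)), 6))  # approaches 0 as the actor improves
```

The key property the sketch preserves is that the actor is updated through the critic's gradient with respect to the action, rather than by sampling from a stochastic policy.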

Next, we refine the efficiency of our weight-sharing supernet while decoupling its optimization from the RL agent. These reforms lower the resource cost during architecture search and remove unhelpful biases the supernet may have imposed on the agent. We adapt the RL agent to these changes by redefining the state as a statistical representation of the best architectures observed. Finally, to focus on only the highest-performing architectures, we incorporate the check loss into the critic.
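The check loss (also known as the pinball or quantile loss) has a standard form: for residual u and quantile level τ, it penalizes under- and over-estimates asymmetrically. A minimal sketch, with the function name and NumPy usage chosen for illustration rather than taken from the thesis:

```python
import numpy as np

def check_loss(residual, tau):
    """Check (pinball) loss for quantile level tau in (0, 1).

    Positive residuals are weighted by tau, negative residuals by
    (tau - 1), so a high tau makes under-prediction far more costly --
    useful when the critic should track the upper tail of performance.
    """
    residual = np.asarray(residual, dtype=float)
    return np.where(residual >= 0, tau * residual, (tau - 1.0) * residual)
```

With a high τ (e.g. 0.9), a critic trained on this loss tends toward the upper quantile of the reward distribution, which is one way a critic can be biased toward the best-performing architectures.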

Experimental results on the DARTS search space show that our first scheme is capable of generating architectures that achieve over 97% test accuracy on CIFAR-10 and 81% test accuracy on CIFAR-100. Findings indicate that the agent of our second approach achieves state-of-the-art test performance on NAS-Bench-201. Additionally, architectures generated by our second approach achieve over 97.4% test accuracy on CIFAR-10 and 75% top-1 accuracy on ImageNet.

Item Type

http://purl.org/coar/resource_type/c_46ec

Alternative

License

Other License Text / Link

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

en

Location

Time Period

Source