
Budgeted Gradient Descent: Selective Gradient Optimization for Addressing Misclassifications in DNNs


Institution

http://id.loc.gov/authorities/names/n79058482

Degree Level

Master's

Degree

Master of Science

Department

Department of Computing Science


Abstract

Artificial neural networks have become a popular learning approach for their ability to generalize well to unseen data. However, misclassifications can still occur due to data-related issues, such as adversarial inputs and out-of-distribution samples, and model-related challenges, such as underfitting and overfitting. While retraining and fine-tuning on misclassified samples are common corrective approaches, they can reduce generalizability and lead to sample memorization. In this thesis, we propose Budgeted Gradient Descent (BGD), an approach for correcting misclassifications by introducing sparse changes to network parameters. Our approach attempts to answer the question: What is the minimal set of network changes necessary to correctly predict a previously misclassified sample? BGD minimizes both the number of parameters updated and the magnitude of changes, aiming to correct misclassifications while preserving generalizability. Additionally, BGD does not require access to the training data to preserve said generalizability. We observe that sparse updates can effectively correct misclassifications while preserving learned representations, as not all gradients contribute equally to classifying difficult or out-of-distribution samples. Through empirical comparisons with existing approaches, we investigate the optimal level of sparsity for maintaining network performance and generalizability. Our results suggest that while second-order gradient updates can minimize the number of parameter changes, excessive sparsity can negatively impact the network. The contributions of this thesis include a novel approach to correcting misclassifications, insights into the relationship between parameter updates and generalizability, and a detailed examination of how different sparsity levels affect the long-term performance of neural networks in an online supervised learning setting.
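The core idea described in the abstract — correcting a single misclassified sample by updating only a small, budgeted subset of parameters, without touching the training data — can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a simple logistic-regression classifier, selects the top-k gradient entries by magnitude, and repeats the sparse update until the sample flips to the correct class. The names `sparse_correction_step` and `budget_k` are illustrative only.

```python
import math

def sparse_correction_step(w, x, y, budget_k, lr=0.5, max_steps=100):
    """Illustrative budgeted update: at each step, apply only the
    budget_k largest-magnitude gradient entries until the sample
    is classified correctly (or max_steps is reached)."""
    w = list(w)
    for _ in range(max_steps):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))        # sigmoid probability of class 1
        if (p > 0.5) == bool(y):              # prediction now correct: stop
            break
        g = [(p - y) * xi for xi in x]        # log-loss gradient w.r.t. w
        top = sorted(range(len(g)), key=lambda i: abs(g[i]))[-budget_k:]
        for i in top:                         # sparse update: k entries only
            w[i] -= lr * g[i]
    return w

# Toy example: construct a misclassified sample, then correct it sparsely.
w0 = [0.4, -1.2, 0.7, 0.1, -0.5, 0.9, -0.3, 0.2]
x  = [1.0, 0.8, -0.6, 0.3, 0.5, -0.4, 0.2, -0.9]
z0 = sum(wi * xi for wi, xi in zip(w0, x))
y = 1 if z0 < 0 else 0                        # label opposite the prediction
w1 = sparse_correction_step(w0, x, y, budget_k=2)
changed = sum(1 for a, b in zip(w0, w1) if a != b)
```

Because the gradient's magnitude ordering here is fixed by |x|, the same two coordinates are selected at every step, so at most `budget_k` parameters ever change — a small analogue of the sparsity-versus-correction trade-off the thesis studies.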

Item Type

http://purl.org/coar/resource_type/c_46ec

License

Other License Text / Link

This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

en
