B1128
Title: Sparse regularization of neural networks using a Hadamard parametrization-based optimization transfer approach
Authors: Chris Kolb - LMU Munich (Germany) [presenting]
David Ruegamer - LMU Munich (Germany)
Abstract: Neural networks are becoming an increasingly popular framework for estimating complex or high-dimensional regression models, allowing models to scale to very large data sets using stochastic gradient descent (SGD). Incorporating sparsity into neural networks has proven difficult due to the non-smooth nature of the added penalty term, typically requiring specialized optimization routines such as projected gradient or coordinate descent methods. Instead, a method for inducing sparsity in neural networks with $\ell_p$ regularization ($0<p \leq 1$) is presented that is amenable to conventional first-order optimizers such as SGD or Adam. This is achieved by solving an equivalent surrogate problem, obtained by applying a Hadamard product reparametrization to the model parameters, under which smooth and strongly convex $\ell_2$ regularization (or weight decay) induces non-smooth and potentially non-convex $\ell_p$ regularization in the original parametrization. This optimization transfer approach can be readily extended to structured sparsity problems, yielding $\ell_{p,q}$ regularization of the original parameters for $0<p<q<2$.
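A minimal sketch of the reparametrization idea for the simplest $p=1$ case, not the authors' implementation: a linear model's weight vector is factored elementwise as $w = u \odot v$, and plain $\ell_2$ weight decay on the factors $(u, v)$ induces $\ell_1$ regularization on $w$, since $\min_{u \odot v = w} \tfrac{1}{2}(u^2 + v^2) = |w|$ elementwise. All names, data, and hyperparameters below are illustrative assumptions.

```python
# Hadamard-parametrized sparse linear regression (illustrative sketch).
# Smooth l2 weight decay on the factors u, v induces an l1 penalty on w = u * v,
# so a standard first-order optimizer (here SGD) can produce sparse solutions.
import torch

torch.manual_seed(0)

# Synthetic sparse regression data (hypothetical example).
n, d = 200, 50
w_true = torch.zeros(d)
w_true[:5] = torch.randn(5)
X = torch.randn(n, d)
y = X @ w_true + 0.1 * torch.randn(n)

# Hadamard factors: the original parameters are w = u * v (elementwise).
u = torch.randn(d, requires_grad=True)
v = torch.randn(d, requires_grad=True)

lam = 0.05  # weight-decay strength on (u, v); corresponds to lam * ||w||_1 on w
opt = torch.optim.SGD([u, v], lr=1e-2, weight_decay=lam)  # smooth l2 penalty only

for step in range(2000):
    opt.zero_grad()
    w = u * v                          # original parametrization, never penalized directly
    loss = ((X @ w - y) ** 2).mean()   # smooth data-fitting loss
    loss.backward()
    opt.step()

w_hat = (u * v).detach()
print("approx. nonzeros (|w| > 1e-3):", int((w_hat.abs() > 1e-3).sum()))
```

The surrogate objective is smooth in $(u, v)$, so no proximal or projection step is needed; the non-smooth $\ell_1$ geometry reappears only in the induced penalty on $w$. Deeper factorizations or group-wise sharing of one factor would, under the abstract's framing, yield the non-convex $\ell_p$ and structured $\ell_{p,q}$ cases.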