CMStatistics 2023
B0830
Title: Smoothing the edges: smooth optimization for sparse regularization using Hadamard overparametrization
Authors: Chris Kolb - LMU Munich (Germany) [presenting]
Bernd Bischl - LMU Munich (Germany)
Christian L. Mueller - Simons Foundation (United States)
David Ruegamer - LMU Munich (Germany)
Abstract: Neural networks are becoming an increasingly popular framework for estimating complex or high-dimensional regression models, as stochastic gradient descent (SGD) allows models to scale to large data sets. Incorporating sparsity into neural networks has proven difficult, however, because the non-smoothness of the added penalty term typically requires specialized optimization routines. Instead, a method for inducing sparsity in neural networks with $\ell_q$ regularization is presented that is compatible with off-the-shelf optimizers such as SGD or Adam. This is achieved by solving an equivalent surrogate problem, obtained by overparametrizing the model parameters so that smooth and strongly convex $\ell_2$ regularization of the surrogate parameters induces non-smooth, and potentially non-convex, $\ell_q$ regularization in the original parametrization. This optimization-transfer approach readily extends to structured sparsity problems, and various applications of the framework to the sparse optimization of statistical models are showcased.
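
To illustrate the idea in the $\ell_1$ (lasso) special case, a standard Hadamard overparametrization writes the coefficients as $\beta = u \odot v$. Elementwise, the AM-GM inequality gives $\tfrac{1}{2}(u_j^2 + v_j^2) \ge |u_j v_j|$, with equality when $|u_j| = |v_j|$, so that
$$\min_{u \odot v = \beta} \; \frac{\lambda}{2}\left(\|u\|_2^2 + \|v\|_2^2\right) = \lambda \|\beta\|_1.$$
The sketch below is an illustration of this special case under stated assumptions, not code from the authors: it assumes PyTorch, a toy sparse linear-regression problem, and hypothetical names such as u, v, and lam. It solves the smooth ridge-penalized surrogate with plain Adam.

    # Minimal sketch: lasso via Hadamard overparametrization beta = u * v.
    # The smooth surrogate  mse(y, X @ (u*v)) + (lam/2)(||u||^2 + ||v||^2)
    # induces an l1-type penalty lam * ||u*v||_1 at the optimum.
    import torch

    torch.manual_seed(0)
    n, p, lam = 100, 20, 0.1
    X = torch.randn(n, p)
    beta_true = torch.zeros(p)
    beta_true[:3] = torch.tensor([2.0, -1.5, 1.0])   # sparse ground truth
    y = X @ beta_true + 0.05 * torch.randn(n)

    u = torch.randn(p, requires_grad=True)
    v = torch.randn(p, requires_grad=True)
    opt = torch.optim.Adam([u, v], lr=1e-2)          # off-the-shelf optimizer

    for _ in range(5000):
        opt.zero_grad()
        beta = u * v                                 # Hadamard overparametrization
        loss = ((y - X @ beta) ** 2).mean()          # smooth data-fit term
        loss = loss + (lam / 2) * (u.pow(2).sum() + v.pow(2).sum())  # smooth l2 penalty
        loss.backward()
        opt.step()

    beta_hat = (u * v).detach()
    print(beta_hat)  # entries beyond the first three shrink toward zero

Because every term in the surrogate objective is differentiable, no proximal step or subgradient handling is needed; the sparsity-inducing shrinkage emerges from the ridge penalties acting through the product u * v.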