EcoSta 2023 Submission A0980
Title: Pruning deep neural networks from a sparsity perspective
Authors: Ganghua Wang - University of Minnesota (United States) [presenting]
Jie Ding - University of Minnesota (United States)
Yuhong Yang - University of Minnesota (United States)
Abstract: Recently, deep network pruning has attracted significant attention as a way to enable the rapid deployment of AI on small devices with computation and memory constraints. Many deep pruning algorithms have been proposed with impressive empirical success, but the theoretical understanding of model compression remains limited. One fundamental problem is to understand whether one network is more compressible than another of the same structure. Another is to quantify how much a network can be pruned with a theoretically guaranteed bound on the accuracy degradation. These two problems are addressed by using the sparsity-sensitive $\ell_q$ norm ($0 < q < 1$) to characterize compressibility and by establishing a relationship between the soft sparsity of the network weights and the degree of compression under a controlled accuracy degradation bound. Next, the PQ Index (PQI) is proposed to measure the potential compressibility of deep neural networks, and it is used to develop a Sparsity-informed Adaptive Pruning (SAP) algorithm. Our experiments demonstrate that the proposed adaptive pruning algorithm, with a proper choice of hyperparameters, is superior to iterative pruning algorithms, such as lottery-ticket-based pruning methods, in terms of both compression efficiency and robustness.
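As a rough illustration of the kind of norm-ratio sparsity measure the abstract alludes to, the sketch below scores a weight vector by comparing an $\ell_p$ and an $\ell_q$ norm and then applies plain magnitude pruning. The exact PQI definition and the SAP update rule are given in the paper; the scaling used here, the default exponents, and the function names pq_sparsity and prune_by_magnitude are assumptions made for illustration only.

```python
import numpy as np

def pq_sparsity(w, p=0.5, q=1.0, eps=1e-12):
    """Norm-ratio sparsity statistic in the spirit of the PQ Index (illustrative).

    For 0 < p < q, the quantity d**(1/q - 1/p) * ||w||_p / ||w||_q lies in (0, 1];
    subtracting it from 1 yields a score near 1 when the magnitude of w is
    concentrated on a few entries (highly compressible) and 0 when the entries
    are perfectly uniform. The exact scaling in the paper may differ.
    """
    w = np.abs(np.asarray(w, dtype=float).ravel()) + eps
    d = w.size
    norm_p = np.sum(w ** p) ** (1.0 / p)
    norm_q = np.sum(w ** q) ** (1.0 / q)
    return 1.0 - d ** (1.0 / q - 1.0 / p) * norm_p / norm_q

def prune_by_magnitude(w, ratio):
    """Return a copy of w with the smallest-magnitude fraction `ratio` set to zero."""
    w = np.asarray(w, dtype=float).copy()
    flat = w.ravel()
    k = int(ratio * flat.size)
    if k > 0:
        idx = np.argpartition(np.abs(flat), k - 1)[:k]
        flat[idx] = 0.0
    return w

# A vector whose mass sits on a few entries scores as far more compressible
# than a uniform one, suggesting a larger fraction of it can be pruned safely.
concentrated = np.r_[np.ones(10), 1e-3 * np.ones(990)]
uniform = np.ones(1000)
print(pq_sparsity(concentrated))  # substantially larger than the uniform case
print(pq_sparsity(uniform))       # 0 (up to eps)
pruned = prune_by_magnitude(concentrated, ratio=0.9)
```

In an adaptive scheme such as SAP, a sparsity score of this kind would be recomputed after each pruning-and-retraining round so that the pruning ratio tracks how compressible the current weights actually are, rather than following a fixed schedule.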