View Submission - EcoSta2024
A0894
Title: Statistical analysis of the fixed mini-batch gradient descent estimator
Authors: Haobo Qi - Beijing Normal University (China) [presenting]
Feifei Wang - Renmin University of China (China)
Hansheng Wang - Peking University (China)
Abstract: A fixed mini-batch gradient descent (FMGD) algorithm is studied to solve optimization problems with massive datasets. In FMGD, the whole sample is split into multiple non-overlapping partitions. Once the partitions are formed, they are fixed throughout the rest of the algorithm; for convenience, these fixed partitions are referred to as fixed mini-batches. In each iteration, the gradients are then calculated sequentially over the fixed mini-batches. Because the size of each fixed mini-batch is typically much smaller than the whole sample size, each gradient can be computed easily. This leads to a much-reduced computation cost per iteration, which makes FMGD computationally efficient and practically feasible. To demonstrate the theoretical properties of FMGD, a linear regression model with a constant learning rate is studied first, and the numerical convergence and statistical efficiency of the resulting estimator are examined. It is found that a sufficiently small learning rate is necessary for both numerical convergence and statistical efficiency. However, an extremely small learning rate might lead to painfully slow numerical convergence. A diminishing learning rate scheduling strategy can be used to solve this problem, leading to an FMGD estimator with faster numerical convergence and better statistical efficiency. Finally, FMGD algorithms with random shuffling and a general loss function are also studied.
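As an illustration of the procedure outlined in the abstract, the sketch below implements an FMGD-style iteration for linear least squares: the sample is partitioned once into fixed mini-batches, and gradient updates cycle through them. The partition count, learning rates, and the 1/(1+epoch) diminishing schedule are illustrative assumptions, not values or results from the paper.

```python
import numpy as np

def fmgd_linear(X, y, n_batches=10, lr=0.1, n_epochs=50, diminishing=False):
    """Minimal FMGD-style sketch for the least-squares loss (illustrative only)."""
    n, p = X.shape
    rng = np.random.default_rng(0)
    # Split the sample once into non-overlapping partitions ("fixed mini-batches");
    # the partition is never re-drawn in later iterations.
    idx = rng.permutation(n)
    batches = np.array_split(idx, n_batches)
    beta = np.zeros(p)
    for epoch in range(n_epochs):
        # Optional diminishing learning-rate schedule (hypothetical choice).
        step = lr / (1.0 + epoch) if diminishing else lr
        for b in batches:
            # Gradient of the squared-error loss on one fixed mini-batch only,
            # so each update costs O(|b| * p) rather than O(n * p).
            grad = X[b].T @ (X[b] @ beta - y[b]) / len(b)
            beta -= step * grad
    return beta

# Toy usage on data generated from a linear model.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + rng.normal(size=5000)
print(fmgd_linear(X, y, lr=0.05, diminishing=True))
```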