View Submission - COMPSTAT2024
A0373
Title: Assessment of case influence in the Lasso with a case-weight adjusted solution path
Authors: Zhenbang Jiao - The Ohio State University (United States) [presenting]
Yoonkyung Lee - Ohio State University (United States)
Abstract: Case influence in the Lasso regression is studied using Cook's distance, which measures the overall change in the fitted values when one observation is deleted. Unlike in ordinary least squares regression, the estimated coefficients in the Lasso do not have a closed form due to the nondifferentiability of the l1 penalty, and neither does Cook's distance. To find the case-deleted Lasso solution without refitting the model, a weight parameter ranging from 1 to 0 is introduced to approach it from the full data solution and generate a solution path indexed by this parameter. It is shown that, under a fixed penalty, the solution path is piecewise linear with respect to a simple function of the weight parameter. The resulting case influence is a function of the penalty and weight parameters, and it becomes Cook's distance when the weight is 0. As the penalty parameter changes, the selected variables change, and the magnitude of Cook's distance for the same data point may vary with the subset of variables selected. In addition, a case influence graph is introduced to visualize how the contribution of each data point changes with the penalty parameter. From the graph, influential points can be identified at different levels of penalization, and modeling decisions can be made accordingly. Moreover, it is found that case influence graphs exhibit different patterns in the underfitting and overfitting phases, which can provide additional information for model selection.
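The idea of downweighting one case and tracking the change in fitted values can be illustrated with a brute-force sketch. It is not the authors' solution-path algorithm: it simply refits a case-weighted Lasso by coordinate descent for each weight, which is exactly the refitting the abstract's method is meant to avoid. The objective form (no intercept, squared-error loss scaled by 1/2), the unnormalized influence measure (the scaling used in Cook's distance is omitted), the function names, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=2000, tol=1e-12):
    """Coordinate descent for the assumed objective
    0.5 * sum_i w_i * (y_i - x_i . b)^2 + lam * sum_j |b_j| (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, kept up to date
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            denom = sum(weights[i] * X[i][j] ** 2 for i in range(n))
            if denom == 0.0:
                continue
            # Weighted inner product of column j with the partial residual.
            rho = sum(
                weights[i] * X[i][j] * (resid[i] + X[i][j] * beta[j])
                for i in range(n)
            )
            new = soft_threshold(rho, lam) / denom
            step = new - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * step
                beta[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return beta


def fitted(X, beta):
    """Fitted values X beta for every row of X."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def case_influence(X, y, case, omega, lam):
    """Squared change in all n fitted values when `case` gets weight omega.

    omega = 1 reproduces the full-data fit; omega = 0 deletes the case.
    The normalization used in Cook's distance is deliberately left out.
    """
    n = len(X)
    full = weighted_lasso(X, y, [1.0] * n, lam)
    w = [1.0] * n
    w[case] = omega
    down = weighted_lasso(X, y, w, lam)
    return sum((a - b) ** 2 for a, b in zip(fitted(X, full), fitted(X, down)))


# Hypothetical toy data: the last response is an outlier at a high-leverage point.
TOY_X = [[1.0, 0.5], [2.0, -1.0], [3.0, 0.3], [4.0, -0.2], [5.0, 1.0], [6.0, -0.6]]
TOY_Y = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]

if __name__ == "__main__":
    # Case-deletion influence (omega = 0) of every case over a small penalty grid.
    for lam in (0.5, 5.0, 50.0):
        row = [case_influence(TOY_X, TOY_Y, i, 0.0, lam) for i in range(len(TOY_X))]
        print(lam, [round(v, 4) for v in row])
```

Evaluating `case_influence` at weight 0 over a grid of penalties, one curve per observation, gives a rough brute-force analogue of the case influence graph described in the abstract.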