CMStatistics 2023
B1199
Title: Is memorization compatible with causal learning? The case of high-dimensional linear regression
Authors: Leena Chennuru Vankadara - Amazon Web Services (Germany) [presenting]
Abstract: Deep learning models exhibit a curious phenomenon: they optimize over hugely complex model classes and are often trained to memorize the training data. This seemingly contradicts classical statistical wisdom, which suggests avoiding interpolation in order to reduce the complexity of the prediction rule. A large body of recent work partially resolves this contradiction, showing that interpolation does not necessarily harm statistical generalization and may even be necessary for optimal statistical generalization in some settings. This picture, however, is incomplete. In modern ML, the goal extends beyond building good statistical models: the aim is to learn reliable models with sound causal implications. Under a simple high-dimensional linear model, the role of interpolation, and of its counterpart, regularization, in learning better causal models is discussed.
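The interpolation-versus-regularization contrast in the abstract can be illustrated with a minimal sketch of overparameterized linear regression. This is not the construction from the talk; the data, dimensions, sparse signal, and ridge penalty below are all illustrative assumptions. The minimum-norm least-squares solution memorizes the training data (essentially zero training error), while a ridge-regularized estimator deliberately trades off training fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized regime: more features (d) than samples (n), so
# infinitely many weight vectors fit the training data exactly.
n, d = 20, 100
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0  # sparse ground-truth signal (illustrative assumption)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Minimum-norm interpolator: the least-squares solution that memorizes
# the training data, computed via the pseudoinverse.
w_interp = np.linalg.pinv(X) @ y

# Ridge-regularized estimator: shrinks the weights instead of interpolating.
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

train_err_interp = np.mean((X @ w_interp - y) ** 2)
train_err_ridge = np.mean((X @ w_ridge - y) ** 2)

print(f"interpolator training MSE: {train_err_interp:.2e}")  # near machine precision
print(f"ridge training MSE:        {train_err_ridge:.2e}")   # strictly positive
```

Which of the two estimators recovers the better causal model, rather than merely the better predictor, is the question the talk addresses.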