A0921
Title: Valid F-screening in linear regression
Authors: Olivia McGough - University of Washington (United States)
Daniela Witten - University of Washington (United States)
Daniel Kessler - University of North Carolina at Chapel Hill (United States) [presenting]
Abstract: Suppose that a data analyst wishes to report the results of a linear regression only if the overall null hypothesis is rejected. This practice, referred to as F-screening, is common across a number of applied fields. Unfortunately, it poses a problem: standard guarantees for the inferential outputs of linear regression, such as Type 1 error control of hypothesis tests and nominal coverage of confidence intervals, hold unconditionally but fail to hold conditional on rejection of the overall null hypothesis. An inferential toolbox is developed for the coefficients of a least squares model that is valid conditional on rejection of the overall null hypothesis. Selective p-values are developed that yield tests controlling the selective Type 1 error, i.e., the Type 1 error conditional on having rejected the overall null hypothesis. Furthermore, these p-values can be computed without access to the raw data, i.e., using only the standard outputs of a least squares linear regression, and are therefore suitable for retrospective analysis of a published study. Confidence intervals that attain nominal selective coverage and point estimates that account for having rejected the overall null hypothesis are also developed. It is shown empirically that the selective procedure is preferable to an alternative approach that relies on sample splitting, and its performance is demonstrated via re-analysis of two datasets from the biomedical literature.
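
The failure of the standard guarantees after screening can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not compute their selective p-values; it only illustrates the problem, under simplifying assumptions that are not in the abstract: two coefficients, an orthonormal design, and a known error variance of 1, so that the two coefficient estimates are independent standard normals under the global null and the overall test reduces to a chi-squared test with 2 degrees of freedom in place of the F-test. The function name `simulate` and all parameter choices are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate(n_sims=200_000, alpha=0.05, seed=0):
    """Compare unconditional and post-screening Type 1 error of a naive z-test.

    Assumed toy setting (not from the abstract): two coefficients, orthonormal
    design, known error variance 1, and both true coefficients equal to zero.
    """
    rng = random.Random(seed)

    # Overall test: z1^2 + z2^2 is chi-squared with 2 df under the global null.
    # Its survival function is exp(-x / 2), so the level-alpha critical value
    # is available in closed form.
    overall_crit = -2.0 * math.log(alpha)

    # Naive two-sided z-test for the first coefficient.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    n_screened = 0            # overall null rejected
    n_naive = 0               # naive test rejects, ignoring screening
    n_naive_and_screened = 0  # naive test rejects among screened results

    for _ in range(n_sims):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        screened = z1 * z1 + z2 * z2 > overall_crit
        naive_reject = abs(z1) > z_crit
        n_screened += screened
        n_naive += naive_reject
        n_naive_and_screened += screened and naive_reject

    unconditional = n_naive / n_sims
    conditional = n_naive_and_screened / n_screened
    return unconditional, conditional


unconditional, conditional = simulate()
print(f"Type 1 error, all simulated datasets:       {unconditional:.3f}")
print(f"Type 1 error, given overall null rejected:  {conditional:.3f}")
```

In this toy setting the naive test holds its nominal level across all simulated datasets, but among the datasets that pass the screen its rejection rate is well above the nominal level, which is the gap that a selective procedure is designed to close.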