A1159
Title: Framework for large-scale classification: Error rate control and optimality
Authors: Yin Xia - Fudan University (China) [presenting]
Abstract: Classification is a fundamental task in supervised learning, yet achieving valid misclassification rate control remains challenging, possibly due to the limited predictive capability of the classifiers or the intrinsic complexity of the classification task. Large-scale multi-class classification problems are studied with general error rate guarantees to enhance algorithmic trustworthiness. To this end, a notion of group-wise classification is first introduced, which unifies the common class-wise and overall classifications as special cases. A unified algorithmic framework is then developed for general group-wise classification that consists of three steps: Pre-classification, selective p-value construction, and large-scale post-classification decisions (PSP). Theoretically, PSP is distribution-free and provides valid finite-sample guarantees for controlling general group-wise false decision rates at target levels. In terms of power, the post-classification decision step is shown to never degrade the power of pre-classification, provided that pre-classification is sufficiently powerful to meet the target error levels. Additionally, general power optimality theories for PSP are established from both non-asymptotic and asymptotic perspectives. Numerical results from both simulations and real data analysis validate the performance of the proposed PSP approach.
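The following minimal Python sketch illustrates the three-step structure described above (pre-classification, selective p-value construction, post-classification decisions). It is not the authors' procedure: the conformal-style p-value construction, the Benjamini-Hochberg-style thresholding, and all function names are assumptions made purely for illustration of how such a pipeline might be organized.

import numpy as np

def pre_classify(scores_test):
    # Step 1 (pre-classification): tentative labels from an already-fitted classifier's
    # class-probability scores (rows = test points, columns = classes).
    return scores_test.argmax(axis=1)

def selective_p_values(scores_cal, labels_cal, scores_test, preds_test):
    # Step 2 (selective p-values): for each test point i with tentative label k, test the
    # null "the true label is not k" by comparing the point's class-k score with the
    # class-k scores of calibration points whose true label is not k.
    # (A conformal-style stand-in; the authors' construction may differ.)
    p = np.empty(len(preds_test))
    for i, k in enumerate(preds_test):
        null_scores = scores_cal[labels_cal != k, k]   # class-k scores under the null
        p[i] = (1 + np.sum(null_scores >= scores_test[i, k])) / (len(null_scores) + 1)
    return p

def post_classify(p_values, alpha=0.1):
    # Step 3 (post-classification decisions): a Benjamini-Hochberg-style threshold keeps
    # only decisions with sufficiently small p-values and abstains on the rest, aiming to
    # keep the fraction of false decisions among decisions made near the target alpha.
    m = len(p_values)
    order = np.argsort(p_values)
    passed = p_values[order] <= alpha * np.arange(1, m + 1) / m
    k_max = passed.nonzero()[0].max() + 1 if passed.any() else 0
    decide = np.zeros(m, dtype=bool)
    decide[order[:k_max]] = True                       # True = commit to the tentative label
    return decide

# Usage with synthetic scores: decide[i] is True if the i-th tentative label is retained.
rng = np.random.default_rng(0)
scores_cal = rng.dirichlet(np.ones(3), size=500)
labels_cal = scores_cal.argmax(axis=1)                 # toy calibration labels
scores_test = rng.dirichlet(np.ones(3), size=200)
preds = pre_classify(scores_test)
decide = post_classify(selective_p_values(scores_cal, labels_cal, scores_test, preds))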