A1280
Title: Generalization analysis for contrastive deep representation learning
Authors: Yiming Ying - University of Sydney (Australia) [presenting]
Abstract: The performance of machine learning (ML) models often depends on the representation of the data, which has motivated a resurgence of contrastive representation learning (CRL) for learning a representation function. Recently, CRL has shown remarkable empirical performance and can even surpass supervised learning models in various domains, such as computer vision and natural language processing. Recent progress is presented in establishing the learning theory foundation for CRL. In particular, the following two theoretical questions are addressed: 1) how does the generalization behaviour of downstream ML models benefit from a representation function built from positive and negative pairs? 2) in particular, how does the number of negative examples affect learning performance? Specifically, it is shown that generalization bounds for contrastive learning do not depend on the number $k$ of negative examples, up to logarithmic terms. The analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of the loss functions. For self-bounding Lipschitz loss functions, the results are further improved by developing optimistic bounds, which imply fast rates under a low-noise condition. The results are applied to learning with both linear representations and nonlinear representations given by deep neural networks, and explicit Rademacher complexity bounds are derived in both cases.
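For orientation only (this formulation is not part of the submission text): a minimal sketch of the standard contrastive learning setup with $k$ negative examples, assuming a representation function $f$ and i.i.d. tuples $(x, x^+, x_1^-, \dots, x_k^-)$ of an anchor, a positive example, and $k$ negative examples, where the symbols and the logistic choice of $\ell$ are illustrative assumptions rather than the authors' exact setting, is

\[
  R(f) \;=\; \mathbb{E}_{(x,\,x^+,\,x_1^-,\dots,\,x_k^-)}\,
  \ell\!\Big( \big\{ f(x)^\top\big(f(x^+) - f(x_i^-)\big) \big\}_{i=1}^{k} \Big),
  \qquad \text{e.g. } \ell(\mathbf{v}) = \log\!\Big(1 + \sum_{i=1}^{k} \exp(-v_i)\Big).
\]

Generalization is then measured by the gap between $R(f)$ and its empirical counterpart over $n$ training tuples, and the $k$ referred to in the abstract is the number of negative examples entering each such tuple.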