CFE-CMStatistics 2024
A0982
Title: Contrastive learning: A statistical approach
Authors: Luca Scaffidi Domianello - University of Catania (Italy) [presenting]
Salvatore Ingrassia - University of Catania (Italy)
Abstract: Contrastive or self-supervised learning is an intuitive learning principle, alternative to the likelihood-based one, usually employed to estimate unnormalized models, that is, models whose density does not integrate to one. It has wide applicability in different statistical areas, one of the most important being density estimation. The latter, although related to unsupervised learning, can be reformulated through logistic regression modeling as a supervised learning task. This approach has been extensively developed in machine learning, in fields such as natural language processing and image modeling, to name just a few. Nevertheless, from a statistical point of view, it has attracted very little attention so far. The main idea of contrastive learning is to learn to discriminate, by contrasting them, between the data of interest and artificially generated ad hoc reference data, also called noise data. In the present contribution, a detailed statistical framework for the contrastive learning approach is provided, together with extensive numerical studies illustrating the properties of the learning principle. As expected, the reference distribution affects the parameter estimates for the data of interest, so the choice of the reference distribution is a crucial task that needs further analysis. An application to real data concludes the contribution.
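The principle described above can be sketched in a minimal numerical example. The sketch below is an assumption-laden illustration (not the authors' implementation): it estimates the mean and the log-normalizing constant of an unnormalized Gaussian model by logistic-regression-style contrasting against a known Gaussian reference ("noise") distribution, in the spirit of noise-contrastive estimation. All sample sizes, the reference distribution N(0, 2^2), and the plain gradient-ascent optimizer are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data of interest from N(1, 1); the model below treats the
# log-normalizing constant c as a free parameter (true c = -log sqrt(2*pi) ~ -0.92).
x = rng.normal(1.0, 1.0, size=5000)

# Reference ("noise") distribution: N(0, 2^2), with known density.
noise_sd = 2.0
y = rng.normal(0.0, noise_sd, size=5000)

def log_noise(u):
    # Exact log-density of the reference distribution.
    return -0.5 * (u / noise_sd) ** 2 - np.log(noise_sd * np.sqrt(2 * np.pi))

def log_model(u, mu, c):
    # Unnormalized Gaussian log-density plus the free log-constant c.
    return -0.5 * (u - mu) ** 2 + c

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

mu, c = 0.0, 0.0
lr = 0.1
for _ in range(2000):
    # Log-odds of "data" vs "noise" under the current model:
    # the regression function of the induced logistic classification task.
    gx = log_model(x, mu, c) - log_noise(x)
    gy = log_model(y, mu, c) - log_noise(y)
    rx = 1.0 - sigmoid(gx)   # classification residuals on data points
    ry = sigmoid(gy)         # classification residuals on noise points
    # Gradients of the logistic objective with respect to (mu, c).
    g_mu = np.mean(rx * (x - mu)) - np.mean(ry * (y - mu))
    g_c = np.mean(rx) - np.mean(ry)
    mu += lr * g_mu
    c += lr * g_c

print(mu, c)  # mu should land near 1, c near -0.92
```

Re-running the sketch with a different reference distribution (e.g., a wider or poorly located Gaussian) perceptibly changes the variability of the estimates, which is the sensitivity to the reference distribution that the abstract highlights.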