View Submission - EcoSta 2025
A0340
Title: Multivariate species sampling models
Authors: Beatrice Franzolini - Bocconi University (Italy)
Antonio Lijoi - Bocconi University (Italy)
Igor Pruenster - Bocconi University (Italy) [presenting]
Giovanni Rebaudo - University of Turin and Collegio Carlo Alberto (Italy)
Abstract: Species sampling processes have long provided a fundamental framework for random discrete distributions and exchangeable sequences. However, analyzing data from distinct yet related sources requires a broader notion of probabilistic invariance, with partial exchangeability as the natural choice. Over the past two decades, numerous models for partially exchangeable data, known as dependent nonparametric priors, have emerged, including hierarchical, nested, and additive processes. Despite their widespread use in statistics and machine learning, a unifying framework remains elusive, leaving key questions about their learning mechanisms unanswered. This gap is filled by introducing multivariate species sampling models, a general class of nonparametric priors encompassing most existing dependent nonparametric processes. These models are defined by a partially exchangeable partition probability function, encoding the induced multivariate clustering structure. Their core distributional properties and dependence structure are established, showing that borrowing of information across groups is entirely determined by shared ties. This provides new insights into their learning mechanisms, including a principled explanation for the correlation structure observed in existing models. Beyond offering a cohesive theoretical foundation, the approach serves as a constructive tool for developing new models and opens new research directions aimed at capturing even richer dependence structures.