CMStatistics 2023
B0365
Title: Feature allocation models with EFPFs in product-form
Authors: Lorenzo Ghilotti - University of Milano-Bicocca (Italy) [presenting]
Federico Camerlenghi - University of Milano-Bicocca (Italy)
Tommaso Rigon - University of Milano-Bicocca (Italy)
Abstract: Species sampling models represent a large class of Bayesian nonparametric priors tailored for a population of animals, where each animal belongs to a single species. The random partition induced by a sample of animals is characterized by the exchangeable partition probability function. Feature allocation models constitute a primary extension of the species framework, where subjects can display multiple features recorded by binary variables. Feature allocations, analogous to clustering, are described by the exchangeable feature probability function (EFPF). The aim is to provide distribution results for a fundamental class of feature allocation models with EFPFs in product form, which have been recently investigated from a probabilistic perspective. These models serve as prominent priors in the feature setting, akin to Gibbs-type priors in the species framework, offering a balance between tractability and flexibility. A general theory is developed, analyzing the predictive structure, marginal distribution, and posterior distribution of the underlying statistical process. Noteworthy examples, such as mixtures of the Indian buffet process and beta-Bernoulli models, are examined. The methodology has significant applications in ecology, addressing species richness estimation using the accumulation curve, and in genomics, dealing with extrapolation problems for estimating the number of unseen genetic variants.
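To make the notion of a feature allocation concrete, the following is a minimal sketch of the standard one-parameter Indian buffet process, which the abstract names as a building block of its examples. It is not the authors' product-form class or their mixture construction. The function name `sample_ibp` and the parameters `n_subjects`, `alpha`, and `seed` are assumptions introduced here for illustration.

```python
import numpy as np

def sample_ibp(n_subjects, alpha, seed=0):
    """Draw a binary feature matrix from the one-parameter Indian buffet process.

    Illustrative sketch only: rows are subjects, columns are features,
    and Z[i, k] = 1 means subject i displays feature k.
    """
    rng = np.random.default_rng(seed)
    counts = []  # counts[k] = number of subjects so far displaying feature k
    rows = []    # rows[i] = feature indices displayed by subject i

    for i in range(1, n_subjects + 1):
        # An already observed feature k is displayed with probability counts[k] / i.
        shown = [k for k, m in enumerate(counts) if rng.random() < m / i]
        # The subject also displays Poisson(alpha / i) previously unseen features.
        n_new = rng.poisson(alpha / i)
        shown.extend(range(len(counts), len(counts) + n_new))
        counts.extend([0] * n_new)
        for k in shown:
            counts[k] += 1
        rows.append(shown)

    # Assemble the binary matrix once the total number of features is known.
    Z = np.zeros((n_subjects, len(counts)), dtype=int)
    for i, shown in enumerate(rows):
        for k in shown:
            Z[i, k] = 1
    return Z

if __name__ == "__main__":
    Z = sample_ibp(n_subjects=10, alpha=2.0, seed=1)
    print(Z.shape)        # (number of subjects, number of distinct features)
    print(Z.sum(axis=0))  # how many subjects display each feature
```

Unlike a partition, a row of `Z` may contain several ones, so a subject can display multiple features at once. The number of columns is the count of distinct features observed in the sample, which is the quantity tracked by an accumulation curve as subjects are added.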