View Submission - EcoSta 2025
A1063
Title: TabPFN: One model to rule them all?
Authors: Yan Shuo Tan - National University of Singapore (Singapore) [presenting]
Qiong Zhang - Renmin University of China (China)
Qinglong Tian - University of Waterloo (Canada)
Pengfei Li - University of Waterloo (Canada)
Abstract: A recent study introduced TabPFN, a transformer-based deep learning model for regression and classification on tabular data, which is claimed to outperform all previous methods on datasets with up to 10,000 samples by a wide margin, while using substantially less training time. The study further describes TabPFN as a foundation model, since it can support data generation, density estimation, learning reusable embeddings, and fine-tuning. If these claims are well-supported, TabPFN may have the potential to supersede existing modeling approaches on a wide range of statistical tasks, mirroring a similar revolution in other areas of artificial intelligence that began with the advent of large language models. An explanation of how TabPFN works is provided, emphasizing its interpretation as approximate Bayesian inference. Further evidence of its foundation model capabilities is also provided: it is shown that an out-of-the-box application of TabPFN vastly outperforms specialized state-of-the-art methods for semi-supervised parameter estimation, prediction under covariate shift, and heterogeneous treatment effect estimation. It is further shown that TabPFN can outperform LASSO at sparse regression and can break a robustness-efficiency trade-off in classification.
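
As an illustration of what an "out-of-the-box" application looks like, the sketch below uses the open-source tabpfn package and its scikit-learn-style TabPFNClassifier on a standard benchmark dataset; the dataset, split, and metric are illustrative assumptions and are not drawn from the submission itself.

# Minimal sketch of an out-of-the-box TabPFN application, assuming the
# open-source `tabpfn` package with its scikit-learn-style interface.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tabpfn import TabPFNClassifier  # assumed import path

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No task-specific tuning: fit on the labeled sample and predict directly.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))

# Class-probability estimates, reflecting TabPFN's interpretation as
# approximating a Bayesian posterior predictive distribution over labels.
proba = clf.predict_proba(X_test)

The scikit-learn-style fit/predict interface is what makes the "out-of-the-box" comparisons in the abstract possible: the same pretrained model is applied without task-specific architecture changes or hyperparameter tuning.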