View Submission - EcoSta2022
A0635
Title: scSampler: Fast diversity-preserving subsampling of large-scale single-cell transcriptomic data
Authors: Nan Miles Xi - Loyola University Chicago (United States) [presenting]
Abstract: The number of cells measured in single-cell transcriptomic datasets has grown rapidly in recent years. For such large-scale data, subsampling is a powerful and often necessary tool for exploratory data analysis. However, simple random subsampling, the easiest approach, is not ideal because it tends to miss rare cell types. Diversity-preserving subsampling is therefore required for the fast exploration of cell types in a large-scale dataset. We propose scSampler, an algorithm for fast diversity-preserving subsampling of single-cell transcriptomic data. Using simulated and real data, we show that scSampler consistently outperforms existing subsampling methods in terms of both computational time and the Hausdorff distance between the full and subsampled datasets.
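To make the evaluation criterion concrete, the following minimal Python sketch computes the Hausdorff distance between a full dataset and a subsample, and compares random subsampling with a greedy farthest-point subsample. The greedy routine is only an illustrative stand-in for a diversity-preserving method; it is not the scSampler algorithm, which the abstract does not specify. All function names and the toy two-dimensional "cells" are assumptions made for illustration.

```python
import numpy as np


def hausdorff_to_subsample(full, sub):
    """Hausdorff distance between `full` and a subset `sub` of it.

    Because every point of `sub` also lies in `full`, the distance from
    `sub` to `full` is zero, so only the direction full -> sub matters:
    the largest distance from any full-data point to its nearest
    subsampled point.
    """
    dists = np.linalg.norm(full[:, None, :] - sub[None, :, :], axis=2)
    return dists.min(axis=1).max()


def random_subsample(full, k, rng):
    """Uniform random subsample of k rows, without replacement."""
    idx = rng.choice(len(full), size=k, replace=False)
    return full[idx]


def farthest_point_subsample(full, k, rng):
    """Greedy farthest-point subsample (illustrative stand-in, NOT scSampler)."""
    first = int(rng.integers(len(full)))
    chosen = [first]
    # Distance from every point to its nearest already-chosen point.
    nearest = np.linalg.norm(full - full[first], axis=1)
    for _ in range(k - 1):
        nxt = int(nearest.argmax())  # point currently worst covered
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(full - full[nxt], axis=1))
    return full[chosen]


# Toy data (assumed): one abundant "cell type" and one rare, distant one.
rng = np.random.default_rng(0)
common = rng.normal(0.0, 1.0, size=(2000, 2))
rare = rng.normal(10.0, 0.5, size=(10, 2))
cells = np.vstack([common, rare])

h_random = hausdorff_to_subsample(cells, random_subsample(cells, 20, rng))
h_greedy = hausdorff_to_subsample(cells, farthest_point_subsample(cells, 20, rng))
print(f"random: {h_random:.2f}  farthest-point: {h_greedy:.2f}")
```

On data like this, a random subsample of 20 cells will usually contain none of the 10 rare cells, so its Hausdorff distance is dominated by the gap to the rare cluster, whereas a diversity-seeking subsample tends to cover that cluster and yield a smaller distance.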