View Submission - CFE-CMStatistics 2025
A1163
Title: Recovering Isoform-level transcriptomics from sparse long-read single-cell RNA sequencing data
Authors: Wei Chen - University of Pittsburgh (United States) [presenting]
Abstract: Long-read single-cell RNA sequencing (lrscRNA-seq) quantifies isoform-level expression in single cells. Because gene-level signals are split across many isoforms, the number of features rises while counts per feature fall, producing extreme sparsity. The resulting data represent highly partial observations of the transcriptome profiles, limiting the effectiveness of downstream statistical and bioinformatics methods. A graph-based diffusion method is presented to refine isoform expression per cell. Cell-cell similarity is computed from either gene-level or isoform-level expression data by blending adapted SimRank similarity metrics with Gaussian-kernel weights derived from distances in a reduced space; a Markov process then propagates information across the similarity graph to estimate each cell's underlying transcriptomic profile. The framework preserves cell-type-specific signatures by limiting diffusion in the Markov process through adaptive neighborhood selection. On simulated and real datasets, the method is shown to counteract sparsity and recover biological information. Before refinement, important biological information, such as isoform correlations and differential expression along pseudotime, was largely hidden by dropout. After refinement, clear isoform correlations and changes along pseudotime emerge, improving interpretability. Overall, the approach enhances lrscRNA-seq by leveraging cell graphs to recover biological information and reveal isoform-level expression patterns.
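The diffusion step described in the abstract can be illustrated with a minimal sketch. It is not the presented method: the abstract does not specify the SimRank adaptation, the blending rule, or the neighborhood-selection criterion, so the sketch assumes a precomputed second similarity matrix `S`, a fixed blending weight `alpha`, and a plain k-nearest-neighbor rule with a per-cell Gaussian bandwidth as a stand-in for adaptive neighborhood selection. The function names `gaussian_knn_affinity` and `diffuse` and the parameters `k`, `alpha`, and `t` are hypothetical.

```python
import numpy as np

def gaussian_knn_affinity(Z, k=5):
    """Gaussian-kernel weights restricted to each cell's k nearest neighbors.

    Z is an (n_cells, n_dims) array of reduced-space coordinates.
    Assumption: the bandwidth of each cell is the distance to its k-th
    neighbor, a simple stand-in for adaptive neighborhood selection.
    """
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    n = d.shape[0]
    order = np.argsort(d, axis=1)            # nearest first; the cell itself has distance 0
    sigma = np.maximum(d[np.arange(n), order[:, k]], 1e-12)
    W = np.exp(-(d / sigma[:, None]) ** 2)
    # Keep only the cell itself and its k nearest neighbors.
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k + 1), order[:, :k + 1].ravel()] = True
    W = np.where(mask, W, 0.0)
    return (W + W.T) / 2                     # symmetrize the graph

def diffuse(X, W, S=None, alpha=0.5, t=3):
    """Propagate expression X (n_cells, n_features) over the cell graph.

    S is an optional second similarity matrix (for example a SimRank-style
    score, assumed to be computed elsewhere); alpha blends it with W.
    The blended matrix is row-normalized into a Markov transition matrix
    and applied t times.
    """
    A = W if S is None else alpha * S + (1 - alpha) * W
    P = A / A.sum(axis=1, keepdims=True)     # row-stochastic transition matrix
    return np.linalg.matrix_power(P, t) @ X
```

In this sketch, restricting the graph to k nearest neighbors is what limits diffusion: cells with no path between them in the graph never exchange signal, so expression that is specific to one well-separated group is not smeared into another. A call such as `diffuse(X, gaussian_knn_affinity(Z, k=15), t=3)` returns a refined matrix with the same shape as `X`.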