View Submission - EcoSta2024
A0671
Title: Learning context-aware distributed gene representations in spatial transcriptomics with SpaCEX
Authors: Hao Wu - Shenzhen Institute of Advanced Technology (China) [presenting]
Abstract: Distributed gene representations are pivotal in genomic research, offering a means to understand the complexities of genomic data and providing the foundation for various data analysis tasks. Current gene representation learning methods demand costly pretraining on heterogeneous transcriptomic corpora, making them less approachable and prone to over-generalization. For spatial transcriptomics (ST), many methods exist for learning spot embeddings, but methods for generating gene embeddings from spatial gene profiles are lacking. To fill this gap, SpaCEX is presented, a pioneering self-supervised learning model that generates context-aware, semantically rich gene embeddings (SpaCEX-generated Gene Embeddings, SGEs) from ST data by exploiting spatial genomic context (SGC), identified as spatially co-expressed gene modules. As a few-shot learning method focusing on targeted single datasets, SpaCEX is cost-effective, context-sensitive, and robust to cross-sample technical artefacts. Real-data analyses reveal the biological relevance of SpaCEX-identified SGC and affirm the functional and relational semantics of SGEs. Based on the SGEs, novel computational methods are developed for key downstream objectives: identifying disease-associated genes and gene-gene interactions, enhancing transcriptomic coverage of FISH-based ST, detecting spatially variable genes, and enhancing spatial clustering. Extensive real-data results demonstrate the superior performance of these methods.