A1158
Title: A framework for structural variation analysis: Realistic simulation, robust detection, and haplotype inference
Authors: Yongyi Luo - The Chinese University of Hong Kong (China) [presenting]
Zhen Zhang - The Hong Kong University of Science and Technology (Hong Kong)
Jingyu Hao - The Hong Kong University of Science and Technology (Hong Kong)
Jiandong Shi - The Chinese University of Hong Kong (Hong Kong)
Weichuan Yu - The Hong Kong University of Science and Technology (Hong Kong)
Xiaodan Fan - The Chinese University of Hong Kong (Hong Kong)
Abstract: Structural variations (SVs) are critical genomic variants that affect evolution and disease susceptibility. Their analysis faces three major challenges: existing simulators struggle to capture the complex genomic distribution of SVs; detection methods lack sensitivity for complex structural variants (CSVs) with nested or multi-breakpoint architectures; and haplotype reconstruction suffers from error propagation because variant calling and phasing are performed separately. To address these challenges, three methods are developed. First, BVSim, a benchmarking variation simulator, learns empirical distributions of SVs from real genomic data. BVSim accurately preserves the length distributions, telomere-proximal enrichment, and tandem repeat associations seen in human SVs, outperforming existing simulators. Next, gSV, a general SV detector, combines alignment-based signal decomposition with assembly-based validation. gSV uses maximum exact match strategies and graph-cut optimization, enabling sensitive identification of CSVs, particularly nested and multi-breakpoint variants, without predefined assumptions about variant types. Finally, DIHap, a unified probabilistic framework, infers haplotypes directly from sequencing data. DIHap simultaneously models sequencing error profiles and haplotype structures, enhancing accuracy in low-coverage and polyploid scenarios.