CMStatistics 2015
B0714
Title: Transcriptional landscape reconstruction from high-throughput sequencing count data via state space models
Authors: Bogdan Mirauta - EMBL-European Bioinformatics Institute (United Kingdom)
Hugues Richard - UPMC (France)
Pierre Nicolas - INRA (France) [presenting]
Abstract: The most common RNA-Seq strategy consists of random shearing, amplification, and high-throughput sequencing of the RNA fraction. Methods are needed to analyze transcription level variations along the genome from the read count profiles generated by this global RNA-Seq protocol. We developed statistical approaches to estimate local transcription levels and to identify transcript borders along the genome. Our transcriptional landscape reconstruction relies on a state-space model that describes transcription level variations in terms of abrupt shifts and more progressive drifts. A new emission model is introduced to capture not only the read count variance inside a transcript but also its short-range autocorrelation and the fraction of positions with zero counts. Estimation relies on a Markov chain Monte Carlo approach involving a Sequential Monte Carlo algorithm, Particle Gibbs.
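
To make the modelling idea concrete, the following is a minimal simulation sketch of a state-space model of this general kind; it is not the authors' model. It assumes a hidden log-level that either jumps (abrupt shift) or takes a small Gaussian step (drift), and a zero-inflated Gamma-Poisson emission to mimic overdispersed counts with excess zeros. The short-range autocorrelation mentioned in the abstract is deliberately left out, and every name and parameter value (`simulate_landscape`, `p_shift`, `drift_sd`, `p_zero`, `dispersion`) is hypothetical.

```python
import math
import random


def _poisson(rng, lam):
    """Draw a Poisson variate (Knuth's method; normal approximation for large lam)."""
    if lam > 500:
        # Assumption: a rounded normal approximation is good enough for a toy example.
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_landscape(n_positions, p_shift=0.01, drift_sd=0.02,
                       p_zero=0.2, dispersion=2.0, seed=0):
    """Simulate hidden transcription levels and read counts along a genome.

    All parameters are illustrative assumptions, not values from the abstract.
    """
    rng = random.Random(seed)
    log_level = rng.uniform(0.0, 3.0)
    levels, counts = [], []
    for _ in range(n_positions):
        if rng.random() < p_shift:
            # Abrupt shift: the level is redrawn (e.g. at a transcript border).
            log_level = rng.uniform(0.0, 3.0)
        else:
            # Progressive drift: small random-walk step, clamped to a fixed range.
            log_level += rng.gauss(0.0, drift_sd)
            log_level = min(max(log_level, 0.0), 3.0)
        level = math.exp(log_level)
        if rng.random() < p_zero:
            # Zero inflation: a fraction of positions yields no reads.
            count = 0
        else:
            # Gamma-Poisson mixture: overdispersed counts with mean equal to level.
            lam = rng.gammavariate(dispersion, level / dispersion)
            count = _poisson(rng, lam)
        levels.append(level)
        counts.append(count)
    return levels, counts


if __name__ == "__main__":
    levels, counts = simulate_landscape(1000, seed=42)
    zero_fraction = sum(c == 0 for c in counts) / len(counts)
    print(f"fraction of zero-count positions: {zero_fraction:.2f}")
```

Inference would run in the opposite direction: given only `counts`, a sampler such as Particle Gibbs would aim to recover the hidden `levels` and the positions of the shifts.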