View Submission - COMPSTAT2022
A0430
Title: Clustering of single cell RNAseq data: An integrated analysis using multiple methods and robust clustering solutions
Authors: Mohamad Zafer Merhi - Hasselt University (Belgium) [presenting]
Ziv Shkedy - Hasselt University (Belgium)
Ahmed Essaghir - GlaxoSmithKline (Belgium)
Dan Lin - GlaxoSmithKline (Belgium)
Abstract: Clustering is a central step in the identification of cell types in single cell RNA-seq experiments. Through unsupervised clustering analysis, we are able to find groups of cells based on similarities in their expression profiles, which allows us to associate subsets of cells (those belonging to the same cluster) with a biological pathway. Despite recent advancements in tools and methods for clustering single cell RNA-seq data, many challenges and factors still need to be investigated. For example, a collection of clustering methods applied to the same single cell RNA-seq data often results in a variety of clustering solutions. Even when a single clustering method is used, a change in the parameter settings typically produces a different clustering solution. In the current study, we assess the performance of selected clustering methods and focus on the similarity between the clustering solutions obtained from the different methods. We discuss a methodology for identifying a robust clustering solution for a given single cell RNA-seq dataset and present diagnostic plots to investigate, for a given method, the influence of the parameter settings on the solution. All methods are applied to real-life, publicly available single cell RNA-seq data. Software tools for conducting the proposed analysis are also presented and are publicly available.
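As an illustration of what comparing clustering solutions can look like in practice, the following minimal Python sketch computes a pairwise similarity score between label vectors produced by different methods. The abstract does not name the similarity measure used in the study; the adjusted Rand index is assumed here as one common choice, and the method names and cell labels are hypothetical toy data.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same cells.

    Assumption: this measure is used for illustration only; the abstract
    does not state which similarity measure the authors apply.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("Both clusterings must label the same cells")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree on

    # Pairs of cells placed together in both clusterings (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of cells placed together within each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings have one cluster
    return (sum_cells - expected) / (max_index - expected)


def pairwise_similarity(solutions):
    """Similarity for every pair of named clustering solutions."""
    return {
        (a, b): adjusted_rand_index(solutions[a], solutions[b])
        for a, b in combinations(sorted(solutions), 2)
    }


# Hypothetical cluster labels for six cells from three hypothetical methods.
solutions = {
    "method_A": [0, 0, 0, 1, 1, 1],
    "method_B": [1, 1, 1, 0, 0, 0],  # same partition as method_A, labels swapped
    "method_C": [0, 0, 1, 1, 2, 2],
}
similarity = pairwise_similarity(solutions)
```

Because the index depends only on which cells are grouped together, relabeled but identical partitions score 1, while partially overlapping partitions score lower. The same function could be applied to solutions from one method under different parameter settings to examine how sensitive the result is to those settings.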