CMStatistics 2020
B1010
Title: Applying machine learning methods to understand biological heterogeneity
Authors: Yingtong Chen - University of Oxford (United Kingdom) [presenting]
Maggie Cheang - The Institute of Cancer Research (United Kingdom)
Abstract: Machine learning methods are applied to analyse genetic data from patients diagnosed with sarcoma. First, unsupervised learning methods such as PCA, t-SNE and k-means clustering are used to divide the patients into several subgroups. Survival analysis is then carried out on these subgroups to identify differences in survival between them, and some subtypes are indeed found to have better survival than others. Cox regression is also applied to each gene individually to select the genes associated with survival. Clustering patients again with unsupervised methods, using only the selected genes, yields a clearer separation among subgroups. Gene enrichment analysis software can further help determine whether any cluster of genes is related to particular pathways or drug targets.
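
The step of comparing survival across the discovered subgroups can be illustrated with a minimal sketch. The abstract does not say which estimator was used; the Kaplan-Meier estimator below is an assumption, and the cluster labels, follow-up times and event indicators are hypothetical toy values, not data from the study. The function names are likewise illustrative.

```python
from collections import defaultdict


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    # Count events and total exits (events + censorings) at each time.
    deaths = defaultdict(int)
    exits = defaultdict(int)
    for t, e in zip(times, events):
        exits[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths[t]
        if d:
            # Product-limit update: S(t) = S(t-) * (1 - d / n_at_risk)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def curves_by_cluster(labels, times, events):
    """Estimate one Kaplan-Meier curve per cluster label."""
    grouped = defaultdict(lambda: ([], []))
    for label, t, e in zip(labels, times, events):
        grouped[label][0].append(t)
        grouped[label][1].append(e)
    return {label: kaplan_meier(ts, es) for label, (ts, es) in grouped.items()}


# Hypothetical toy data: two clusters of four patients each.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
times = [5, 8, 12, 20, 2, 3, 4, 9]
events = [1, 0, 1, 0, 1, 1, 1, 0]

curves = curves_by_cluster(labels, times, events)
for label, curve in sorted(curves.items()):
    print(label, curve)
```

In this toy example the estimated survival of cluster 1 drops faster than that of cluster 0, which is the kind of between-subgroup difference the abstract describes. A real analysis would also need a formal comparison such as a log-rank test, which is not shown here.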