Title: Modelling and clustering of private distributed data
Authors: Sharon Lee - University of Adelaide (Australia) [presenting]
Abstract: As collaborative data analysis becomes increasingly common, privacy has become a major concern. Performing modelling and clustering on distributed data without disclosing the full data is challenging, especially in commercial and healthcare settings, where strict privacy agreements and policies must be followed. We present a privacy-enhanced modification of the EM algorithm for fitting mixture models in a multi-party setting, focusing on the scenario of horizontally partitioned data. Building on the concept of secure sum computation and adopting a cyclic communication network, the proposed two-cycle M-step approach offers protection against information leakage in the case of corrupted parties. The effectiveness of the methodology will be demonstrated through a comparative security analysis and illustrated using normal and t-mixture models on real data.
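
For readers unfamiliar with secure sum computation over a cyclic network, the following is a minimal sketch of the basic single-pass protocol that the abstract builds on. It is not the proposed two-cycle M-step approach, whose details are not given here. The names `MODULUS`, `SCALE`, `encode`, `decode` and `ring_secure_sum`, as well as the toy numbers, are assumptions made for illustration. The ring is simulated in one process: the initiating party adds a random mask to its local value, each subsequent party adds its own value to the running total, and the initiator removes the mask at the end. In this basic form, the two neighbours of a party can collude to recover that party's value, which is the kind of leakage from corrupted parties that motivates stronger schemes.

```python
import secrets

# Hypothetical public parameters agreed on by all parties (assumptions).
MODULUS = 2**61 - 1   # must exceed the magnitude of any encoded total
SCALE = 10**6         # fixed-point scale used to encode real values as integers


def encode(x: float) -> int:
    """Encode a real value as a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Decode a residue, mapping the upper half of the range to negatives."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def ring_secure_sum(local_values):
    """Simulate one pass of a masked sum around a ring of parties.

    local_values[i] is the private value held by party i; party 0 initiates.
    Each party only ever sees a masked running total, never another
    party's individual value.
    """
    mask = secrets.randbelow(MODULUS)      # known only to the initiator
    running = mask
    for value in local_values:             # message passed party to party
        running = (running + encode(value)) % MODULUS
    return decode((running - mask) % MODULUS)   # initiator removes the mask


# Toy M-step for one mixture component mean on horizontally partitioned data:
# each party holds the sum of its local responsibilities and the
# responsibility-weighted sum of its local observations (made-up numbers).
local_resp_sums = [12.5, 7.25, 20.0]
local_weighted_sums = [25.0, 10.875, 70.0]

total_resp = ring_secure_sum(local_resp_sums)
total_weighted = ring_secure_sum(local_weighted_sums)
component_mean = total_weighted / total_resp
```

Only the aggregated sufficient statistics are revealed, which is enough to update the component mean in the M-step; the same pattern would apply to mixing proportions and covariance terms under the same assumptions.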