CMStatistics 2023
B1120
Title: Investigating the impact of content similarity on density-based clustering of social networks
Authors: Sara Geremia - University of Trieste (Italy)
Domenico De Stefano - University of Trieste (Italy) [presenting]
Abstract: A novel approach to density-based clustering in social networks is presented that incorporates content similarity among nodes. The aim is to improve the clustering process and provide a deeper understanding of network structure and dynamics. The proposed method represents each node's characteristics as a content vector and computes pairwise similarities between nodes using cosine similarity. From the resulting similarity matrix, a content adjacency matrix is constructed by retaining only the edges corresponding to the highest similarity values. A density-based clustering algorithm is then applied to the content network, and the influence of homophily (the tendency of nodes with similar content to be more connected) on community detection performance is examined. Higher levels of homophily result in improved intra-cluster connectivity, distinct community boundaries, and enhanced cluster coherence. Conversely, heterophily affects inter-cluster connectivity and community integration or segregation. Moderate heterophily fosters inter-community interactions, but excessive levels can compromise accuracy and distinctiveness. Integrating content similarity significantly enhances the accuracy of community detection in social networks in which similar individuals tend to connect. The approach uncovers meaningful clusters that capture both structural and content-based patterns.
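The construction of the content adjacency matrix described in the abstract might be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how "the highest similarity values" are selected, so the `keep_fraction` parameter, the function names, and the toy content vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def content_adjacency(vectors, keep_fraction=0.25):
    """Binary, symmetric adjacency matrix over nodes.

    Keeps only the node pairs whose cosine similarity ranks in the top
    `keep_fraction` of all pairs. `keep_fraction` is a hypothetical
    parameter: the abstract does not specify the selection rule.
    """
    n = len(vectors)
    # Similarity of every unordered pair (i, j) with i < j.
    pairs = [
        (cosine_similarity(vectors[i], vectors[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    ]
    pairs.sort(key=lambda t: t[0], reverse=True)
    n_keep = max(1, round(keep_fraction * len(pairs))) if pairs else 0

    adjacency = [[0] * n for _ in range(n)]
    for _, i, j in pairs[:n_keep]:
        adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


# Toy content vectors (made up): nodes 0 and 1 share one topic, nodes 2 and 3 another.
content_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
adjacency = content_adjacency(content_vectors, keep_fraction=0.34)
for row in adjacency:
    print(row)
```

In this toy case only the two most similar pairs are retained, so the content network splits into two groups of like-minded nodes. A density-based clustering algorithm of the reader's choice could then be run on `adjacency`; the abstract does not name the specific algorithm used.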