View Submission - COMPSTAT2023
A0281
Title: Visualizing topic uncertainty in topic modelling
Authors: Peter Winker - University of Giessen (Germany) [presenting]
Abstract: Word clouds have become a standard tool for presenting the results of natural language processing methods such as topic modelling. They display the most important words of a topic, with word size often chosen proportional to each word's relevance within the topic. In the latent Dirichlet allocation (LDA) model, a word cloud is a graphical representation of the vector of word weights for a topic. These vectors are the result of a statistical procedure applied to a specific corpus. They are therefore subject to uncertainty from several sources, such as sample selection, random components in the optimization algorithm, and parameter settings. A novel approach for presenting word clouds that includes information on these types of uncertainty is introduced and illustrated with an application of the LDA model to conference abstracts.
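The abstract does not specify how the uncertainty is encoded in the word cloud, so the following is only a minimal sketch of one possible reading, not the approach presented in the talk. It assumes that the same topic has been estimated several times (for example with different random seeds) and that the topics have already been matched across runs. The function names, the toy weights, and the choice of mapping the mean weight to font size and the coefficient of variation to opacity are all hypothetical illustrations.

```python
from statistics import mean, stdev


def summarize_topic_weights(runs):
    """Mean and standard deviation of each word's weight across runs.

    `runs` is a non-empty list of dicts (word -> weight), one per
    estimation run, all describing the same, already aligned topic.
    A word missing from a run is treated as having weight 0.0.
    """
    words = sorted(set().union(*runs))
    summary = {}
    for word in words:
        weights = [run.get(word, 0.0) for run in runs]
        spread = stdev(weights) if len(weights) > 1 else 0.0
        summary[word] = (mean(weights), spread)
    return summary


def cloud_attributes(summary, min_size=10.0, max_size=40.0):
    """Map (mean, sd) to (font size, opacity) for a word cloud.

    Assumed encoding (hypothetical): font size grows linearly with the
    mean weight; opacity shrinks with the coefficient of variation and
    is floored at 0.2 so that uncertain words remain visible.
    """
    top = max(m for m, _ in summary.values())
    attrs = {}
    for word, (m, s) in summary.items():
        share = m / top if top > 0 else 0.0
        size = min_size + (max_size - min_size) * share
        cv = s / m if m > 0 else 0.0
        opacity = max(0.2, 1.0 - cv)
        attrs[word] = (round(size, 1), round(opacity, 2))
    return attrs


# Toy weights for one topic from three hypothetical runs (made-up numbers).
example_runs = [
    {"model": 0.30, "data": 0.20, "cloud": 0.05},
    {"model": 0.28, "data": 0.10, "cloud": 0.06},
    {"model": 0.32, "data": 0.15, "cloud": 0.01},
]
for word, (size, opacity) in cloud_attributes(
    summarize_topic_weights(example_runs)
).items():
    print(f"{word}: size={size}, opacity={opacity}")
```

The resulting sizes and opacities could be passed to any word cloud renderer; a word whose weight is stable across runs is drawn fully opaque, while a word whose weight varies strongly is faded.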