Visualization Techniques for Topic Model Checking
Keywords: visualization, topic modeling, NLP, applications
Topic models remain, in many respects, a black box for both modelers and end users. From the modelers' perspective, many decisions must be made that lack clear rationales and whose interactions are unclear: for example, how many topics the algorithms should find (K), which words to ignore (the "stop list"), and whether it is adequate to run the modeling process once or multiple times, given that separate runs produce different results because the inference algorithms only approximate the Bayesian posterior. Furthermore, the results of different parameter settings are hard to analyze, summarize, and visualize, making model comparison difficult. From the end users' perspective, it is hard to understand why the models perform as they do, and information-theoretic similarity measures do not fully align with humanistic interpretation of the topics. We present the Topic Explorer, which advances the state of the art in topic model visualization for document-document and topic-document relations. It brings topic models to life in a way that fosters deep understanding of both corpus and models, allowing users to generate interpretive hypotheses and to suggest further experiments. Such tools are an essential step toward assessing whether topic modeling is a suitable technique for AI and cognitive modeling applications.