Diagnosing and Improving Topic Models by Analyzing Posterior Variability

Authors

  • Linzi Xing University of Colorado, Boulder
  • Michael Paul University of Colorado, Boulder

DOI:

https://doi.org/10.1609/aaai.v32i1.12033

Abstract

Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters, which has primarily been used to increase the robustness of parameter estimates. In this work, we explore other rich information that can be obtained by analyzing the posterior distributions in topic models. Experimenting with latent Dirichlet allocation on two datasets, we propose methods that incorporate information about the posterior distributions at the topic level and at the word level. At the topic level, we propose a metric called topic stability that measures the variability of the topic parameters under the posterior. We show that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models. At the word level, we experiment with different methods for adjusting individual word probabilities within topics based on their uncertainty. Humans prefer words ranked by our adjusted estimates nearly twice as often when compared to the traditional approach. Finally, we describe how the ideas presented in this work could potentially be applied to other predictive or exploratory models in future work.
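The abstract does not give the exact formulas, but the two ideas can be sketched with simple hypothetical proxies: topic stability as the agreement of a topic's word distribution across posterior samples, and word-level adjustment as a mean probability penalized by posterior spread. The function names and the specific measures (pairwise cosine similarity, mean-minus-standard-deviation) below are illustrative assumptions, not the paper's definitions.

```python
import math

def cosine(u, v):
    # Cosine similarity between two probability vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def topic_stability(samples):
    # Hypothetical proxy for topic stability: mean pairwise cosine
    # similarity of one topic's word distribution across posterior
    # samples. A topic whose parameters barely move under the
    # posterior scores near 1; a volatile topic scores lower.
    sims = [cosine(samples[i], samples[j])
            for i in range(len(samples))
            for j in range(i + 1, len(samples))]
    return sum(sims) / len(sims)

def adjusted_word_scores(samples):
    # One plausible uncertainty adjustment (assumed, not the
    # paper's method): penalize each word's mean probability by
    # its posterior standard deviation, so words whose estimates
    # vary a lot across samples are demoted in the ranking.
    n = len(samples)
    vocab_size = len(samples[0])
    scores = []
    for w in range(vocab_size):
        probs = [s[w] for s in samples]
        mean = sum(probs) / n
        var = sum((p - mean) ** 2 for p in probs) / n
        scores.append(mean - math.sqrt(var))
    return scores
```

For example, three identical posterior samples of a topic yield a stability of 1.0, while three samples that each concentrate on a different word yield a much lower value; the adjusted scores then rank words by how reliably, not just how strongly, they are associated with the topic.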

Published

2018-04-26

How to Cite

Xing, L., & Paul, M. (2018). Diagnosing and Improving Topic Models by Analyzing Posterior Variability. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12033