A Latent Variable Model for Discovering Bird Species Commonly Misidentified by Citizen Scientists

Authors

  • Jun Yu, Oregon State University
  • Rebecca Hutchinson, Oregon State University
  • Weng-Keen Wong, Oregon State University

DOI:

https://doi.org/10.1609/aaai.v28i1.8763

Keywords:

Machine Learning, Probabilistic Graphical Model, Citizen Science, Crowdsourcing

Abstract

Data quality is a common concern for large-scale citizen science projects such as eBird. In the case of eBird, a major cause of poor-quality data is the misidentification of bird species by inexperienced contributors. A proactive approach to improving data quality is to discover commonly misidentified bird species and to teach inexperienced birders the differences between these species. To accomplish this goal, we develop a latent variable graphical model that can identify groups of bird species that are often confused with each other by eBird participants. Our model is a multi-species extension of the classic occupancy-detection model in the ecology literature. This multi-species extension requires a structure learning step as well as a computationally expensive parameter learning stage, which we make efficient through a variational approximation. We show that our model not only discovers groups of misidentified species but, by including these misidentifications in the model, also predicts both species occupancy and detection more accurately.
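
For context, the classic single-species occupancy-detection model mentioned in the abstract can be sketched as below. This is a minimal illustration and not the authors' multi-species model. It assumes the standard formulation, in which a latent occupancy state is shared across repeated visits to a site and an unoccupied site can never yield a detection. The function and parameter names are hypothetical.

```python
def site_likelihood(occupancy_prob, detection_probs, detections):
    """Likelihood of one site's detection history under a classic
    single-species occupancy-detection model (assumes no false positives).

    occupancy_prob  -- probability that the site is occupied
    detection_probs -- per-visit detection probability, given occupancy
    detections      -- per-visit observations (1 = detected, 0 = not)
    """
    # Probability of the observed history if the site is occupied:
    # each visit is an independent Bernoulli trial.
    p_if_occupied = 1.0
    for d, y in zip(detection_probs, detections):
        p_if_occupied *= d if y == 1 else (1.0 - d)

    # If the site is unoccupied, only an all-zero history is possible.
    p_if_unoccupied = 0.0 if any(detections) else 1.0

    # Marginalize over the latent occupancy state.
    return (occupancy_prob * p_if_occupied
            + (1.0 - occupancy_prob) * p_if_unoccupied)


# Two visits with detection probability 0.5 each, occupancy probability 0.6.
never_seen = site_likelihood(0.6, [0.5, 0.5], [0, 0])
seen_once = site_likelihood(0.6, [0.5, 0.5], [1, 0])
```

Under this sketch, a history with no detections is explained either by absence or by an occupied site where the species was missed, whereas any detection implies occupancy. The abstract's extension relaxes this by letting detections of one species be attributed to confusion with another.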

Published

2014-06-20

How to Cite

Yu, J., Hutchinson, R., & Wong, W.-K. (2014). A Latent Variable Model for Discovering Bird Species Commonly Misidentified by Citizen Scientists. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.8763

Section

Computational Sustainability and Artificial Intelligence