Where It Really Matters: Few-Shot Environmental Conservation Media Monitoring for Low-Resource Languages

Authors

  • Sameer Jain Carnegie Mellon University
  • Sedrick Scott Keh Carnegie Mellon University
  • Shova Chhetri World Wide Fund for Nature
  • Karun Dewan World Wide Fund for Nature
  • Pablo Izquierdo World Wide Fund for Nature
  • Johanna Prussmann World Wide Fund for Nature
  • Pooja Shrestha World Wide Fund for Nature
  • César Suárez World Wide Fund for Nature
  • Zheyuan Ryan Shi University of Pittsburgh
  • Lei Li Carnegie Mellon University
  • Fei Fang Carnegie Mellon University

DOI:

https://doi.org/10.1609/aaai.v38i20.30218

Keywords:

General

Abstract

Environmental conservation organizations routinely monitor news content on conservation in protected areas to maintain situational awareness of developments that can have an environmental impact. Existing automated media monitoring systems require large amounts of data labeled by domain experts, which is only feasible at scale for high-resource languages like English. However, such tools are most needed in the Global South, where the news of interest is mainly in local low-resource languages and far fewer experts are available to annotate datasets on a sustainable basis. In this paper, we propose NewsSerow, a method to automatically recognize environmental conservation content in low-resource languages. NewsSerow is a pipeline of summarization, in-context few-shot classification, and self-reflection using large language models (LLMs). Using at most 10 demonstration example news articles in Nepali, NewsSerow significantly outperforms other few-shot methods and can achieve performance comparable to that of models fully fine-tuned on thousands of examples. With NewsSerow, Organization X has been able to deploy the media monitoring tool in Nepal, significantly reducing their operational burden and ensuring that AI tools for conservation actually reach the communities that need them the most. NewsSerow has also been deployed in countries with other languages, such as Colombia.
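The three stages named in the abstract (summarization, in-context few-shot classification, self-reflection) could be chained roughly as in the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the `Demo` record, the prompt wording, and the rule that a failed reflection flips the label are all hypothetical. Only the stage order and the limit of 10 demonstrations come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]

MAX_DEMOS = 10  # the abstract reports using at most 10 demonstration articles


@dataclass(frozen=True)
class Demo:
    """One labeled demonstration article (field names are illustrative)."""
    summary: str
    relevant: bool


def _yes_no(flag: bool) -> str:
    return "yes" if flag else "no"


def summarize(llm: LLM, article: str) -> str:
    # Stage 1: condense the article so the later prompts stay short.
    return llm(f"Summarize this news article in a few sentences:\n{article}").strip()


def classify(llm: LLM, summary: str, demos: Sequence[Demo]) -> bool:
    # Stage 2: in-context few-shot classification using the demonstrations.
    shots = "\n\n".join(
        f"Article: {d.summary}\nRelevant: {_yes_no(d.relevant)}" for d in demos
    )
    prompt = (
        "Is the following article about environmental conservation? "
        "Answer yes or no.\n\n"
        f"{shots}\n\nArticle: {summary}\nRelevant:"
    )
    return llm(prompt).strip().lower().startswith("yes")


def reflect(llm: LLM, summary: str, relevant: bool) -> bool:
    # Stage 3: ask the model to re-examine its own answer.
    # Assumption: the model replies 'keep' or 'flip'.
    prompt = (
        f"Reflect on this decision. The answer '{_yes_no(relevant)}' was given to "
        "the question of whether the article is about environmental conservation. "
        f"Reply 'keep' or 'flip'.\n\nArticle: {summary}"
    )
    return llm(prompt).strip().lower().startswith("keep")


def monitor(llm: LLM, article: str, demos: Sequence[Demo]) -> bool:
    """Return True if the article is judged conservation-relevant."""
    if len(demos) > MAX_DEMOS:
        raise ValueError(f"expected at most {MAX_DEMOS} demonstrations")
    summary = summarize(llm, article)
    relevant = classify(llm, summary, demos)
    if not reflect(llm, summary, relevant):
        relevant = not relevant  # assumption: a failed reflection flips the label
    return relevant
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider, so the same `monitor` function can be exercised with a stub during testing and with a real client in deployment.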

Published

2024-03-24

How to Cite

Jain, S., Keh, S. S., Chhetri, S., Dewan, K., Izquierdo, P., Prussmann, J., Shrestha, P., Suárez, C., Shi, Z. R., Li, L., & Fang, F. (2024). Where It Really Matters: Few-Shot Environmental Conservation Media Monitoring for Low-Resource Languages. Proceedings of the AAAI Conference on Artificial Intelligence, 38(20), 22141-22149. https://doi.org/10.1609/aaai.v38i20.30218