Improving Quantification with Minimal In-Domain Annotations: Beyond Classify and Count
DOI:
https://doi.org/10.1609/icwsm.v18i1.31411
Abstract
Quantification is the task of estimating the class distribution in a given collection. With the growing availability of classification models, the use of classifiers for quantification has become increasingly popular, carrying the promise of eliminating the need for manual annotation. However, the naive classify and count approach presents clear limitations, especially evident in the face of domain discrepancies. In this work, we introduce two novel quantification methods, called CPCC and BCC, which can adapt to new target datasets with a small number of annotated in-domain samples (N = 100). To explore their real-world applicability, we apply our methods to a range of quantification tasks in the realm of hateful and offensive language, where they perform markedly better than classify and count and other existing methods.
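To make the baseline concrete, the following is a minimal sketch of classify and count in Python. It is not the CPCC or BCC method, which the abstract does not specify. The function names, labels, and the classifier error rates (a true positive rate of 0.8 and a false positive rate of 0.1) are hypothetical values chosen for illustration only.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes):
    """Estimate class prevalence as the share of items predicted into each class."""
    if not predicted_labels:
        raise ValueError("need at least one prediction")
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {c: counts[c] / total for c in classes}


def expected_cc_estimate(true_prevalence, tpr, fpr):
    """Expected binary classify-and-count estimate for an imperfect classifier.

    True positives contribute true_prevalence * tpr and false positives
    contribute (1 - true_prevalence) * fpr to the predicted positive share.
    """
    return true_prevalence * tpr + (1 - true_prevalence) * fpr


# Hypothetical classifier outputs on a small target collection.
predictions = ["offensive", "neutral", "neutral", "offensive", "neutral"]
estimate = classify_and_count(predictions, classes=["offensive", "neutral"])

# Hypothetical setting: 10% of items are truly offensive, TPR = 0.8, FPR = 0.1.
biased = expected_cc_estimate(0.10, tpr=0.8, fpr=0.1)
```

Under these assumed rates the expected estimate is 0.17 rather than the true 0.10, because false positives from the large negative class outweigh the missed positives. When the classifier's error rates change on a new domain, this bias changes with them, which is the limitation the abstract points to.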
Published
2024-05-28
How to Cite
von Däniken, P., Deriu, J. M., Rodrigo, A., & Cieliebak, M. (2024). Improving Quantification with Minimal In-Domain Annotations: Beyond Classify and Count. Proceedings of the International AAAI Conference on Web and Social Media, 18(1), 1585-1598. https://doi.org/10.1609/icwsm.v18i1.31411
Section
Full Papers