Robust Evaluation Measures for Evaluating Social Biases in Masked Language Models

Authors

  • Yang Liu, Tianjin University

DOI:

https://doi.org/10.1609/aaai.v38i17.29834

Keywords:

NLP: Ethics -- Bias, Fairness, Transparency & Privacy, NLP: Safety and Robustness, NLP: Interpretability, Analysis, and Evaluation of NLP Models

Abstract

Many evaluation measures are used to assess social biases in masked language models (MLMs). However, we find that these previously proposed measures lack robustness when the evaluation data are limited. This is because they compare the pseudo-log-likelihood (PLL) scores of stereotypical and anti-stereotypical samples through an indicator function, which exploits the PLL score sets only shallowly and fails to capture their distributional information. In this paper, we represent each PLL score set as a Gaussian distribution and use Kullback-Leibler (KL) divergence and Jensen–Shannon (JS) divergence to construct evaluation measures over the distributions of stereotypical and anti-stereotypical PLL scores. Experimental results on the publicly available datasets StereoSet (SS) and CrowS-Pairs (CP) show that our proposed measures are significantly more robust and interpretable than those proposed previously.
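To make the construction in the abstract concrete, the sketch below fits a Gaussian to each PLL score set and compares the two distributions with KL and JS divergence. It is a minimal illustration, not the paper's exact formulation: the function names (gaussian_kl, gaussian_js, divergence_bias_scores), the closed-form KL between univariate Gaussians, and the grid-based numerical JS are all assumptions introduced here for clarity; the paper's precise scoring and normalization may differ.

```python
import numpy as np
from scipy.stats import norm

def gaussian_kl(mu1, sigma1, mu2, sigma2):
    """Closed-form KL divergence KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))."""
    return (np.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

def gaussian_js(mu1, sigma1, mu2, sigma2, grid_points=10_000):
    """Numerical JS divergence between two univariate Gaussians.

    JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = 0.5 * (P + Q).
    The mixture M is not Gaussian, so we integrate on a grid instead of
    using a closed form.
    """
    lo = min(mu1 - 6 * sigma1, mu2 - 6 * sigma2)
    hi = max(mu1 + 6 * sigma1, mu2 + 6 * sigma2)
    x = np.linspace(lo, hi, grid_points)
    p = norm.pdf(x, mu1, sigma1)
    q = norm.pdf(x, mu2, sigma2)
    m = 0.5 * (p + q)
    eps = 1e-12  # guard against log(0) in the distribution tails
    kl_pm = np.trapz(p * np.log((p + eps) / (m + eps)), x)
    kl_qm = np.trapz(q * np.log((q + eps) / (m + eps)), x)
    return 0.5 * (kl_pm + kl_qm)

def divergence_bias_scores(stereo_plls, antistereo_plls):
    """Fit a Gaussian to each PLL score set, then compare the distributions.

    stereo_plls / antistereo_plls: arrays of PLL scores for the
    stereotypical and anti-stereotypical samples (hypothetical inputs).
    """
    mu_s, sigma_s = np.mean(stereo_plls), np.std(stereo_plls, ddof=1)
    mu_a, sigma_a = np.mean(antistereo_plls), np.std(antistereo_plls, ddof=1)
    return {
        "kl": gaussian_kl(mu_s, sigma_s, mu_a, sigma_a),
        "js": gaussian_js(mu_s, sigma_s, mu_a, sigma_a),
    }
```

Under this reading, an unbiased MLM would yield nearly identical Gaussians for the two score sets (divergences near zero), while a biased one would shift or reshape one distribution relative to the other; unlike an indicator-function count, the divergences also reflect variance, which is where the claimed robustness on small datasets comes from.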

Published

2024-03-24

How to Cite

Liu, Y. (2024). Robust Evaluation Measures for Evaluating Social Biases in Masked Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 18707-18715. https://doi.org/10.1609/aaai.v38i17.29834

Section

AAAI Technical Track on Natural Language Processing II