Learning Robust Rationales for Model Explainability: A Guidance-Based Approach

Authors

  • Shuaibo Hu, School of Computer and Information, Hefei University of Technology
  • Kui Yu, School of Computer and Information, Hefei University of Technology

DOI:

https://doi.org/10.1609/aaai.v38i16.29783

Keywords:

NLP: Interpretability, Analysis, and Evaluation of NLP Models

Abstract

Selective rationalization can be regarded as a straightforward self-explaining approach for enhancing model explainability in natural language processing tasks. It aims to provide explanations that are more accessible and understandable to non-technical users by first selecting subsets of the input text as rationales and then making predictions based on the chosen subsets. However, existing methods that follow this select-then-predict framework may suffer from the rationalization degeneration problem, resulting in sub-optimal or unsatisfactory rationales that do not align with human judgments. This problem may further lead to rationalization failure, producing meaningless rationales that ultimately undermine people's trust in the rationalization model. To address these challenges, we propose a Guidance-based Rationalization method (G-RAT) that effectively improves robustness against failure situations and the quality of rationales by using a guidance module to regularize selections and distributions. Experimental results on two synthetic settings show that our method is robust to the rationalization degeneration and failure problems, while the results on two real-world datasets show its effectiveness in providing rationales in line with human judgments. The source code is available at https://github.com/shuaibo919/g-rat.
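For readers unfamiliar with the select-then-predict framework the abstract refers to, the sketch below illustrates its generic structure: a selector scores input tokens and produces a near-binary rationale mask, and a predictor classifies using only the masked-in tokens. This is a minimal, assumption-laden illustration of the framework in general, not of G-RAT itself; the guidance module that regularizes selections and distributions is not detailed in the abstract, and all module names, layer sizes, and the straight-through masking trick are illustrative choices.

    import torch
    import torch.nn as nn

    class SelectThenPredict(nn.Module):
        """Generic select-then-predict rationalizer (illustrative sketch, not G-RAT)."""

        def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            # Selector: scores each token; a near-binary rationale mask is derived from the scores.
            self.selector_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.select_head = nn.Linear(2 * hidden_dim, 1)
            # Predictor: classifies the input using only the selected (masked-in) tokens.
            self.predictor_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.classify_head = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, token_ids):
            emb = self.embedding(token_ids)                      # (B, T, E)
            sel_states, _ = self.selector_rnn(emb)               # (B, T, 2H)
            scores = self.select_head(sel_states).squeeze(-1)    # (B, T) token selection logits
            probs = torch.sigmoid(scores)
            # Straight-through hard mask: forward pass is binary, gradients flow through probs.
            hard_mask = (probs > 0.5).float()
            mask = hard_mask + probs - probs.detach()
            masked_emb = emb * mask.unsqueeze(-1)                # keep only rationale tokens
            pred_states, _ = self.predictor_rnn(masked_emb)
            logits = self.classify_head(pred_states.mean(dim=1))  # (B, num_classes)
            return logits, mask

Degeneration arises when the selector and predictor co-adapt so that the mask encodes the label rather than a human-plausible rationale; the guidance module described in the abstract adds regularization on the selections and their distributions to counter this, which would enter the sketch above as an extra loss term on the returned mask.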

Published

2024-03-24

How to Cite

Hu, S., & Yu, K. (2024). Learning Robust Rationales for Model Explainability: A Guidance-Based Approach. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 18243-18251. https://doi.org/10.1609/aaai.v38i16.29783

Issue

Vol. 38 No. 16
Section

AAAI Technical Track on Natural Language Processing I