Adjective Scale Probe: Can Language Models Encode Formal Semantics Information?
Keywords: SNLP: Sentence-Level Semantics and Textual Inference, SNLP: Interpretability & Analysis of NLP Models, SNLP: Lexical & Frame Semantics, Semantic Parsing
Abstract
It is an open question what semantic representations transformer-based language models can encode and whether they have access to more abstract aspects of semantic meaning. Here, we propose a diagnostic dataset to investigate how well language models understand the degree semantics of adjectives. In the dataset, referred to as the Adjective Scale Probe (ASP), we semi-automatically generate 8 tests of Natural Language Inference (NLI) questions, each targeting one of 8 key capabilities of adjective interpretation. We apply the ASP dataset to evaluate the performance of 3 language models: BERT, DeBERTa, and T0. We find that the language models perform below the majority baseline on most tests of the ASP, even when they have been fine-tuned to achieve high performance on the large-scale MNLI dataset. However, after we fine-tune the pre-trained models on a subset of the ASP, DeBERTa achieves high performance on untrained adjectives and untrained tests. This suggests that DeBERTa may have captured degree semantic information of adjectives through pre-training, but needs specific training data to learn how to apply that information to the current tasks. In sum, the ASP provides an easy-to-use method to test fine-grained formal semantic properties of adjectives, and reveals language models' abilities to access formal semantic information.
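The comparison against a majority baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the label strings, and the toy data below are assumptions made purely for illustration, and no numbers here come from the paper. The majority baseline is the accuracy obtained by always predicting the most frequent gold label in a test.

```python
from collections import Counter


def majority_baseline_accuracy(gold_labels):
    """Accuracy of always predicting the most frequent gold label."""
    counts = Counter(gold_labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(gold_labels)


def accuracy(predictions, gold_labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have equal length")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


def beats_majority_baseline(predictions, gold_labels):
    """True only if the model is strictly more accurate than the baseline."""
    return accuracy(predictions, gold_labels) > majority_baseline_accuracy(gold_labels)


# Hypothetical toy data for one NLI test (illustrative only, not from the ASP).
gold = ["entailment", "contradiction", "contradiction", "neutral"]
pred = ["entailment", "entailment", "contradiction", "contradiction"]

print(majority_baseline_accuracy(gold))
print(accuracy(pred, gold))
print(beats_majority_baseline(pred, gold))
```

Computing the baseline per test rather than over the pooled dataset matters when label distributions differ between tests, since a model can look strong overall while failing individual capabilities.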
How to Cite
Liu, W., Xiang, M., & Ding, N. (2023). Adjective Scale Probe: Can Language Models Encode Formal Semantics Information? Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13282-13290. https://doi.org/10.1609/aaai.v37i11.26559
AAAI Technical Track on Speech & Natural Language Processing