Exploring Alternative Approaches to Language Modeling for Learning from Data and Knowledge

Authors

  • Yuxin Zi University of South Carolina
  • Kaushik Roy University of South Carolina
  • Vignesh Narayanan University of South Carolina
  • Amit Sheth University of South Carolina

DOI:

https://doi.org/10.1609/aaaiss.v3i1.31211

Keywords:

Language Modeling, Knowledge Graphs, Interpretability

Abstract

Despite their extensive application in language understanding tasks, large language models (LLMs) still encounter challenges including hallucinations - occasional fabrication of information - and alignment issues - lack of associations with human-curated world models (e.g., intuitive physics or common-sense knowledge). Moreover, the black-box nature of LLMs presents significant obstacles in training them effectively to achieve desired behaviors. In particular, modifying the concept embedding spaces of LLMs can be highly intractable. This process involves analyzing the implicit impact of such adjustments on the myriad parameters within LLMs and the resulting inductive biases. We propose a novel architecture that wraps powerful function approximation architectures within an outer, interpretable read-out layer. This read-out layer can be scrutinized to explicitly observe the effects of concept modeling during the training of the LLM. Our method stands in contrast with gradient-based implicit mechanisms, which depend solely on adjustments to the LLM parameters and thus evade scrutiny. By conducting extensive experiments across both generative and discriminative language modeling tasks, we evaluate the capabilities of our proposed architecture relative to state-of-the-art LLMs of similar sizes. Additionally, we offer a qualitative examination of the interpretable read-out layer and visualize the concepts it captures. The results demonstrate the potential of our approach for effectively controlling LLM hallucinations and enhancing the alignment with human expectations.

Downloads

Published

2024-05-20

Issue

Section

Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge