Contrastive Learning Reduces Hallucination in Conversations


  • Weiwei Sun, Shandong University
  • Zhengliang Shi, Shandong University
  • Shen Gao, Shandong University
  • Pengjie Ren, Shandong University
  • Maarten de Rijke, University of Amsterdam
  • Zhaochun Ren, Shandong University



SNLP: Conversational AI/Dialogue Systems, SNLP: Applications, SNLP: Generation, SNLP: Language Models


Pre-trained language models (LMs) store knowledge in their parameters and can generate informative responses when used in conversational systems. However, LMs suffer from the problem of “hallucination”: they may generate plausible-looking statements that are irrelevant or factually incorrect. To address this problem, we propose a contrastive learning scheme, named MixCL. MixCL introduces a novel mixed contrastive objective that explicitly optimizes the implicit knowledge elicitation process of LMs, and thus reduces their hallucination in conversations. We also examine negative sampling strategies that use retrieved hard negatives and model-generated negatives. We conduct experiments on Wizard-of-Wikipedia, a public, open-domain knowledge-grounded dialogue benchmark, and assess the effectiveness of MixCL. MixCL effectively reduces the hallucination of LMs in conversations and achieves the highest performance among LM-based dialogue agents in terms of relevancy and factuality. We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches while enjoying notable advantages in terms of efficiency and scalability.




How to Cite

Sun, W., Shi, Z., Gao, S., Ren, P., de Rijke, M., & Ren, Z. (2023). Contrastive Learning Reduces Hallucination in Conversations. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13618–13626.



AAAI Technical Track on Speech & Natural Language Processing