Safe RAG by RAG: Untying the Bell That RAG Rang with the RAG Hand

Authors

  • Xun Liang Renmin University of China
  • Mengwei Wang Renmin University of China
  • Yuefeng Ma Qufu Normal University
  • Simin Niu Renmin University of China

DOI:

https://doi.org/10.1609/aaai.v40i38.40462

Abstract

Retrieval-augmented generation (RAG) is widely adopted for knowledge-intensive tasks, but unverified external knowledge can pose risks such as data injection and retrieval pollution, leading to unexpected generation. Existing defenses rely on patch-based fixes, which limit generalization and increase system latency. To address these issues, we propose RAG2RAG, a framework-level security solution specifically designed for RAG. Inspired by human intuition to reason about what can and cannot be said during RAG phase, RAG2RAG augments the main RAG module with a lightweight RAG-based security expert module composed of two components: (1) a Detective that dynamically retrieves supporting evidence, and (2) a Judge that makes final decisions based on retrieved context. The main and expert modules operate in parallel without causing noticeable delays. Experiments across two languages, six domains, and seven types of poisoning attacks demonstrate that RAG2RAG overall achieves higher accuracy and lower attack success rates than seven mainstream baselines. Furthermore, it integrates seamlessly with various RAG architectures, offering efficient protection across diverse threat scenarios.

Downloads

Published

2026-03-14

How to Cite

Liang, X., Wang, M., Ma, Y., & Niu, S. (2026). Safe RAG by RAG: Untying the Bell That RAG Rang with the RAG Hand. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 31925–31933. https://doi.org/10.1609/aaai.v40i38.40462

Issue

Section

AAAI Technical Track on Natural Language Processing III