Domain-Specific Retrieval for Retrieval-Augmented Generation: A Case Study on Pertussis Research (Student Abstract)

Authors

  • Hiroki Takabatake Tokyo University of Science
  • Niken Prasasti Martono Tokyo University of Science
  • Asaomi Kuwae Kitasato University
  • Toshihiko Iuchi Tokyo University of Science
  • Hayato Ohwada Tokyo University of Science

DOI:

https://doi.org/10.1609/aaai.v40i48.42286

Abstract

Integrating knowledge from scientific literature is essential in biomedical research. However, the rapid growth of scientific literature makes staying up to date increasingly challenging. Retrieval-Augmented Generation (RAG) offers a promising framework, but its effectiveness in specialized biomedical domains remains unclear. In this work, we propose a two-stage retrieval pipeline for RAG, with a focus on Bordetella pertussis as a case study. Our method first applies hard filtering with synonym expansion to eliminate irrelevant passages, and then performs hybrid search, followed by reranking. We evaluate our approach using a dataset of 58 pertussis-related queries with automatic relevance judgments from multiple large language models (LLMs). Experimental results show that our pipeline improves MAP@10 by 13.4–20.4 points compared with existing methods and achieves the highest MRR@10. Furthermore, consistent improvements across different LLMs highlight the effectiveness of our approach.
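As a rough illustration of the terms used in the abstract, the sketch below shows one way a hard filter with synonym expansion could be written, together with common definitions of the two reported metrics. It is not the authors' implementation. The synonym table, the function names, and the substring-matching rule are assumptions made for the example, and the hybrid search and reranking stages are omitted because the abstract gives no details about them. AP@10 is normalized here by the number of relevant passages found in the top 10, which is one of several conventions in use.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical synonym table: the abstract does not list the terms actually used.
SYNONYMS: Dict[str, List[str]] = {
    "pertussis": ["pertussis", "whooping cough", "b. pertussis"],
}


def hard_filter(passages: Sequence[str], term: str) -> List[str]:
    """Keep only passages that mention the term or one of its synonyms.

    Matching is a case-insensitive substring test (an assumption).
    """
    variants = [v.lower() for v in SYNONYMS.get(term.lower(), [term])]
    return [p for p in passages if any(v in p.lower() for v in variants)]


def reciprocal_rank_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """1 / rank of the first relevant passage in the top k, or 0 if none."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Mean of precision@rank over the relevant passages in the top k."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_over_queries(
    metric: Callable[[Sequence[int], int], float],
    runs: Sequence[Sequence[int]],
    k: int = 10,
) -> float:
    """Average a per-query metric over all queries (gives MAP@k or MRR@k)."""
    return sum(metric(run, k) for run in runs) / len(runs)


if __name__ == "__main__":
    passages = ["Whooping cough cases rose.", "Influenza vaccine trial."]
    print(hard_filter(passages, "pertussis"))

    # One binary relevance list per query, in ranked order (toy data).
    runs = [[1, 0, 1], [0, 0, 1]]
    print(mean_over_queries(average_precision_at_k, runs))
    print(mean_over_queries(reciprocal_rank_at_k, runs))
```

In this sketch each query contributes a ranked list of binary relevance labels, such as those an LLM judge might produce, and MAP@10 and MRR@10 are the per-query values averaged over all queries.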

Published

2026-03-14

How to Cite

Takabatake, H., Martono, N. P., Kuwae, A., Iuchi, T., & Ohwada, H. (2026). Domain-Specific Retrieval for Retrieval-Augmented Generation: A Case Study on Pertussis Research (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41403–41405. https://doi.org/10.1609/aaai.v40i48.42286