Domain-Specific Retrieval for Retrieval-Augmented Generation: A Case Study on Pertussis Research (Student Abstract)
DOI:
https://doi.org/10.1609/aaai.v40i48.42286
Abstract
Integrating knowledge from scientific literature is essential in biomedical research. However, the rapid growth of scientific literature makes staying up to date increasingly challenging. Retrieval-Augmented Generation (RAG) offers a promising framework, but its effectiveness in specialized biomedical domains remains unclear. In this work, we propose a two-stage retrieval pipeline for RAG, with a focus on Bordetella pertussis as a case study. Our method first applies hard filtering with synonym expansion to eliminate irrelevant passages, and then performs hybrid search, followed by reranking. We evaluate our approach using a dataset of 58 pertussis-related queries with automatic relevance judgments from multiple large language models (LLMs). Experimental results show that our pipeline improves MAP@10 by 13.4-20.4 points compared with existing methods and achieves the highest MRR@10. Furthermore, consistent improvements across different LLMs highlight the effectiveness of our approach.
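The two-stage pipeline and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the synonym list, the sparse and dense scorers, the fusion rule, or the reranker, so the synonym table, the term-count and character-trigram scorers, reciprocal rank fusion, and every function name below are assumptions chosen only to keep the example self-contained. The AP@k normalization shown (dividing by the number of relevant results found in the top k) is one common convention and may differ from the one used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical synonym table; the abstract does not list the actual terms.
SYNONYMS = {"pertussis": ["pertussis", "whooping cough"]}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def hard_filter(passages, topic):
    """Stage 1: drop passages that mention neither the topic nor a synonym."""
    terms = SYNONYMS.get(topic, [topic])
    return [p for p in passages if any(t in p.lower() for t in terms)]


def lexical_score(query, passage):
    """Count query-term occurrences (stand-in for a sparse scorer such as BM25)."""
    counts = Counter(tokenize(passage))
    return sum(counts[t] for t in set(tokenize(query)))


def trigram_cosine(query, passage):
    """Cosine over character trigrams (stand-in for dense embedding similarity)."""
    def grams(text):
        s = " ".join(tokenize(text))
        return Counter(s[i:i + 3] for i in range(len(s) - 2))
    a, b = grams(query), grams(passage)
    dot = sum(a[g] * b[g] for g in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query, passages, k=60):
    """Stage 2: fuse both rankings with reciprocal rank fusion (an assumed rule)."""
    fused = Counter()
    for scorer in (lexical_score, trigram_cosine):
        ranked = sorted(passages, key=lambda p: scorer(query, p), reverse=True)
        for rank, p in enumerate(ranked, start=1):
            fused[p] += 1.0 / (k + rank)
    return sorted(passages, key=lambda p: fused[p], reverse=True)


def rerank(query, candidates, scorer, top_n=10):
    """Reorder the top candidates with a second scorer (stand-in for a reranker model)."""
    head = sorted(candidates[:top_n], key=lambda p: scorer(query, p), reverse=True)
    return head + candidates[top_n:]


def retrieve(query, passages, topic, scorer=trigram_cosine):
    """Hard filter, then hybrid search, then reranking."""
    return rerank(query, hybrid_search(query, hard_filter(passages, topic)), scorer)


def reciprocal_rank_at_k(relevance, k=10):
    """1/rank of the first relevant result within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(relevance, k=10):
    """Mean of precision@i over the relevant positions i <= k."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

Here `relevance` is a list of 0/1 judgments in ranked order for one query. MAP@10 and MRR@10 are the means of `average_precision_at_k` and `reciprocal_rank_at_k` over all queries, and a gain reported in "points" refers to these means expressed on a 0 to 100 scale.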
Published
2026-03-14
How to Cite
Takabatake, H., Martono, N. P., Kuwae, A., Iuchi, T., & Ohwada, H. (2026). Domain-Specific Retrieval for Retrieval-Augmented Generation: A Case Study on Pertussis Research (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41403–41405. https://doi.org/10.1609/aaai.v40i48.42286
Section
AAAI Student Abstract and Poster Program