PandemIQ Llama: A Domain-Adapted Foundation Model for Enhanced Pandemic Intelligence
DOI:
https://doi.org/10.1609/aaai.v40i46.41301
Abstract
We introduce PandemIQ Llama, a domain-adapted large language model (LLM) designed specifically for pandemic intelligence applications. Building on the pre-trained Llama-3.1-8B model, we performed continuous training on our curated Pandemic Corpus. This dataset was assembled from authoritative public health sources, scientific literature, and specialized knowledge repositories; it comprises 508,924 documents totaling 5.8 billion tokens, making it the largest pandemic-domain-specific dataset for LLM training. Benefiting from this large curated corpus and from continuous training with extensive computational resources, PandemIQ Llama can extract critical domain knowledge on pandemics that is typically underrepresented in general-purpose language models. To evaluate its performance, we conducted a comprehensive comparison of PandemIQ Llama against both prompt-engineered and task-specific fine-tuned baseline models on two tasks: the Biomedical Alert News Question Answering task (1,508 reports with 30 expert-generated questions each) and the Disease Event Type Classification benchmark (4,500 news snippets across eight disease categories). PandemIQ Llama demonstrated substantial improvements over strong baseline models, achieving performance gains ranging from 3.8% to 10.97%. These results suggest that PandemIQ Llama could significantly enhance public health surveillance and analysis capabilities. Our results also suggest that LLMs can perform better on domain-specific tasks with continuous training than with fine-tuning. Social Impact: The BEACON platform, powered by our model, has launched and now serves over 100 government and multilateral public health organizations and users across 154 countries. Analytics from the platform are being integrated into the Epidemic Intelligence from Open Sources system run by the World Health Organization. This integration will provide public health decision-makers with a powerful LLM-based tool for pandemic surveillance.
Published
2026-03-14
How to Cite
Yang, J., Talaei, M., Lassmann, B., Bhadelia, N., & Paschalidis, I. C. (2026). PandemIQ Llama: A Domain-Adapted Foundation Model for Enhanced Pandemic Intelligence. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39504–39512. https://doi.org/10.1609/aaai.v40i46.41301
Issue
Section
AAAI Special Track on AI for Social Impact II