Efficient Context Retention in LLMs: Enhancing In-Context Memorization as an Alternative
DOI:
https://doi.org/10.1609/aaaiss.v7i1.36933
Abstract
Large Language Models (LLMs) are widely utilized for tasks requiring contextual understanding; however, their reliance on large context windows introduces significant computational overhead due to the transformer's quadratic complexity. This inefficiency is a critical barrier to their deployment in resource-constrained settings such as rural healthcare, where processing longitudinal patient data from Electronic Health Records (EHRs) is essential. To address this, our research investigates an alternative paradigm: training lightweight, specialized models for complete knowledge internalization, enabling them to function as persistent and efficient knowledge bases on local hardware. Our methodology involves training a 12-layer, 124-million-parameter nanoGPT model de novo on specialized subsets of the MMLU benchmark, including domains relevant to healthcare. The training objective was explicitly data internalization, not generalization. The entire domain-specific corpus, consisting of over 250,000 tokens formatted for a question-and-answer recall task, was used for training until the model achieved near-zero training loss. Performance was then evaluated on the model's ability to perfectly reproduce answers from a "seen" validation set, with recall certainty quantified via softmax probabilities. The resulting models successfully internalized their respective knowledge domains, achieving near-100% accuracy on recall tasks with high confidence scores. This outcome validates that targeted training for memorization can produce reliable and computationally efficient expert agents. For rural health, this approach offers a practical alternative to large context windows, enabling the deployment of a fleet of specialized models on local hardware for tasks such as patient history recall or clinical guideline retrieval. This drastically reduces computational costs and latency, providing a scalable solution without requiring continuous, high-bandwidth cloud access.
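The abstract describes quantifying recall certainty via softmax probabilities over answer tokens. The sketch below (not the authors' code) illustrates one way this can be done: it scores a hypothetical memorized question-answer pair by reading off the softmax probability the model assigns to each answer token given the preceding context. It uses the 124-million-parameter GPT-2 checkpoint from Hugging Face as a stand-in for the trained nanoGPT model, and the prompt/answer strings are illustrative assumptions.

import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Stand-in for the domain-specific 124M-parameter model described in the abstract.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# Hypothetical Q&A recall item in the question-and-answer format the corpus uses.
question = "Q: Which vitamin deficiency causes scurvy?\nA:"
answer = " Vitamin C"

q_ids = tokenizer(question, return_tensors="pt").input_ids
a_ids = tokenizer(answer, return_tensors="pt").input_ids
input_ids = torch.cat([q_ids, a_ids], dim=1)

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# Softmax over the vocabulary at each position; row i predicts token i+1.
probs = F.softmax(logits[0, :-1], dim=-1)

# Probability assigned to each answer token, conditioned on everything before it.
answer_positions = range(q_ids.shape[1] - 1, input_ids.shape[1] - 1)
token_probs = [probs[pos, input_ids[0, pos + 1]].item() for pos in answer_positions]

# Recall certainty summarized as the mean per-token probability; a fully
# memorized pair should yield probabilities near 1.0 for every answer token.
print("per-token probabilities:", [round(p, 3) for p in token_probs])
print("recall certainty (mean):", sum(token_probs) / len(token_probs))

Under this reading, a model trained to near-zero loss on the seen corpus would concentrate nearly all probability mass on the memorized answer tokens, which is consistent with the high confidence scores reported in the abstract.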
Published
2025-11-23
How to Cite
Patel, B., & Kim, E. (2025). Efficient Context Retention in LLMs: Enhancing In-Context Memorization as an Alternative. Proceedings of the AAAI Symposium Series, 7(1), 566-566. https://doi.org/10.1609/aaaiss.v7i1.36933
Issue
Section
Safe, Ethical, Certified, Uncertainty-aware, Robust, and Explainable AI for Health (SECURE-AI4H)