Fine-Tuning Large Language Models for Structured Clinical Report Generation Using GRPO
DOI:
https://doi.org/10.1609/aaaiss.v7i1.36923
Abstract
The generation of structured medical reports using large language models (LLMs) presents unique challenges, particularly in maintaining clinical relevance and adhering to strict formatting requirements. In this work, we investigate the effectiveness of fine-tuning LLMs for structured report generation using DeepSeek-R1 models. We conduct experiments with two model variants: DeepSeek-R1 8B and DeepSeek-R1 14B. For both models, we apply Group Relative Policy Optimization (GRPO) using the Medical Information Mart for Intensive Care (MIMIC-IV) dataset, leveraging Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. Our results show that the GRPO fine-tuned DeepSeek-R1 8B and 14B models outperform all baseline models, including the larger 32B DeepSeek-R1 model, demonstrating the effectiveness of parameter-efficient tuning. These findings underscore the potential of reinforcement learning-based fine-tuning of LLMs for generating structured reports in the medical domain.
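For readers unfamiliar with GRPO, the sketch below illustrates its central step: several completions are sampled for the same prompt, each is scored, and the scores are normalized within that group to obtain advantages. The abstract does not describe the reward design, so `format_reward` and the section names it checks are hypothetical placeholders, not the authors' method.

```python
import statistics


def format_reward(report: str, required_sections: tuple[str, ...]) -> float:
    """Hypothetical reward: fraction of required section headers found in the report.

    The actual reward used in the paper is not given in the abstract.
    """
    text = report.lower()
    found = sum(1 for section in required_sections if section.lower() in text)
    return found / len(required_sections)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group of completions sampled for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every completion receives the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Assumed section headers, for illustration only.
SECTIONS = ("Findings:", "Impression:")

completions = [
    "Findings: clear lungs. Impression: no acute disease.",
    "Findings: clear lungs.",
    "The patient appears well.",
]
rewards = [format_reward(c, SECTIONS) for c in completions]
advantages = group_relative_advantages(rewards)
```

Completions that satisfy more of the format receive positive advantages and those that satisfy less receive negative ones, which is the signal the policy update then uses; no separate value model is needed for this step.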
Published
2025-11-23
How to Cite
Devulapalli, U., Satsangi, A., & Narayan, A. (2025). Fine-Tuning Large Language Models for Structured Clinical Report Generation Using GRPO. Proceedings of the AAAI Symposium Series, 7(1), 496–500. https://doi.org/10.1609/aaaiss.v7i1.36923
Section
Safe, Ethical, Certified, Uncertainty-aware, Robust, and Explainable AI for Health (SECURE-AI4H)