Fine-Tuning Large Language Models for Structured Clinical Report Generation Using GRPO

Authors

  • Uday Devulapalli, Western University, London, ON, Canada; International Center for Applied Systems Science for Sustainable Development (ICASSSD), Cambridge, ON, Canada
  • Aarat Satsangi, Western University, London, ON, Canada; International Center for Applied Systems Science for Sustainable Development (ICASSSD), Cambridge, ON, Canada
  • Apurva Narayan, Western University, London, ON, Canada

DOI:

https://doi.org/10.1609/aaaiss.v7i1.36923

Abstract

The generation of structured medical reports using large language models (LLMs) presents unique challenges, particularly in maintaining clinical relevance and adhering to strict formatting requirements. In this work, we investigate the effectiveness of fine-tuning DeepSeek-R1 models for structured report generation. We conduct experiments with two model variants: DeepSeek-R1 8B and DeepSeek-R1 14B. For both models, we apply Group Relative Policy Optimization (GRPO) using the Medical Information Mart for Intensive Care (MIMIC-IV) dataset, leveraging Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. Our results show that the GRPO fine-tuned DeepSeek-R1 8B and 14B models outperformed all baseline models, including the larger DeepSeek-R1 32B model, demonstrating the effectiveness of parameter-efficient tuning. These findings underscore the potential of reinforcement learning-based fine-tuning of LLMs for generating structured reports in the medical domain.

Published

2025-11-23

How to Cite

Devulapalli, U., Satsangi, A., & Narayan, A. (2025). Fine-Tuning Large Language Models for Structured Clinical Report Generation Using GRPO. Proceedings of the AAAI Symposium Series, 7(1), 496–500. https://doi.org/10.1609/aaaiss.v7i1.36923

Section

Safe, Ethical, Certified, Uncertainty-aware, Robust, and Explainable AI for Health (SECURE-AI4H)