Fine-Tuning Large Language Models for Structured Clinical Report Generation Using GRPO

Uday Devulapalli; Aarat Satsangi; Apurva Narayan

doi:10.1609/aaaiss.v7i1.36923

Authors

Uday Devulapalli: Western University, London, ON, Canada; International Center for Applied Systems Science for Sustainable Development (ICASSSD), Cambridge, ON, Canada
Aarat Satsangi: Western University, London, ON, Canada; International Center for Applied Systems Science for Sustainable Development (ICASSSD), Cambridge, ON, Canada
Apurva Narayan: Western University, London, ON, Canada

DOI:

https://doi.org/10.1609/aaaiss.v7i1.36923

Abstract

Generating structured medical reports with large language models (LLMs) presents unique challenges, particularly in maintaining clinical relevance and adhering to strict formatting requirements. In this work, we investigate the effectiveness of fine-tuning DeepSeek-R1 models for structured report generation. We conduct experiments with two model variants, DeepSeek-R1 8B and DeepSeek-R1 14B. Both models are fine-tuned with Group Relative Policy Optimization (GRPO) on the Medical Information Mart for Intensive Care (MIMIC-IV) dataset, using Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. Our results show that the GRPO fine-tuned DeepSeek-R1 8B and 14B models outperform all baseline models, including the larger DeepSeek-R1 32B model, demonstrating the effectiveness of parameter-efficient tuning. These findings underscore the potential of reinforcement learning-based fine-tuning of LLMs for generating structured reports in the medical domain.
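For readers unfamiliar with GRPO, its central idea is to score each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. The sketch below illustrates only that group-relative normalization step. The function name, the example reward values, the use of the population standard deviation, and the `eps` stabilizer are all assumptions made for illustration; the abstract does not describe the reward function or training setup used in the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per completion sampled for a
    single prompt. `eps` (an assumed stabilizer) avoids division by
    zero when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four reports sampled from one prompt,
# e.g. scored on how well each follows a required report format.
rewards = [1.0, 0.5, 0.0, 0.5]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean receive a positive advantage and are reinforced, while those below the mean receive a negative one; a group in which all completions score the same contributes no learning signal.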
