On the Potential of Large Language Models in ECG-based AFib and Sinus Rhythm Detection and Justification
DOI:
https://doi.org/10.1609/aaaiss.v6i1.36071
Abstract
Atrial fibrillation (AFib) is a common arrhythmia that is associated with increased stroke and mortality risk. It requires early and accurate detection for improved patient healthcare support. This study explores the application of vision-enabled large language models (LLMs), specifically Llama-3.2-11B-Vision-Instruct and Qwen2-VL-7B-Instruct, for AFib and sinus rhythm detection using ECG images. We designed structured prompts to simulate clinical reasoning, evaluate rhythm features, and elicit model confidence. Models were tested on a curated PTB-XL subset under both full 12-lead and dual-lead (Lead II + V1) configurations. Results show that while Llama achieves higher diagnostic accuracy, especially with Chain-of-Thought prompting (up to 97% for AFib), both models struggle with consistent feature-level interpretation, particularly for sinus rhythm. Our findings underscore both the promise and current limitations of LLMs in ECG-based diagnosis. Bridging the gap between AI-generated outputs and clinical standards will require fine-tuning on ECG-specific data, robust prompting strategies, and hybrid approaches that integrate signal-level reasoning for improved interpretability and reliability in real-world settings.
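As an illustration of the structured prompting described in the abstract, the following is a minimal sketch of how such a prompt might be assembled for the two lead configurations, with an optional Chain-of-Thought instruction. The function name `build_prompt`, the feature checklist, and all prompt wording are assumptions made for illustration; the abstract does not give the authors' actual prompts.

```python
# Hypothetical sketch: none of these names or phrasings come from the paper.

# Rhythm features commonly used to separate AFib from sinus rhythm
# (assumed checklist; the paper's own feature list is not in the abstract).
RHYTHM_FEATURES = (
    "P waves present before each QRS complex",
    "R-R interval regularity",
    "fibrillatory baseline activity",
)

# The two input configurations named in the abstract.
LEAD_CONFIGS = {
    "12-lead": "all 12 standard leads",
    "dual-lead": "Lead II and V1 only",
}


def build_prompt(lead_config: str, chain_of_thought: bool = False) -> str:
    """Assemble a structured text prompt to accompany an ECG image."""
    if lead_config not in LEAD_CONFIGS:
        raise ValueError(f"unknown lead configuration: {lead_config!r}")

    lines = [
        f"You are reviewing an ECG image showing {LEAD_CONFIGS[lead_config]}.",
        "Assess each rhythm feature:",
    ]
    lines += [f"- {feature}" for feature in RHYTHM_FEATURES]
    if chain_of_thought:
        # Optional Chain-of-Thought instruction (assumed wording).
        lines.append("Reason step by step through the features before answering.")
    lines.append("Answer with one label: AFib or Sinus Rhythm.")
    lines.append("State your confidence as a number from 0 to 100.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("dual-lead", chain_of_thought=True))
```

The resulting string would be passed, together with the ECG image, to whichever vision-language model is being evaluated; that model call is left out here because it depends on the serving setup.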
Published
2025-08-01
How to Cite
Slim, M., Abbas, C., Assi, J., El Jebbawi, H., El Ghazawi, A., Awad, M., … Zouein, F. (2025). On the Potential of Large Language Models in ECG-based AFib and Sinus Rhythm Detection and Justification. Proceedings of the AAAI Symposium Series, 6(1), 341–349. https://doi.org/10.1609/aaaiss.v6i1.36071
Section
Human-AI Collaboration: Exploring Diversity of Human Cognitive Abilities and Varied AI Models for Hybrid Intelligent Systems