Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis
DOI:
https://doi.org/10.1609/aaai.v38i7.28586
Keywords:
CV: Representation Learning for Vision, DMKM: Applications
Abstract
Obtaining large-scale radiology reports for medical images can be difficult due to ethical concerns, which limits the effectiveness of contrastive pre-training in the medical image domain and underscores the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows gaze signals to be collected passively and without ethical issues. By tracking radiologists' gaze as they read and diagnose medical images, we can understand their visual attention and clinical reasoning. When a radiologist exhibits similar gaze patterns on two medical images, the images are likely semantically similar for diagnosis, and they should be treated as a positive pair when pre-training a computer-assisted diagnosis (CAD) network through contrastive learning. Accordingly, we introduce Medical contrastive Gaze Image Pre-training (McGIP), a plug-and-play module for contrastive learning frameworks that uses radiologists' gaze to guide contrastive pre-training. We evaluate our method on two representative types of medical images and two common types of gaze data. The experimental results demonstrate the practicality of McGIP and indicate its high potential for various clinical scenarios and applications.
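The core idea, treating images whose gaze patterns are similar as positive pairs in contrastive pre-training, can be illustrated with a minimal NumPy sketch. The cosine-similarity criterion, the threshold, and the InfoNCE-style loss below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def gaze_positive_pairs(heatmaps, threshold=0.8):
    """Hypothetical criterion: mark two images as a positive pair when the
    cosine similarity of their radiologist gaze heatmaps exceeds a threshold."""
    flat = heatmaps.reshape(len(heatmaps), -1)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T
    return sim >= threshold  # boolean mask of gaze-derived positive pairs

def gaze_contrastive_loss(embeddings, pos_mask, temperature=0.1):
    """InfoNCE-style loss where gaze-derived positives replace the usual
    augmentation-based positives (illustrative, not the authors' code)."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = (z @ z.T) / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-similarity
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pos = pos_mask.copy()
    np.fill_diagonal(pos, False)
    # average log-likelihood over each anchor's positive set (0 if empty)
    pos_count = pos.sum(axis=1)
    pos_logprob = np.where(pos, log_prob, 0.0).sum(axis=1)
    per_anchor = np.where(pos_count > 0,
                          pos_logprob / np.maximum(pos_count, 1), 0.0)
    return -per_anchor.mean()
```

In a full pipeline, this mask would plug into an existing contrastive framework (e.g., a SimCLR-like setup) in place of augmentation-defined positives, which is what makes the module plug-and-play.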
Published
2024-03-24
How to Cite
Zhao, Z., Wang, S., Wang, Q., & Shen, D. (2024). Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7543-7551. https://doi.org/10.1609/aaai.v38i7.28586
Section
AAAI Technical Track on Computer Vision VI