Pairing-free Group-level Knowledge Distillation for Robust Gastrointestinal Lesion Classification in White-Light Endoscopy

Authors

  • Qiang Hu Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology
  • Qimei Wang Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology
  • Yingjie Guo Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology
  • Qiang Li Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology
  • Zhiwei Wang Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i6.42494

Abstract

White-Light Imaging (WLI) is the standard for endoscopic cancer screening, but Narrow-Band Imaging (NBI) offers superior diagnostic detail. A key challenge is transferring knowledge from NBI to enhance WLI-only models, yet existing methods are critically hampered by their reliance on paired NBI-WLI images of the same lesion, a costly and often impractical requirement that leaves vast amounts of clinical data untapped. In this paper, we break this paradigm by introducing PaGKD, a novel Pairing-free Group-level Knowledge Distillation framework that enables effective cross-modal learning from unpaired WLI and NBI data. Instead of forcing alignment between individual, often semantically mismatched image instances, PaGKD operates at the group level to distill more complete and compatible knowledge across modalities. Central to PaGKD are two complementary modules: (1) Group-level Prototype Distillation (GKD-Pro) distills compact group representations by extracting modality-invariant semantic prototypes via shared lesion-aware queries; (2) Group-level Dense Distillation (GKD-Den) performs dense cross-modal alignment by guiding group-aware attention with activation-derived relation maps. Together, these modules enforce global semantic consistency and local structural coherence without requiring image-level correspondence. Extensive experiments on four clinical datasets demonstrate that PaGKD consistently and significantly outperforms state-of-the-art methods, boosting AUC by 3.3%, 1.1%, 2.8%, and 3.2%, respectively, establishing a new direction for cross-modal learning from unpaired data.
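To make the group-level idea concrete, here is a minimal sketch of what pairing-free prototype distillation could look like. This is an illustrative assumption, not the paper's implementation: shared queries attend over an unpaired group of embeddings from each modality, producing per-modality prototypes that are then aligned, so no image-level correspondence is ever needed. All names (`group_prototypes`, `prototype_distillation_loss`) and the softmax-attention pooling are hypothetical choices for illustration.

```python
import numpy as np

def group_prototypes(features: np.ndarray, queries: np.ndarray) -> np.ndarray:
    """Pool a group of image embeddings into K prototypes via shared queries.

    features: (N, D) embeddings of one modality's group (WLI or NBI).
    queries:  (K, D) lesion-aware queries shared across both modalities.
    Returns:  (K, D) prototypes, one per query.
    """
    d = features.shape[1]
    # Scaled dot-product attention: each query attends over all group members.
    logits = queries @ features.T / np.sqrt(d)          # (K, N)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over the group
    return weights @ features                           # (K, D)

def prototype_distillation_loss(wli_feats: np.ndarray,
                                nbi_feats: np.ndarray,
                                queries: np.ndarray) -> float:
    """MSE between prototypes of two unpaired groups (group sizes may differ)."""
    p_wli = group_prototypes(wli_feats, queries)
    p_nbi = group_prototypes(nbi_feats, queries)
    return float(np.mean((p_wli - p_nbi) ** 2))

rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 16))       # 4 shared lesion-aware queries
wli = rng.normal(size=(8, 16))           # 8 WLI embeddings in the group
nbi = rng.normal(size=(5, 16))           # 5 NBI embeddings; no pairing with WLI
loss = prototype_distillation_loss(wli, nbi, queries)
```

Because both modalities are pooled through the same queries, the prototypes live in a comparable space even when the two groups contain different lesions and different numbers of images.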

Published

2026-03-14

How to Cite

Hu, Q., Wang, Q., Guo, Y., Li, Q., & Wang, Z. (2026). Pairing-free Group-level Knowledge Distillation for Robust Gastrointestinal Lesion Classification in White-Light Endoscopy. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4905–4913. https://doi.org/10.1609/aaai.v40i6.42494

Section

AAAI Technical Track on Computer Vision III