Enhancing Vision-Language Models with Morphological and Taxonomic Knowledge: Towards Coral Recognition for Ocean Health

Authors

  • Hongyong Han, Beijing University of Posts and Telecommunications
  • Wei Wang, Beijing University of Posts and Telecommunications
  • Gaowei Zhang, Beijing University of Posts and Telecommunications
  • Mingjie Li, Technology Innovation Center for South China Sea Remote Sensing, Surveying and Mapping Collaborative Application, Ministry of Natural Resources; South China Sea Development Research Institute, Ministry of Natural Resources
  • Yi Wang, Beijing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v39i27.35023

Abstract

Coral reefs play a crucial role in marine ecosystems, offering a nutrient-rich environment and safe shelter for numerous marine species. Automated coral image recognition helps monitor ocean health at scale without requiring manual effort from experts. Recently, large vision-language models like CLIP have greatly enhanced zero-shot and low-shot classification capabilities for various visual tasks. However, these models struggle with fine-grained coral-related tasks due to a lack of domain-specific knowledge. To bridge this gap, we compile a fine-grained coral image dataset consisting of 16,659 images with taxonomy labels (from Kingdom to Species), accompanied by morphology-specific text descriptions for each species. Based on this dataset, we propose CORAL-Adapter, which integrates two complementary kinds of coral-specific knowledge (biological taxonomy and coral morphology) with the general knowledge learned by CLIP. CORAL-Adapter is a simple yet powerful extension of CLIP that updates only a few parameters and can be used as a plug-and-play module with various CLIP-based methods. We show improvements in accuracy across diverse coral recognition tasks, e.g., recognizing corals unseen during training that are prone to bleaching or originate from different oceans.
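
To make the idea of combining taxonomy and morphology text with CLIP-style zero-shot classification more concrete, the following is a minimal sketch and not the CORAL-Adapter implementation, whose architecture the abstract does not specify. Every name here (build_prompt, blend, zero_shot_classify, the prompt template, alpha, and the temperature of 0.01) is a hypothetical choice made for illustration. The embeddings are toy vectors standing in for the outputs of a CLIP image and text encoder, and the residual mixing in blend is borrowed from generic adapter designs as an assumption.

```python
import math

# Taxonomic ranks from Kingdom to Species, as described in the abstract.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def build_prompt(taxonomy, morphology):
    """Compose a text prompt from taxonomic ranks and a morphology description.

    The template is hypothetical; the paper's actual prompts may differ.
    """
    lineage = ", ".join(
        f"{rank} {taxonomy[rank]}" for rank in RANKS if rank in taxonomy
    )
    return f"a photo of a coral ({lineage}), which has {morphology}."


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def blend(adapted, original, alpha=0.2):
    """Residual mix of adapted and original features (assumed design):
    alpha * adapted + (1 - alpha) * original."""
    return [alpha * a + (1 - alpha) * o for a, o in zip(adapted, original)]


def zero_shot_classify(image_emb, text_embs, temperature=0.01):
    """Pick the label whose text embedding is most similar to the image.

    text_embs maps label -> embedding. Returns (best_label, probabilities),
    where probabilities is a softmax over temperature-scaled similarities.
    """
    logits = {
        label: cosine(image_emb, emb) / temperature
        for label, emb in text_embs.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - top) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: v / total for label, v in exps.items()}
    return max(probs, key=probs.get), probs
```

In a real pipeline, each prompt returned by build_prompt would be passed through a text encoder to obtain the entries of text_embs, and blend would mix adapter outputs with the frozen CLIP features before classification. The small alpha reflects the intent of keeping most of the general knowledge while adding a limited amount of domain-specific signal.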

Published

2025-04-11

How to Cite

Han, H., Wang, W., Zhang, G., Li, M., & Wang, Y. (2025). Enhancing Vision-Language Models with Morphological and Taxonomic Knowledge: Towards Coral Recognition for Ocean Health. Proceedings of the AAAI Conference on Artificial Intelligence, 39(27), 28052–28060. https://doi.org/10.1609/aaai.v39i27.35023