Pyramid Attention Aggregation Network for Semantic Segmentation of Surgical Instruments

Authors

  • Zhen-Liang Ni, University of Chinese Academy of Sciences
  • Gui-Bin Bian, University of Chinese Academy of Sciences
  • Guan-An Wang, University of Chinese Academy of Sciences
  • Xiao-Hu Zhou, Chinese Academy of Sciences
  • Zeng-Guang Hou, University of Chinese Academy of Sciences
  • Hua-Bin Chen, University of Chinese Academy of Sciences
  • Xiao-Liang Xie, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v34i07.6850

Abstract

Semantic segmentation of surgical instruments plays a critical role in computer-assisted surgery. However, specular reflection and scale variation of instruments are common in the surgical environment and alter the visual features of instruments, such as color and shape. These issues make semantic segmentation of surgical instruments more challenging. In this paper, a novel network, the Pyramid Attention Aggregation Network, is proposed to aggregate multi-scale attentive features for surgical instruments. It contains two critical modules: the Double Attention Module and the Pyramid Upsampling Module. Specifically, the Double Attention Module includes two attention blocks (i.e., a position attention block and a channel attention block), which model semantic dependencies between positions and channels by capturing joint semantic information and global contexts, respectively. The attentive features generated by the Double Attention Module can distinguish target regions, which helps to address the specular reflection issue. Moreover, the Pyramid Upsampling Module extracts local details and global contexts by aggregating multi-scale attentive features. It learns the shape and size features of surgical instruments in different receptive fields and thus addresses the scale variation issue. The proposed network achieves state-of-the-art performance on various datasets. It achieves a new record of 97.10% mean IoU on Cata7. In addition, it ranks first in the MICCAI EndoVis Challenge 2017 with a 9.90% increase in mean IoU.
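
The results above are reported as mean IoU (intersection over union). As a reading aid, here is a minimal sketch of how that metric is commonly computed for a pair of label maps. It is not the authors' evaluation code: the function name `mean_iou`, the toy label maps, and the choice to skip classes that appear in neither map are assumptions made for illustration, and the abstract does not state the exact averaging protocol used on Cata7 or EndoVis 2017.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Average per-class intersection over union for two 2D label maps.

    Assumption: classes that occur in neither map are skipped rather than
    counted as 0 or 1. Other conventions exist.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel belongs to both the predicted and true region of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel is in the predicted region of p and the true region of t.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]

score = mean_iou(pred, target, num_classes=3)
print(f"{score:.2%}")  # prints 75.00%
```

In this toy case the per-class IoU values are 1/2, 3/4, and 1, so their mean is 0.75. A single mislabeled pixel lowers the score of both the class it was assigned to and the class it should have had.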

Published

2020-04-03

How to Cite

Ni, Z.-L., Bian, G.-B., Wang, G.-A., Zhou, X.-H., Hou, Z.-G., Chen, H.-B., & Xie, X.-L. (2020). Pyramid Attention Aggregation Network for Semantic Segmentation of Surgical Instruments. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 11782-11790. https://doi.org/10.1609/aaai.v34i07.6850

Section

AAAI Technical Track: Vision