Pyramid Attention Aggregation Network for Semantic Segmentation of Surgical Instruments
Semantic segmentation of surgical instruments plays a critical role in computer-assisted surgery. However, specular reflection and scale variation of instruments frequently occur in the surgical environment, undesirably altering the visual features of instruments, such as color and shape. These issues make semantic segmentation of surgical instruments more challenging. In this paper, a novel network, the Pyramid Attention Aggregation Network, is proposed to aggregate multi-scale attentive features of surgical instruments. It contains two critical modules: the Double Attention Module and the Pyramid Upsampling Module. Specifically, the Double Attention Module includes two attention blocks (i.e., a position attention block and a channel attention block), which model semantic dependencies between positions and between channels by capturing joint semantic information and global contexts, respectively. The attentive features generated by the Double Attention Module distinguish target regions, helping resolve the specular reflection issue. Moreover, the Pyramid Upsampling Module extracts local details and global contexts by aggregating multi-scale attentive features. It learns the shape and size features of surgical instruments across different receptive fields and thus addresses the scale variation issue. The proposed network achieves state-of-the-art performance on multiple datasets: it sets a new record of 97.10% mean IOU on Cata7, and it ranks first in the MICCAI EndoVis Challenge 2017 with a 9.90% increase in mean IOU.
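The abstract's two modules can be illustrated with a minimal PyTorch sketch. This is an assumption-laden reconstruction, not the authors' implementation: the position and channel attention blocks below follow the standard self-attention pattern the abstract describes (pairwise dependencies over spatial positions and over channels), and `PyramidUpsampling` is a hypothetical fusion of multi-scale features by bilinear upsampling and concatenation. All class names, channel sizes, and layer choices are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionAttention(nn.Module):
    """Sketch of a position attention block: models pairwise
    dependencies between all spatial positions (HW x HW map)."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C//8)
        k = self.key(x).flatten(2)                     # (B, C//8, HW)
        attn = F.softmax(q @ k, dim=-1)                # (B, HW, HW)
        v = self.value(x).flatten(2)                   # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x


class ChannelAttention(nn.Module):
    """Sketch of a channel attention block: models dependencies
    between feature channels via a C x C attention map."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        feat = x.flatten(2)                                     # (B, C, HW)
        attn = F.softmax(feat @ feat.transpose(1, 2), dim=-1)   # (B, C, C)
        out = (attn @ feat).view(b, c, h, w)
        return self.gamma * out + x


class PyramidUpsampling(nn.Module):
    """Hypothetical pyramid upsampling: project multi-scale attentive
    features to a common width, upsample to the finest resolution,
    and fuse them by concatenation + convolution."""
    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        self.projs = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in in_channels_list)
        self.fuse = nn.Conv2d(out_channels * len(in_channels_list),
                              out_channels, 3, padding=1)

    def forward(self, feats):
        # feats[0] is assumed to be the highest-resolution feature map.
        target = feats[0].shape[-2:]
        ups = [F.interpolate(p(f), size=target, mode="bilinear",
                             align_corners=False)
               for p, f in zip(self.projs, feats)]
        return self.fuse(torch.cat(ups, dim=1))
```

The two attention outputs would typically be summed (or concatenated) before pyramid fusion; both blocks preserve the input shape, so they can be dropped into any backbone stage.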