MR-NET: Exploiting Mutual Relation for Visual Relationship Detection

Authors

  • Yi Bin University of Electronic Science and Technology of China
  • Yang Yang University of Electronic Science and Technology of China
  • Chaofan Tao University of Electronic Science and Technology of China
  • Zi Huang University of Queensland
  • Jingjing Li University of Electronic Science and Technology of China
  • Heng Tao Shen University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v33i01.33018110

Abstract

Inferring the interactions between objects, a.k.a. visual relationship detection, is crucial for visual understanding and captures more definite concepts than object detection. Most previous work treats the interaction between a pair of objects as one-way and thus fails to exploit the mutual relation between objects, which is essential for modern visual applications. In this work, we propose a mutual relation net, dubbed MR-Net, to explore the mutual relation between paired objects for visual relationship detection. Specifically, we construct a mutual relation space to model the mutual interaction of paired objects and employ a linear constraint to optimize this interaction, a procedure we call mutual relation learning. Mutual relation learning introduces no additional parameters and can be adapted to improve the performance of other methods. In addition, we devise a semantic ranking loss that discriminatively penalizes predicates according to their semantic similarity, which traditional loss functions (e.g., softmax cross entropy) ignore. MR-Net then optimizes mutual relation learning jointly with the semantic ranking loss using a siamese network. Experimental results on two commonly used datasets (VG and VRD) demonstrate the superior performance of the proposed approach.
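To illustrate the semantic ranking idea described in the abstract, below is a minimal PyTorch sketch of a margin-based ranking loss whose margin shrinks for predicates that are semantically close to the ground truth. The function name, the use of cosine similarity over predicate word embeddings, the margin value, and all tensor shapes are assumptions made for illustration, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def semantic_ranking_loss(scores, target, predicate_emb, margin=0.2):
    """Hedged sketch of a semantic ranking loss (names, shapes, and the use of
    word-embedding similarity are assumptions, not the paper's exact loss).

    scores:        (B, P) predicate scores from the relationship classifier
    target:        (B,)   indices of the ground-truth predicates
    predicate_emb: (P, D) word embeddings of the P predicate labels
    """
    # Cosine similarity between every predicate and the ground-truth predicate.
    emb = F.normalize(predicate_emb, dim=-1)
    sim_to_gt = (emb @ emb.t())[target]                    # (B, P)

    gt_score = scores.gather(1, target.unsqueeze(1))       # (B, 1)

    # The ground-truth score must beat every other predicate's score by a
    # margin that shrinks when that predicate is semantically close to it,
    # so near-synonyms (e.g., "on" vs. "above") are penalized less harshly.
    adaptive_margin = margin * (1.0 - sim_to_gt)           # (B, P)
    violation = F.relu(scores - gt_score + adaptive_margin)

    # Exclude the ground-truth predicate from its own ranking constraint.
    mask = torch.ones_like(violation).scatter_(1, target.unsqueeze(1), 0.0)
    return (violation * mask).mean()

# Example usage with random data: 4 object pairs, 70 predicates, 300-d embeddings.
loss = semantic_ranking_loss(torch.randn(4, 70), torch.tensor([3, 0, 12, 7]), torch.randn(70, 300))
```

The margin of 0.2 and the GloVe-style predicate embeddings are placeholder choices; the actual hyperparameters and similarity measure would follow the paper.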

Published

2019-07-17

How to Cite

Bin, Y., Yang, Y., Tao, C., Huang, Z., Li, J., & Shen, H. T. (2019). MR-NET: Exploiting Mutual Relation for Visual Relationship Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 8110-8117. https://doi.org/10.1609/aaai.v33i01.33018110

Section

AAAI Technical Track: Vision