Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval

Authors

  • Yue Zhuang Zhejiang University
  • Yan Wang Zhejiang University
  • Fei Wu Zhejiang University
  • Yin Zhang Zhejiang University
  • Wei Lu Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v27i1.8603

Keywords:

Multi-modal Retrieval, Cross-media Retrieval, Dictionary Learning, Supervised Learning

Abstract

A good similarity mapping function across heterogeneous high-dimensional features is highly desirable for many applications involving multi-modal data. In this paper, we introduce coupled dictionary learning (DL) into supervised sparse coding for multi-modal (cross-media) retrieval. We call the resulting approach Supervised coupled dictionary learning with group structures for Multi-Modal retrieval (SliM2). SliM2 formulates multi-modal mapping as a constrained dictionary learning problem. By exploiting the ability of DL to handle heterogeneous features, SliM2 extends unimodal DL to multi-modal DL. Moreover, SliM2 uses label information to discover the structure shared within each modality by samples of the same class, enforced through a mixed norm (i.e., the `l1/l2`-norm). As a result, multi-modal retrieval is conducted via a set of jointly learned mapping functions across multi-modal data. Experimental results show the effectiveness of the proposed model when applied to cross-media retrieval.
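
To make the mixed norm mentioned in the abstract concrete, the following is a minimal sketch of one common form of an `l1/l2`-norm over class groups. The abstract does not spell out the exact grouping, so the layout here is an assumption: `codes` is a hypothetical matrix of sparse coefficients (one row per dictionary atom, one column per sample) and `labels` gives each sample's class. The function name `l1_l2_norm` is illustrative only and is not taken from the paper.

```python
import math

def l1_l2_norm(codes, labels):
    """Mixed l1/l2 norm with class-based groups (illustrative assumption).

    For each class and each dictionary atom (row), take the l2 norm of that
    atom's coefficients over the samples of the class, then sum all of these
    l2 norms (the outer l1 part).
    """
    total = 0.0
    for c in sorted(set(labels)):
        # Column indices of the samples belonging to class c
        cols = [j for j, y in enumerate(labels) if y == c]
        for row in codes:
            # l2 norm of one atom's coefficients within the class group
            total += math.sqrt(sum(row[j] ** 2 for j in cols))
    return total

# Toy data: 2 atoms, 3 samples; the first two samples share class 0.
codes = [[3.0, 4.0, 0.0],
         [0.0, 0.0, 2.0]]
labels = [0, 0, 1]
value = l1_l2_norm(codes, labels)
```

Penalizing a norm of this kind tends to zero out an atom's coefficients for a whole class at once, which is one way samples of the same class can be encouraged to share the same dictionary atoms.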

Published

2013-06-30

How to Cite

Zhuang, Y., Wang, Y., Wu, F., Zhang, Y., & Lu, W. (2013). Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 1070-1076. https://doi.org/10.1609/aaai.v27i1.8603