Distributional Semantics Meets Multi-Label Learning

Vivek Gupta; Rahul Wadbude; Nagarajan Natarajan; Harish Karnick; Prateek Jain; Piyush Rai

doi:10.1609/aaai.v33i01.33013747

Authors

Vivek Gupta University of Utah
Rahul Wadbude AlphaGrep
Nagarajan Natarajan Microsoft Research
Harish Karnick Indian Institute of Technology Kanpur
Prateek Jain Microsoft Research
Piyush Rai Indian Institute of Technology Kanpur

DOI:

https://doi.org/10.1609/aaai.v33i01.33013747

Abstract

We present a label-embedding-based approach to large-scale multi-label learning, drawing inspiration from ideas rooted in distributional semantics, specifically the Skip-Gram Negative Sampling (SGNS) approach widely used to learn word embeddings. Besides yielding a highly scalable model for multi-label learning, our approach highlights connections between the label embedding methods commonly used for multi-label learning and the paragraph embedding methods commonly used for learning representations of text data. The framework extends easily to incorporate auxiliary information such as label-label correlations, which is crucial when many training instances are only partially annotated. To facilitate end-to-end learning, we develop a joint learning algorithm that uses efficient gradient-based methods to learn both the embeddings and a regression model that predicts these embeddings for a new input to be annotated. We demonstrate the effectiveness of our approach through an extensive set of experiments on a variety of benchmark datasets, and show that the proposed models compare favorably with state-of-the-art methods for large-scale multi-label learning.
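As a rough illustration of the SGNS idea the abstract refers to, the sketch below computes the standard word-embedding SGNS loss for one positive (target, context) pair and a handful of negative samples. It is not the paper's algorithm: the function name `sgns_pair_loss`, the dimensions, and the random vectors are all hypothetical, and reading the vectors as label or instance embeddings rather than word embeddings is only an assumption about how the idea might carry over to multi-label learning.

```python
import numpy as np


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def sgns_pair_loss(target, context, negatives):
    """Standard SGNS loss for one positive pair (hypothetical helper).

    target:    (d,)   embedding of the centre item
    context:   (d,)   embedding of an item observed with it
    negatives: (k, d) embeddings of k randomly sampled items
    """
    # Reward a high inner product for the observed pair.
    positive_term = np.log(sigmoid(target @ context))
    # Reward low inner products for the sampled negatives.
    negative_term = np.sum(np.log(sigmoid(-(negatives @ target))))
    # Return the negated objective so that lower is better.
    return -(positive_term + negative_term)


# Toy data: dimension and sample count are illustrative assumptions.
rng = np.random.default_rng(0)
d, k = 8, 5
target = rng.normal(size=d)
context = rng.normal(size=d)
negatives = rng.normal(size=(k, d))

loss = sgns_pair_loss(target, context, negatives)
```

Minimising this quantity over many sampled pairs with a gradient-based optimiser is how SGNS embeddings are usually trained. The regression model that maps input features to embeddings, which the abstract also mentions, is not shown here.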
