GIRNet: Interleaved Multi-Task Recurrent State Sequence Models


  • Divam Gupta, Indian Institute of Technology Delhi
  • Tanmoy Chakraborty, Indraprastha Institute of Information Technology Delhi
  • Soumen Chakrabarti, Indian Institute of Technology Bombay



In several natural language tasks, labeled sequences are available in separate domains (say, languages), but the goal is to label sequences drawn from mixed domains (such as code-switched text). Or, we may have models available for labeling whole passages (say, with sentiments), which we would like to exploit toward better position-specific label inference (say, target-dependent sentiment annotation). A key characteristic shared across such tasks is that different positions in a primary instance can benefit from different ‘experts’ trained from auxiliary data, but labeled primary instances are scarce, and labeling the best expert for each position entails unacceptable cognitive burden. We propose GIRNet, a unified position-sensitive multi-task recurrent neural network (RNN) architecture for such applications. Auxiliary and primary tasks need not share training instances. Auxiliary RNNs are trained over auxiliary instances. A primary instance is also submitted to each auxiliary RNN, but their state sequences are gated and merged into a novel composite state sequence tailored to the primary inference task. Our approach is in sharp contrast to recent multi-task networks like the cross-stitch and sluice networks, which do not control state transfer at such fine granularity. We demonstrate the superiority of GIRNet using three applications: sentiment classification of code-switched passages, part-of-speech tagging of code-switched text, and target position-sensitive annotation of sentiment in monolingual passages. In all cases, we establish new state-of-the-art performance beyond recent competitive baselines.
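The core idea described above — running a primary instance through each auxiliary RNN and blending their state sequences with per-position gates — can be sketched as follows. This is a minimal illustrative sketch only: the shapes, the softmax gate, and the use of the primary RNN's states to drive the gating are assumptions for exposition, not the paper's exact equations.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8  # sequence length and hidden size (illustrative choices)

# State sequences produced by two pre-trained auxiliary RNNs
# run over the same primary instance (stand-in random values).
H_aux1 = rng.standard_normal((T, d))
H_aux2 = rng.standard_normal((T, d))

# Primary RNN states, assumed here to control the gates.
H_prim = rng.standard_normal((T, d))

# Hypothetical gate parameters: one logit per auxiliary expert.
W_gate = rng.standard_normal((d, 2))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Per-position weights over the two experts: shape (T, 2).
gates = softmax(H_prim @ W_gate)

# Composite state sequence: at each position, a convex combination
# of the two auxiliary states, tailored to the primary task.
composite = gates[:, :1] * H_aux1 + gates[:, 1:] * H_aux2

print(composite.shape)  # (5, 8)
```

The point of the sketch is the granularity: the mixture weights differ at every position, so (for code-switched text, say) a Hindi expert can dominate at Hindi tokens and an English expert at English tokens, which coarser schemes like cross-stitch units do not do per position.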




How to Cite

Gupta, D., Chakraborty, T., & Chakrabarti, S. (2019). GIRNet: Interleaved Multi-Task Recurrent State Sequence Models. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 6497-6504.



AAAI Technical Track: Natural Language Processing