Exploiting Sentence Embedding for Medical Question Answering


  • Yu Hao, Tsinghua University
  • Xien Liu, Tsinghua University
  • Ji Wu, Tsinghua University
  • Ping Lv, Tsinghua University




Despite the great success of word embedding, sentence embedding is still not a well-solved problem. In this paper, we present a supervised learning framework that exploits sentence embedding for the medical question answering task. The framework consists of two main parts: 1) a sentence embedding module, and 2) a scoring module. The former uses contextual self-attention and multi-scale techniques to encode a sentence into an embedding tensor; we call this module Contextual self-Attention Multi-scale Sentence Embedding (CAMSE). The latter employs two scoring strategies: Semantic Matching Scoring (SMS) and Semantic Association Scoring (SAS). SMS measures the similarity, and SAS captures the association, between the two sentences of a pair: a medical question concatenated with a candidate choice, and a piece of corresponding supportive evidence. The proposed framework is evaluated on two Medical Question Answering (MedicalQA) datasets collected from real-world applications: medical exams and clinical diagnosis based on electronic medical records (EMR). The results show that the proposed framework achieves significant improvements over competitive baseline approaches. In addition, a series of controlled experiments illustrates that the multi-scale strategy and the contextual self-attention layer play important roles in producing effective sentence embeddings, and that the two scoring strategies are highly complementary for question answering problems.
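The two-part structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so every specific here is an assumption made for illustration. Sliding-window averaging stands in for the multi-scale step, attention pooling against a per-scale query vector stands in for contextual self-attention, cosine similarity stands in for SMS, and a single logistic unit over the concatenated embeddings stands in for SAS. All function names (embed_sentence, sms_score, sas_score, and the helpers) are hypothetical, and the random vectors merely replace trained word embeddings and parameters.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    # Element-wise mean of equally sized vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def ngram_features(token_vecs, scale):
    # Assumed multi-scale step: average every window of `scale` tokens.
    if len(token_vecs) <= scale:
        return [mean_vector(token_vecs)]
    last_start = len(token_vecs) - scale
    return [mean_vector(token_vecs[i:i + scale]) for i in range(last_start + 1)]


def attention_pool(features, query):
    # Assumed attention step: weight each feature by softmax(feature . query).
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    return [sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(len(query))]


def embed_sentence(token_vecs, queries):
    # Embedding tensor: one pooled vector per scale (scale 1, 2, 3, ...).
    return [attention_pool(ngram_features(token_vecs, scale), query)
            for scale, query in enumerate(queries, start=1)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def sms_score(emb_a, emb_b):
    # Assumed matching score: mean cosine similarity across scales.
    return sum(cosine(u, v) for u, v in zip(emb_a, emb_b)) / len(emb_a)


def sas_score(emb_a, emb_b, weights):
    # Assumed association score: logistic unit over both flattened embeddings.
    flat = [x for vec in emb_a + emb_b for x in vec]
    z = sum(w * x for w, x in zip(weights, flat))
    return 1.0 / (1.0 + math.exp(-z))


def random_vectors(count, dim, rng):
    # Placeholder for trained word embeddings or parameters.
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
dim, scales = 4, 3
queries = random_vectors(scales, dim, rng)
question_choice = random_vectors(6, dim, rng)  # question + candidate choice
evidence = random_vectors(9, dim, rng)         # supportive evidence
emb_q = embed_sentence(question_choice, queries)
emb_e = embed_sentence(evidence, queries)
sas_weights = [rng.uniform(-0.5, 0.5) for _ in range(2 * scales * dim)]
print(sms_score(emb_q, emb_e), sas_score(emb_q, emb_e, sas_weights))
```

In this sketch each candidate choice would be scored against its evidence with both functions. How the two scores are combined and trained is left out, since the abstract states only that they are complementary.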




How to Cite

Hao, Y., Liu, X., Wu, J., & Lv, P. (2019). Exploiting Sentence Embedding for Medical Question Answering. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 938-945. https://doi.org/10.1609/aaai.v33i01.3301938



AAAI Technical Track: Applications