Generation-Focused Table-Based Intermediate Pre-training for Free-Form Question Answering


  • Peng Shi, University of Waterloo
  • Patrick Ng, AWS AI Labs
  • Feng Nan, AWS AI Labs
  • Henghui Zhu, AWS AI Labs
  • Jun Wang, AWS AI Labs
  • Jiarong Jiang, AWS AI Labs
  • Alexander Hanbo Li, AWS AI Labs
  • Rishav Chakravarti, AWS AI Labs
  • Donald Weidner, AWS AI Labs
  • Bing Xiang, AWS AI Labs
  • Zhiguo Wang, AWS AI Labs



Speech & Natural Language Processing (SNLP)


Question answering over semi-structured tables has attracted significant attention in the NLP community. However, most existing work focuses on questions that can be answered with short-form answers, i.e., the answer is often a table cell or an aggregation of multiple cells. This can mismatch the intents of users who want to ask more complex questions that require free-form answers, such as explanations. To bridge this gap, pre-trained sequence-to-sequence language models such as T5 have recently been used to generate free-form answers from the question and table inputs. However, these pre-trained language models have weaker encoding abilities over table cells and schema. To mitigate this issue, we present an intermediate pre-training framework, Generation-focused Table-based Intermediate Pre-training (GENTAP), that jointly learns representations of natural language questions and tables. GENTAP learns to generate via two training objectives that enhance question understanding and table representation for complex questions. In our experiments, models that leverage the GENTAP framework outperform the existing baselines on the FETAQA benchmark. The pre-trained models are useful not only for free-form question answering but also for few-shot data-to-text generation, where they obtain new state-of-the-art results, demonstrating good transfer ability.




How to Cite

Shi, P., Ng, P., Nan, F., Zhu, H., Wang, J., Jiang, J., Li, A. H., Chakravarti, R., Weidner, D., Xiang, B., & Wang, Z. (2022). Generation-Focused Table-Based Intermediate Pre-training for Free-Form Question Answering. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 11312-11320.



AAAI Technical Track on Speech and Natural Language Processing