Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training

Authors

  • Yuting Ning, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Zhenya Huang, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Xin Lin, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Enhong Chen, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Shiwei Tong, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Zheng Gong, University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence
  • Shijin Wang, State Key Laboratory of Cognitive Intelligence; iFLYTEK AI Research (Central China), iFLYTEK Co., Ltd.

DOI:

https://doi.org/10.1609/aaai.v37i11.26573

Keywords:

SNLP: Applications, APP: Education

Abstract

Understanding mathematical questions effectively is a crucial task that can benefit many applications, such as difficulty estimation. Due to the scarcity of human annotations (e.g., difficulty labels), researchers have paid much attention to designing pre-training models for question representations. However, unlike general free-format texts (e.g., user comments), mathematical questions are generally designed with explicit purposes and mathematical logic, and usually contain more complex content, such as formulas and related mathematical knowledge concepts (e.g., Function). As a result, the problem of holistically representing mathematical questions remains underexplored. To this end, in this paper, we propose QuesCo, a novel contrastive pre-training approach for mathematical question representations, which attempts to bring questions with similar purposes closer together. Specifically, we first design two-level question augmentations, at the content level and the structure level, which generate literally diverse question pairs with similar purposes. Then, to fully exploit the hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware rank strategy (KHAR), which ranks the similarities between questions in a fine-grained manner. Next, we adopt a ranking contrastive learning task to optimize our model based on the augmented and ranked questions. We conduct extensive experiments on two real-world mathematical datasets, and the experimental results demonstrate the effectiveness of our model.
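To make the ranking contrastive objective described above concrete, here is a minimal PyTorch sketch of a knowledge-hierarchy-aware ranking contrastive loss. This is not the authors' released code: the concept-path representation, the shared-prefix rank rule, and the temperature value are illustrative assumptions, and the paper defines KHAR and the training objective precisely.

```python
# Minimal sketch (illustrative, not QuesCo's official implementation) of a
# hierarchy-aware ranking contrastive loss. Assumptions: each question is
# tagged with a knowledge-concept path such as ["Math", "Function",
# "Quadratic"], and similarity rank is derived from the shared path prefix.
import torch
import torch.nn.functional as F


def hierarchy_rank(path_a, path_b):
    """Rank two questions by the depth of their deepest shared knowledge
    concept: a longer shared prefix yields a smaller rank (more similar)."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return max(len(path_a), len(path_b)) - shared


def ranking_contrastive_loss(embeddings, ranks, temperature=0.1):
    """For each anchor, questions at rank r act as positives against every
    question at a strictly larger rank, so closer ranks are pulled closer
    in embedding space.

    embeddings: (N, d) question representations
    ranks: (N, N) integer matrix, ranks[i, j] = hierarchy rank of pair (i, j)
    """
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / temperature                 # temperature-scaled cosine
    n = sim.size(0)
    eye = torch.eye(n, dtype=torch.bool)
    loss, terms = 0.0, 0
    for i in range(n):
        for r in ranks[i].unique():
            pos = (ranks[i] == r) & ~eye[i]       # positives at rank r
            neg = ranks[i] > r                    # strictly less similar
            if pos.any() and neg.any():
                logits = torch.cat([sim[i, pos], sim[i, neg]])
                log_prob = sim[i, pos] - torch.logsumexp(logits, dim=0)
                loss = loss - log_prob.mean()
                terms += 1
    return loss / max(terms, 1)
```

Under this sketch, an anchor's augmented views would occupy the smallest rank, so the loss pulls augmentations closest, then questions sharing fine-grained concepts, then those sharing only coarser ancestors, mirroring the fine-grained similarity ranking the abstract attributes to KHAR.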

Published

2023-06-26

How to Cite

Ning, Y., Huang, Z., Lin, X., Chen, E., Tong, S., Gong, Z., & Wang, S. (2023). Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13409-13418. https://doi.org/10.1609/aaai.v37i11.26573

Section

AAAI Technical Track on Speech & Natural Language Processing