Semantic Alignment of Malicious Question Based on Contrastive Semantic Networks and Data Augmentation (Abstract Reprint)

Authors

  • Xinyan Wang, School of Cyber Science and Engineering, Wuhan University
  • Jinshuo Liu, School of Cyber Science and Engineering, Wuhan University
  • Juan Deng, School of Cyber Science and Engineering, Wuhan University
  • Meng Wang, School of Cyber Science and Engineering, Wuhan University
  • Qian Deng, School of Cyber Science and Engineering, Wuhan University
  • Youcheng Yan, School of Cyber Science and Engineering, Wuhan University
  • Lina Wang, School of Cyber Science and Engineering, Wuhan University
  • Yunsong Ma, School of Computer Science, University of Sydney
  • Jeff Z. Pan, The University of Edinburgh, Edinburgh

DOI:

https://doi.org/10.1609/aaai.v40i47.41418

Abstract

Identifying and filtering malicious texts on social media is a significant technical challenge in protecting users from online violence and disinformation. The difficulty stems from the diversity and novelty of social media texts, which include unique expressions and unusual sentence structures. In particular, malicious texts in interrogative form are hard to align with traditional corpora because existing methods fail to exploit the texts’ deep global semantic representations. The problem is compounded by the scant research on Chinese texts, which limits recognition accuracy. To address these challenges, we introduce a framework based on a Global Contrastive Semantic Network (GCSN), designed to improve the efficiency and accuracy of malicious text recognition by learning global semantic knowledge. It comprises an encoder that models global semantic information and a graph-matching network that evaluates semantic similarity between question pairs, enabling accurate identification and filtering of malicious texts with complex structures. Furthermore, we introduce a semantic consistency-based data augmentation method (COMBINE), which uses real-world data to generate balanced positive and negative samples, enriching the dataset and improving the model’s ability to distinguish semantic consistency through contrastive learning. Experimental validation on two Chinese datasets demonstrates our model’s exceptional performance, affirming its application value in social media malicious text recognition. Our code is available at https://github.com/Wxy13131313131/GCSN-COMBINE
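
As a rough illustration of the contrastive idea over question pairs described in the abstract, the sketch below computes a generic margin-based contrastive loss on precomputed embeddings. It is not the authors' GCSN or COMBINE implementation: the function names (`cosine_similarity`, `contrastive_pair_loss`, `batch_loss`), the cosine distance, the `margin` value, and the toy two-dimensional vectors are all assumptions made for illustration. The actual encoder and graph-matching network are in the linked repository.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def contrastive_pair_loss(u, v, label, margin=0.5):
    """Generic margin-based contrastive loss for one pair (hypothetical helper).

    label == 1: the pair is semantically consistent, so distance is penalized.
    label == 0: the pair is inconsistent, so only distances below `margin`
    are penalized.
    """
    distance = 1.0 - cosine_similarity(u, v)
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

def batch_loss(pairs, margin=0.5):
    """Mean loss over (embedding_a, embedding_b, label) triples."""
    return sum(contrastive_pair_loss(u, v, y, margin) for u, v, y in pairs) / len(pairs)

# Toy embeddings standing in for encoder outputs (assumed, not real data).
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1),  # consistent pair, identical vectors
    ([1.0, 0.0], [0.0, 1.0], 0),  # inconsistent pair, orthogonal vectors
]
print(batch_loss(pairs))
```

A balanced set of positive and negative pairs, such as the augmentation step aims to produce, keeps either branch of the loss from dominating the average.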

Published

2026-03-14

How to Cite

Wang, X., Liu, J., Deng, J., Wang, M., Deng, Q., Yan, Y., … Pan, J. Z. (2026). Semantic Alignment of Malicious Question Based on Contrastive Semantic Networks and Data Augmentation (Abstract Reprint). Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39903–39903. https://doi.org/10.1609/aaai.v40i47.41418