SpaceGTN: A Time-Agnostic Graph Transformer Network for Handwritten Diagram Recognition and Segmentation

Authors

  • Haoxiang Hu Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Cangjun Gao Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Yaokun Li Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Xiaoming Deng Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • YuKun Lai Cardiff University
  • Cuixia Ma Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Yong-Jin Liu Tsinghua University
  • Hongan Wang Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v38i3.27994

Keywords:

CV: Object Detection & Categorization, CV: Segmentation

Abstract

Online handwriting recognition is pivotal in domains such as note-taking, education, healthcare, and office tasks. Existing diagram recognition algorithms rely mainly on the temporal information of strokes, so their recognition performance declines on notes that have been modified or that carry no temporal information. Moreover, current datasets are drawn from templates and do not reflect real free-drawing conditions. To address these challenges, we present SpaceGTN, a time-agnostic Graph Transformer Network that leverages spatial integration and removes the need for temporal data. Extensive experiments on multiple datasets demonstrate that our method consistently outperforms existing methods and achieves state-of-the-art performance. We also propose a pipeline that connects offline and online handwritten diagrams: by integrating a stroke restoration technique with SpaceGTN, it enables stroke-level intelligent editing of previously uneditable offline diagrams. In addition, we release the first online handwritten diagram dataset, OHSD, which is collected using a free-drawing method and comes with modification annotations.
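
To make "time-agnostic" concrete, the sketch below builds a stroke graph from spatial proximity alone, so the result does not change when strokes are reordered by later edits. It is an illustration only and not the SpaceGTN construction, which the abstract does not specify: the nearest-neighbour rule, the function names `min_distance` and `spatial_edges`, the parameter `k`, and the toy strokes are all assumptions.

```python
import math

# A stroke is a tuple of (x, y) points. Timestamps and drawing order are
# deliberately absent: only geometry is used. (Assumed representation.)
Point = tuple[float, float]
Stroke = tuple[Point, ...]


def min_distance(a: Stroke, b: Stroke) -> float:
    """Smallest Euclidean distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def spatial_edges(strokes: list[Stroke], k: int = 2) -> set[frozenset]:
    """Link every stroke to its k spatially nearest strokes.

    Edges are unordered pairs of the strokes themselves (not list indices),
    and ties are broken by the stroke's coordinates, so the returned set is
    independent of the order in which strokes appear in the input list.
    """
    edges = set()
    for s in strokes:
        others = [t for t in strokes if t != s]
        nearest = sorted(others, key=lambda t: (min_distance(s, t), t))[:k]
        for t in nearest:
            edges.add(frozenset((s, t)))
    return edges


# Hypothetical toy diagram: a box, an arrow leaving it, a circle, a label.
box: Stroke = ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0))
arrow: Stroke = ((1.1, 0.5), (2.9, 0.5))
circle: Stroke = ((3.0, 0.5), (3.5, 1.0), (4.0, 0.5), (3.5, 0.0), (3.0, 0.5))
label: Stroke = ((0.2, 0.4), (0.8, 0.6))

drawn_order = [box, arrow, circle, label]
edited_order = [label, circle, box, arrow]  # same strokes, different sequence

graph_a = spatial_edges(drawn_order, k=1)
graph_b = spatial_edges(edited_order, k=1)
print(graph_a == graph_b)
```

Because neighbours are chosen by distance rather than by position in the stroke sequence, the two orderings yield the same edge set. A method that relies on temporal adjacency would instead see different neighbours after an edit, which is the failure mode the abstract describes.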

Published

2024-03-24

How to Cite

Hu, H., Gao, C., Li, Y., Deng, X., Lai, Y., Ma, C., Liu, Y.-J., & Wang, H. (2024). SpaceGTN: A Time-Agnostic Graph Transformer Network for Handwritten Diagram Recognition and Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2211-2219. https://doi.org/10.1609/aaai.v38i3.27994

Issue

Section

AAAI Technical Track on Computer Vision II