Axis-Aligned Document Dewarping

Authors

  • Chaoyun Wang, National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center of Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University
  • I-Chao Shen, The University of Tokyo
  • Takeo Igarashi, The University of Tokyo
  • Caigui Jiang, National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center of Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University

DOI

https://doi.org/10.1609/aaai.v40i12.37933

Abstract

Document dewarping is crucial for many applications. However, existing learning-based methods rely heavily on supervised regression with annotated data without fully leveraging the inherent geometric properties of physical documents. Our key insight is that a well-dewarped document is defined by its axis-aligned feature lines. This property aligns with the inherent axis-aligned nature of the discrete grid geometry in planar documents. Harnessing this property, we introduce three synergistic contributions: for the training phase, we propose an axis-aligned geometric constraint to enhance document dewarping; for the inference phase, we propose an axis alignment preprocessing strategy to reduce the dewarping difficulty; and for the evaluation phase, we introduce a new metric, Axis-Aligned Distortion (AAD), that not only incorporates geometric meaning and aligns with human visual perception but also demonstrates greater robustness. As a result, our method achieves state-of-the-art performance on multiple existing benchmarks, improving the AAD metric by 18.2% to 34.5%.

Published

2026-03-14

How to Cite

Wang, C., Shen, I.-C., Igarashi, T., & Jiang, C. (2026). Axis-Aligned Document Dewarping. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 9702–9710. https://doi.org/10.1609/aaai.v40i12.37933

Section

AAAI Technical Track on Computer Vision IX