Out of Length Text Recognition with Sub-String Matching

Authors

  • Yongkun Du, School of Computer Science, Fudan University, China
  • Zhineng Chen, School of Computer Science, Fudan University, China
  • Caiyan Jia, School of Computer Science and Technology, Beijing Jiaotong University, China
  • Xieping Gao, Laboratory for Artificial Intelligence and International Communication, Hunan Normal University, China
  • Yu-Gang Jiang, School of Computer Science, Fudan University, China

DOI:

https://doi.org/10.1609/aaai.v39i3.32285

Abstract

Scene Text Recognition (STR) methods have demonstrated robust performance in word-level text recognition. However, in real applications the text image is sometimes long because a single detection contains multiple horizontal words. This creates the need to build long text recognition models from readily available short (i.e., word-level) text datasets, a problem that has received little prior study. In this paper, we term this task Out of Length (OOL) text recognition. We establish the first Long Text Benchmark (LTB) to facilitate the assessment of different methods in long text recognition. We also propose a novel method called OOL Text Recognition with sub-String Matching (SMTR). SMTR comprises two cross-attention-based modules: one encodes a sub-string containing multiple characters into next and previous queries, and the other uses these queries to attend to the image features, matching the sub-string and simultaneously recognizing its next and previous characters. SMTR can recognize text of arbitrary length by iterating this process. To avoid being trapped when recognizing highly similar sub-strings, we introduce a regularization training scheme that compels SMTR to discover subtle differences between similar sub-strings for precise matching. In addition, we propose an inference augmentation strategy to alleviate the confusion caused by identical sub-strings in the same text and to improve overall recognition efficiency. Extensive experimental results show that SMTR, even when trained exclusively on short text, outperforms existing methods on public short text benchmarks and exhibits a clear advantage on LTB.
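
The iterative idea in the abstract (match a sub-string, read off its neighbouring characters, repeat) can be illustrated with a toy sketch. This is not the authors' implementation: the ground-truth string stands in for image features, `toy_match` and `recognize` are hypothetical names, the window size `k` and the way the seed sub-string is obtained are assumptions, and no attention or learning is involved.

```python
from typing import Optional, Tuple

END = None  # hypothetical marker meaning "no character on this side"


def toy_match(text: str, sub: str) -> Tuple[Optional[str], Optional[str]]:
    """Stand-in for the matching module: locate `sub`, return (previous, next) characters.

    A real model would attend to image features; here the known string plays that role.
    """
    i = text.find(sub)  # first occurrence only, so repeated sub-strings are ambiguous
    if i < 0:
        raise ValueError(f"sub-string {sub!r} not found")
    prev_ch = text[i - 1] if i > 0 else END
    j = i + len(sub)
    next_ch = text[j] if j < len(text) else END
    return prev_ch, next_ch


def recognize(text: str, seed: str, k: int = 5, max_steps: int = 1000) -> str:
    """Grow `seed` rightwards, then leftwards, one character per matching step."""
    out = seed
    for _ in range(max_steps):  # extend to the right using the last k characters
        _, nxt = toy_match(text, out[-k:])
        if nxt is END:
            break
        out += nxt
    for _ in range(max_steps):  # extend to the left using the first k characters
        prv, _ = toy_match(text, out[:k])
        if prv is END:
            break
        out = prv + out
    return out


if __name__ == "__main__":
    print(recognize("scene text recognition benchmark", "recog", k=5))
```

Because each step only looks at a fixed-length window, the loop length does not depend on the text length, which is the property that lets a short-text model handle long text. The sketch also shows the failure mode the abstract mentions: if the same `k`-character window occurs twice in `text`, `toy_match` always returns the first occurrence and the loop can cycle until `max_steps` is exhausted.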

Published

2025-04-11

How to Cite

Du, Y., Chen, Z., Jia, C., Gao, X., & Jiang, Y.-G. (2025). Out of Length Text Recognition with Sub-String Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 39(3), 2798–2806. https://doi.org/10.1609/aaai.v39i3.32285

Issue

Vol. 39 No. 3

Section

AAAI Technical Track on Computer Vision II