SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition

Chengwei Zhang; Yunlu Xu; Zhanzhan Cheng; Shiliang Pu; Yi Niu; Fei Wu; Futai Zou

doi:10.1609/aaai.v35i4.16442

Authors

Chengwei Zhang, Shanghai Jiaotong University
Yunlu Xu, Hikvision Research Institute
Zhanzhan Cheng, Zhejiang University; Hikvision Research Institute
Shiliang Pu, Hikvision Research Institute
Yi Niu, Hikvision Research Institute
Fei Wu, Zhejiang University
Futai Zou, Shanghai Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v35i4.16442

Keywords:

Applications

Abstract

Arbitrary text appearance poses a great challenge in scene text recognition tasks. Existing works mostly handle the problem in terms of shape distortion, including perspective distortion, line curvature, and other style variations. Rectification (i.e., spatial transformers) as a preprocessing stage is one popular approach and has been extensively studied. However, chromatic difficulties in complex scenes have received little attention. In this work, we introduce a new learnable geometric-unrelated rectification, Structure-Preserving Inner Offset Network (SPIN), which allows the color manipulation of source data within the network. This differentiable module can be inserted before any recognition architecture to ease the downstream tasks, giving neural networks the ability to actively transform input intensity rather than only perform spatial rectification. It can also serve as a complementary module to known spatial transformations and work in both independent and collaborative ways with them. Extensive experiments show that the proposed transformation outperforms existing rectification networks and achieves performance comparable to the state of the art.
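To make the idea of a geometry-unrelated, intensity-only transformation concrete, the following is a minimal sketch and not the SPIN architecture itself, whose details the abstract does not give. It assumes a grayscale image with intensities in [0, 1] and applies a pointwise mixture of power-law curves. The function names, the exponents in `gammas`, and the scores in `logits` are all hypothetical; in a trainable setting the scores would presumably be predicted by a small network rather than set by hand.

```python
import math

def softmax(logits):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def intensity_transform(image, logits, gammas):
    """Apply a pointwise mixture of power-law curves to a grayscale image.

    image  : list of rows, each a list of intensities in [0, 1]
    logits : raw mixture scores (hypothetical stand-in for a learned output)
    gammas : exponents of the candidate curves (hypothetical values)
    """
    weights = softmax(logits)
    # Every pixel is mapped independently, so no pixel changes position.
    return [
        [sum(w * (px ** g) for w, g in zip(weights, gammas)) for px in row]
        for row in image
    ]

# Hypothetical low-contrast 2x3 patch and hand-picked parameters.
patch = [[0.10, 0.20, 0.15],
         [0.20, 0.10, 0.25]]
gammas = [0.5, 1.0, 2.0]
logits = [2.0, 0.0, -2.0]  # favors gamma = 0.5, which brightens dark pixels

out = intensity_transform(patch, logits, gammas)
```

Because the mapping acts on each pixel value alone, the spatial layout of the patch is unchanged and only the intensities shift. This is the sense in which such a transform can complement, rather than replace, a spatial rectifier.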
