LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network

Yuchen Su; Zhineng Chen; Zhiwen Shao; Yuning Du; Zhilong Ji; Jinfeng Bai; Yong Zhou; Yu-Gang Jiang

doi:10.1609/aaai.v38i5.28302

Authors

Yuchen Su: Shanghai Collaborative Innovation Center of Intelligent Visual Computing, School of Computer Science, Fudan University; Baidu Inc.
Zhineng Chen: Shanghai Collaborative Innovation Center of Intelligent Visual Computing, School of Computer Science, Fudan University
Zhiwen Shao: China University of Mining and Technology
Yuning Du: Baidu Inc.
Zhilong Ji: Tomorrow Advancing Life
Jinfeng Bai: Tomorrow Advancing Life
Yong Zhou: China University of Mining and Technology
Yu-Gang Jiang: Shanghai Collaborative Innovation Center of Intelligent Visual Computing, School of Computer Science, Fudan University

DOI:

https://doi.org/10.1609/aaai.v38i5.28302

Keywords:

CV: Scene Analysis & Understanding, CV: Object Detection & Categorization

Abstract

Recently, regression-based methods, which predict parameterized text shapes for text localization, have gained popularity in scene text detection. However, existing parameterized text shape methods remain limited in modeling arbitrary-shaped text because they do not exploit text-specific shape information. Moreover, the time consumption of the entire pipeline has been largely overlooked, leading to suboptimal overall inference speed. To address these issues, we first propose a novel parameterized text shape method based on low-rank approximation. Unlike other shape representation methods that employ data-irrelevant parameterization, our approach uses singular value decomposition and reconstructs the text shape from a few eigenvectors learned from labeled text contours. By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation. Next, we propose a dual assignment scheme to accelerate inference. It adopts a sparse assignment branch to speed up inference while providing ample supervision signals for training through a dense assignment branch. Building upon these designs, we implement an accurate and efficient arbitrary-shaped text detector named LRANet. Extensive experiments on several challenging benchmarks demonstrate the superior accuracy and efficiency of LRANet compared to state-of-the-art methods. Code is available at: https://github.com/ychensu/LRANet.git
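
The low-rank shape representation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the code from the LRANet repository: the function names (`learn_basis`, `encode`, `decode`, `make_synthetic_contours`), the synthetic contour family, the 14 points per contour, and the rank of 3 are all assumptions chosen for illustration. The sketch stacks flattened contours into a matrix, takes its singular value decomposition, keeps the leading right singular vectors as a shared basis, and represents each contour by a handful of coefficients over that basis.

```python
import numpy as np


def make_synthetic_contours(m, n, seed=0):
    """Hypothetical stand-in for labeled text contours.

    Returns an (m, n, 2) array: m closed contours with n (x, y) points each,
    drawn from a family of stretched, bent ellipses.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    a = rng.uniform(1.0, 4.0, size=(m, 1))      # horizontal extent
    b = rng.uniform(0.5, 1.5, size=(m, 1))      # vertical extent
    bend = rng.uniform(-0.5, 0.5, size=(m, 1))  # curvature of the shape
    x = a * np.cos(t)
    y = b * np.sin(t) + bend * x ** 2
    return np.stack([x, y], axis=-1)


def learn_basis(contours, rank):
    """Learn `rank` orthonormal basis vectors from an (M, N, 2) contour array."""
    flat = contours.reshape(len(contours), -1)   # (M, 2N) contour matrix
    # Rows of vt are orthonormal directions in contour space, ordered by
    # decreasing singular value.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:rank]                             # (rank, 2N)


def encode(contour, basis):
    """Project one (N, 2) contour onto the basis, giving `rank` coefficients."""
    return basis @ contour.reshape(-1)


def decode(coeffs, basis):
    """Reconstruct an (N, 2) contour from its coefficients."""
    return (coeffs @ basis).reshape(-1, 2)


if __name__ == "__main__":
    train = make_synthetic_contours(200, 14)
    basis = learn_basis(train, rank=3)
    unseen = make_synthetic_contours(1, 14, seed=1)[0]
    coeffs = encode(unseen, basis)
    error = np.max(np.abs(decode(coeffs, basis) - unseen))
    print("coefficients per contour:", coeffs.shape[0])
    print("max reconstruction error:", error)
```

In this toy setting each contour of 28 coordinates is described by 3 coefficients, because the synthetic family varies along only three directions. Real text contours are not exactly low-rank, so the number of retained vectors becomes a trade-off between compactness and reconstruction accuracy.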
