TextBoxes: A Fast Text Detector with a Single Deep Neural Network

Authors

  • Minghui Liao, Huazhong University of Science and Technology
  • Baoguang Shi, Huazhong University of Science and Technology
  • Xiang Bai, Huazhong University of Science and Technology
  • Xinggang Wang, Huazhong University of Science and Technology
  • Wenyu Liu, Huazhong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v31i1.11196

Keywords:

scene text, convolutional neural network, text localization, word spotting, end-to-end recognition

Abstract

This paper presents TextBoxes, a fast, end-to-end trainable scene text detector that detects scene text with both high accuracy and efficiency in a single network forward pass, with no post-processing other than standard non-maximum suppression. TextBoxes outperforms competing methods in text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
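
The only post-processing step the abstract names is standard non-maximum suppression. As a rough illustration of that step, the following is a minimal greedy sketch in plain Python. It is not the authors' implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are all assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes kept.

    The threshold value is an assumption for this sketch, not a value
    taken from the paper.
    """
    # Visit boxes from highest to lowest confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either of the others and is kept.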

Published

2017-02-12

How to Cite

Liao, M., Shi, B., Bai, X., Wang, X., & Liu, W. (2017). TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.11196