LORE: Logical Location Regression Network for Table Structure Recognition

Authors

  • Hangdi Xing, Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University
  • Feiyu Gao, DAMO Academy, Alibaba Group, Hangzhou, China
  • Rujiao Long, DAMO Academy, Alibaba Group, Hangzhou, China
  • Jiajun Bu, Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University
  • Qi Zheng, DAMO Academy, Alibaba Group, Hangzhou, China
  • Liangcheng Li, Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University
  • Cong Yao, DAMO Academy, Alibaba Group, Hangzhou, China
  • Zhi Yu, Zhejiang Provincial Key Laboratory of Service Robot, School of Software Technology, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v37i3.25402

Keywords:

CV: Object Detection & Categorization, CV: Scene Analysis & Understanding

Abstract

Table structure recognition (TSR) aims to convert tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes, or by learning to generate the corresponding markup sequences from the table images. However, they either rely on additional heuristic rules to recover the table structures, or require a huge amount of training data and time-consuming sequential decoders. In this paper, we propose an alternative paradigm. We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time combines logical location regression with spatial location regression of table cells. Our proposed LORE is conceptually simpler, easier to train, and more accurate than previous TSR models of other paradigms. Experiments on standard benchmarks demonstrate that LORE consistently outperforms prior methods. Code is available at https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/LORE-TSR.
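
To illustrate why regressing logical locations can remove the need for heuristic post-processing, the following is a minimal sketch, not the authors' implementation. It assumes each cell's logical location is encoded as inclusive start/end row and column indices alongside a pixel bounding box; the names `Cell`, `cells_to_grid`, and `cells_to_html` are hypothetical and chosen for this example only. Under that assumed encoding, the table structure follows directly from the per-cell predictions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    # Spatial location: assumed axis-aligned box (x1, y1, x2, y2) in pixels.
    box: Tuple[float, float, float, float]
    # Logical location: assumed inclusive, zero-based row/column index ranges.
    row_start: int
    row_end: int
    col_start: int
    col_end: int
    text: str = ""


def cells_to_grid(cells: List[Cell]) -> List[List[Optional[int]]]:
    """Map every grid position to the index of the cell that covers it."""
    n_rows = max(c.row_end for c in cells) + 1
    n_cols = max(c.col_end for c in cells) + 1
    grid: List[List[Optional[int]]] = [[None] * n_cols for _ in range(n_rows)]
    for idx, c in enumerate(cells):
        for r in range(c.row_start, c.row_end + 1):
            for k in range(c.col_start, c.col_end + 1):
                grid[r][k] = idx
    return grid


def cells_to_html(cells: List[Cell]) -> str:
    """Emit HTML markup, deriving rowspan/colspan from the index ranges."""
    n_rows = max(c.row_end for c in cells) + 1
    rows: List[List[str]] = [[] for _ in range(n_rows)]
    for c in sorted(cells, key=lambda c: (c.row_start, c.col_start)):
        rowspan = c.row_end - c.row_start + 1
        colspan = c.col_end - c.col_start + 1
        attrs = ""
        if rowspan > 1:
            attrs += f' rowspan="{rowspan}"'
        if colspan > 1:
            attrs += f' colspan="{colspan}"'
        # A spanning cell is written once, in the row where it starts.
        rows[c.row_start].append(f"<td{attrs}>{c.text}</td>")
    body = "".join("<tr>" + "".join(r) + "</tr>" for r in rows)
    return "<table>" + body + "</table>"


# Toy input: a header spanning two columns above two ordinary cells.
example_cells = [
    Cell((0, 0, 100, 20), 0, 0, 0, 1, "Header"),
    Cell((0, 20, 50, 40), 1, 1, 0, 0, "a"),
    Cell((50, 20, 100, 40), 1, 1, 1, 1, "b"),
]
example_grid = cells_to_grid(example_cells)
example_html = cells_to_html(example_cells)
```

In this toy case the grid and the markup are both derived from the four indices per cell, with no adjacency inference and no sequential decoding step. The values in `example_cells` are made up for illustration and are not taken from any benchmark.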

Published

2023-06-26

How to Cite

Xing, H., Gao, F., Long, R., Bu, J., Zheng, Q., Li, L., Yao, C., & Yu, Z. (2023). LORE: Logical Location Regression Network for Table Structure Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 2992-3000. https://doi.org/10.1609/aaai.v37i3.25402

Section

AAAI Technical Track on Computer Vision III