RFL: Simplifying Chemical Structure Recognition with Ring-Free Language

Authors

  • Qikai Chang NERC-SLIP, University of Science and Technology of China
  • Mingjun Chen NERC-SLIP, University of Science and Technology of China
  • Changpeng Pi iFLYTEK Research
  • Pengfei Hu NERC-SLIP, University of Science and Technology of China
  • Zhenrong Zhang NERC-SLIP, University of Science and Technology of China
  • Jiefeng Ma NERC-SLIP, University of Science and Technology of China
  • Jun Du NERC-SLIP, University of Science and Technology of China
  • Baocai Yin iFLYTEK Research
  • Jinshui Hu iFLYTEK Research

DOI:

https://doi.org/10.1609/aaai.v39i2.32197

Abstract

The primary objective of Optical Chemical Structure Recognition (OCSR) is to convert chemical structure images into corresponding markup sequences. However, the complex two-dimensional structures of molecules, particularly those with rings and multiple branches, make it difficult for current end-to-end methods to learn one-dimensional markup directly. To overcome this limitation, we propose a novel Ring-Free Language (RFL), which uses a divide-and-conquer strategy to describe chemical structures hierarchically. RFL decomposes complex molecular structures into multiple parts, ensuring both uniqueness and conciseness while enhancing readability, and thereby significantly reduces the learning difficulty for recognition models. Leveraging RFL, we propose a universal Molecular Skeleton Decoder (MSD), which comprises a skeleton generation module that progressively predicts the molecular skeleton and individual rings, along with a branch classification module that predicts branch information. Experimental results demonstrate that the proposed RFL and MSD can be applied to various mainstream methods, achieving superior performance compared to state-of-the-art approaches in both printed and handwritten scenarios.

Published

2025-04-11

How to Cite

Chang, Q., Chen, M., Pi, C., Hu, P., Zhang, Z., Ma, J., … Hu, J. (2025). RFL: Simplifying Chemical Structure Recognition with Ring-Free Language. Proceedings of the AAAI Conference on Artificial Intelligence, 39(2), 2007–2015. https://doi.org/10.1609/aaai.v39i2.32197

Section

AAAI Technical Track on Computer Vision I