FIND: A Framework for Discovering Formulas in Data

Authors

  • Tingxiong Xiao, Tsinghua University
  • Yuxiao Cheng, Tsinghua University
  • Jinli Suo, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v39i20.35469

Abstract

Scientific discovery is the cornerstone of advances in many fields, from the fundamental laws of physics to the intricate mechanisms of biology. However, the two mainstream methods for this task, symbolic regression and dimensional analysis, are significantly limited: the former suffers from low computational efficiency because of its vast search space and often yields formulas without physical meaning; the latter provides a useful theoretical framework but also struggles to search a huge space because it lacks an effective analysis of the latent variables. To address this issue, we propose FIND, a framework for efficiently discovering the underlying formulas in data. We draw inspiration from Buckingham’s Pi theorem, imposing dimensional constraints on the input and output and thereby ensuring that discovered expressions possess physical meaning. Additionally, we propose a theoretical scheme for identifying the latent structure, together with a coarse-to-fine framework, which significantly reduces the search space of latent variables. This framework not only improves computational efficiency but also enhances model interpretability. In comprehensive experimental validation, FIND shows its potential to uncover meaningful scientific insights across various domains, providing a robust tool for advancing our understanding of unknown systems.
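
The dimensional constraint that the abstract attributes to Buckingham’s Pi theorem can be illustrated with a small sketch. This is not the FIND implementation; it only shows the textbook step of finding dimensionless groups as the null space of a dimension matrix. The pendulum variables, the (M, L, T) base dimensions, and the function names `nullspace` and `pi_groups` are assumptions chosen for illustration.

```python
from fractions import Fraction


def nullspace(matrix):
    """Basis of the null space of `matrix` (a list of rows), in exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot here, so column c is a free column
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def pi_groups(variables, dimension_matrix):
    """Map each null-space vector to {variable: exponent}, dropping zero exponents."""
    return [
        {name: exp for name, exp in zip(variables, vec) if exp != 0}
        for vec in nullspace(dimension_matrix)
    ]


# Hypothetical example: a simple pendulum with period T, length l,
# gravitational acceleration g, and bob mass m.
variables = ["T", "l", "g", "m"]
dimension_matrix = [
    [0, 0, 0, 1],   # mass exponents   (M)
    [0, 1, 1, 0],   # length exponents (L)
    [1, 0, -2, 0],  # time exponents   (T)
]
groups = pi_groups(variables, dimension_matrix)
print(groups)  # one dimensionless group: T^2 * g / l
```

Four variables and a rank-3 dimension matrix leave a single dimensionless group, so any physically meaningful relation among these variables must be expressible through T²g/l alone, and the mass m drops out. Restricting a formula search to such groups is the kind of reduction in search space that dimensional constraints provide.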

Published

2025-04-11

How to Cite

Xiao, T., Cheng, Y., & Suo, J. (2025). FIND: A Framework for Discovering Formulas in Data. Proceedings of the AAAI Conference on Artificial Intelligence, 39(20), 21653–21660. https://doi.org/10.1609/aaai.v39i20.35469

Section

AAAI Technical Track on Machine Learning VI