SheetBrain: A Neuro-Symbolic Agent for Accurate Reasoning over Complex and Large Spreadsheets

Ziwei Wang; Jiayuan Su; Mengyu Zhou; Huaxing Zeng; Mengni Jia; Xiao Lv; Haoyu Dong; Xiaojun Ma; Shi Han; Dongmei Zhang

doi:10.1609/aaai.v40i40.40671

Authors

Ziwei Wang, Carnegie Mellon University
Jiayuan Su, Zhejiang University
Mengyu Zhou, Microsoft Research
Huaxing Zeng, Brown University
Mengni Jia, University of Cambridge
Xiao Lv, Microsoft Research
Haoyu Dong, Microsoft Research
Xiaojun Ma, Microsoft Research
Shi Han, Microsoft Research
Dongmei Zhang, Microsoft Research

Abstract

Understanding and reasoning over complex spreadsheets remain fundamental challenges for large language models (LLMs), which often struggle with intricate structures and rely solely on neural computation. In this work, we propose SheetBrain, a neuro-symbolic dual-workflow agent framework for precise and interpretable reasoning over tabular data. SheetBrain consists of an understanding module that produces a comprehensive overview of the spreadsheet, including structural summaries and query-specific analyses to guide execution; an execution module that integrates a Python sandbox with preloaded table-processing libraries and an Excel helper toolkit for effective data manipulation; and a validation module that verifies the correctness of reasoning and answers, triggering re-execution if necessary. We evaluate SheetBrain on multiple public QA and manipulation benchmarks, and introduce SheetBench, a new benchmark targeting large, multi-table, and structurally complex spreadsheets. Experimental results show that SheetBrain significantly improves reasoning performance on both existing benchmarks and the more challenging scenarios presented in SheetBench.
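
The three-module workflow described in the abstract can be illustrated with a minimal sketch. This is not the SheetBrain implementation: the function names (`run_agent`, `toy_understand`, `toy_execute`, `toy_validate`), the `Attempt` record, the `max_rounds` limit, and the toy spreadsheet are all hypothetical, and the LLM calls and the Python sandbox are replaced by plain deterministic functions. It only shows the control flow of understanding once, executing, validating, and re-executing with feedback when validation fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Sheet = list[list[object]]  # hypothetical in-memory stand-in for a spreadsheet


@dataclass
class Attempt:
    answer: str
    feedback: Optional[str]  # None means the validator accepted the answer


def run_agent(
    sheet: Sheet,
    query: str,
    understand: Callable[[Sheet, str], str],
    execute: Callable[[Sheet, str, str, Optional[str]], str],
    validate: Callable[[Sheet, str, str], Optional[str]],
    max_rounds: int = 3,  # assumption: the abstract does not state a retry limit
) -> list[Attempt]:
    """Understand once, then execute and validate, retrying on rejection."""
    overview = understand(sheet, query)
    attempts: list[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        answer = execute(sheet, query, overview, feedback)
        feedback = validate(sheet, query, answer)
        attempts.append(Attempt(answer, feedback))
        if feedback is None:  # validator accepted: stop re-executing
            break
    return attempts


# Toy stand-ins for the three modules (illustrative only).
SHEET: Sheet = [["item", "cost"], ["a", 2], ["b", 3], ["Total", 5]]


def toy_understand(sheet: Sheet, query: str) -> str:
    # Structural summary: number of rows below the header and column names.
    return f"{len(sheet) - 1} rows below header; columns: {', '.join(map(str, sheet[0]))}"


def toy_execute(sheet: Sheet, query: str, overview: str, feedback: Optional[str]) -> str:
    rows = sheet[1:]
    if feedback is not None:
        # Second pass: apply the validator's hint and skip the summary row.
        rows = [r for r in rows if r[0] != "Total"]
    return str(sum(r[1] for r in rows))


def toy_validate(sheet: Sheet, query: str, answer: str) -> Optional[str]:
    # Cross-check the answer against the sheet's own "Total" row, if present.
    totals = [r[1] for r in sheet[1:] if r[0] == "Total"]
    if totals and answer != str(totals[0]):
        return "answer disagrees with the Total row; exclude it from the sum"
    return None


attempts = run_agent(SHEET, "What is the total cost?", toy_understand, toy_execute, toy_validate)
for a in attempts:
    print(a.answer, "accepted" if a.feedback is None else f"rejected: {a.feedback}")
```

In this toy run the first execution double-counts the summary row, the validator rejects it, and the second execution uses the feedback to produce an accepted answer. Passing the three modules as callables keeps the loop independent of how each module is realized.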
