QueryGym: Step-by-Step Interaction with Relational Databases

Authors

  • Haritha Ananthakrishnan, IBM Research
  • Harsha Kokel, IBM Research
  • Kelsey Sikes, Colorado State University
  • Debarun Bhattacharjya, IBM Research
  • Michael Katz, IBM Research
  • Shirin Sohrabi, IBM Research
  • Kavitha Srinivas, IBM Research

DOI:

https://doi.org/10.1609/aaai.v40i48.42334

Abstract

We introduce QueryGym, an interactive environment for building, testing, and evaluating LLM-based query planning agents. Existing frameworks often tie agents to specific query language dialects or obscure their reasoning; QueryGym instead requires agents to construct explicit sequences of relational algebra operations, ensuring engine-agnostic evaluation and transparent step-by-step planning. The environment is implemented as a Gymnasium interface that supplies observations (schema details, intermediate results, and execution feedback) and receives actions that represent database exploration (e.g., previewing tables, sampling column values, retrieving unique values) as well as relational algebra operations (e.g., filter, project, join). We detail the motivation for and the design of the environment. In the demo, we showcase the utility of the environment by contrasting it with contemporary LLMs that query databases. QueryGym serves as a practical testbed for research in error remediation, transparency, and reinforcement learning for query generation.
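
To make the interaction loop described above concrete, the following is a minimal sketch of a toy environment that follows the Gymnasium reset/step convention (reset returns an observation and an info dict; step returns observation, reward, terminated, truncated, info). It is not the QueryGym API: the class name ToyQueryEnv, the action dictionary layout, the "scan" operation, the in-memory tables, and the reward rule are all assumptions made for illustration, and the sketch deliberately avoids importing Gymnasium so that it runs on its own.

```python
# Hypothetical in-memory database: table name -> list of row dicts.
DB = {
    "employees": [
        {"id": 1, "name": "Ada", "dept_id": 10},
        {"id": 2, "name": "Bob", "dept_id": 20},
        {"id": 3, "name": "Cy", "dept_id": 10},
    ],
    "departments": [
        {"dept_id": 10, "dept_name": "Research"},
        {"dept_id": 20, "dept_name": "Sales"},
    ],
}


class ToyQueryEnv:
    """Illustrative stand-in for a step-by-step query environment (assumed design)."""

    def __init__(self, db, target):
        self.db = db
        self.target = target   # rows the finished plan should produce
        self.current = None    # intermediate result of the plan so far

    def _obs(self, result, feedback):
        return {"result": result, "feedback": feedback}

    def reset(self):
        self.current = None
        schema = {t: sorted(rows[0]) for t, rows in self.db.items()}
        return {"schema": schema, "result": None, "feedback": "ready"}, {}

    def step(self, action):
        op = action["op"]
        try:
            if op == "preview":
                # Exploration action: inspect rows without changing the plan state.
                rows = self.db[action["table"]][:2]
                return self._obs(rows, "ok"), 0.0, False, False, {}
            if op == "scan":
                self.current = list(self.db[action["table"]])
            elif op == "filter":
                col, val = action["column"], action["value"]
                self.current = [r for r in self.current if r[col] == val]
            elif op == "project":
                cols = action["columns"]
                self.current = [{c: r[c] for c in cols} for r in self.current]
            elif op == "join":
                right, on = self.db[action["table"]], action["on"]
                self.current = [
                    {**l, **r} for l in self.current for r in right if l[on] == r[on]
                ]
            else:
                raise ValueError(f"unknown op {op!r}")
        except (KeyError, TypeError, ValueError) as exc:
            # Execution feedback: report the error and leave the state unchanged.
            return self._obs(self.current, f"error: {exc}"), 0.0, False, False, {}
        done = self.current == self.target
        return self._obs(self.current, "ok"), float(done), done, False, {}


env = ToyQueryEnv(DB, target=[{"name": "Ada"}, {"name": "Cy"}])
obs, info = env.reset()
plan = [
    {"op": "scan", "table": "employees"},
    {"op": "join", "table": "departments", "on": "dept_id"},
    {"op": "filter", "column": "dept_name", "value": "Research"},
    {"op": "project", "columns": ["name"]},
]
for action in plan:
    obs, reward, terminated, truncated, info = env.step(action)
```

In this sketch each action is one explicit relational step, so the intermediate result after every operation is visible to the agent, and an invalid step (for example, a filter issued before any table has been scanned) comes back as error feedback rather than ending the episode.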

Published

2026-03-14

How to Cite

Ananthakrishnan, H., Kokel, H., Sikes, K., Bhattacharjya, D., Katz, M., Sohrabi, S., & Srinivas, K. (2026). QueryGym: Step-by-Step Interaction with Relational Databases. Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41544–41546. https://doi.org/10.1609/aaai.v40i48.42334