ToolSmith: A Multi-Agent Framework for Enterprise Tool Creation

Authors

  • Purna Chandra Sekhar Vakudavathu Indian Institute of Technology, Delhi; IBM Research
  • Kushal Mukherjee IBM Research
  • Jayachandu Bandlamudi IBM Research
  • Renuka Sindhgatta IBM Research
  • Sameep Mehta IBM Research

DOI:

https://doi.org/10.1609/aaai.v40i48.42388

Abstract

Although LLMs can generate tools for generic domains and tasks, they struggle with enterprise-related domains that involve proprietary APIs and data schemas. We present ToolSmith, a framework for autonomously generating and validating agent-compatible tools. Given an API specification and a Tool Specification Requirement (TSR), ToolSmith produces a tool function and verifies it through a closed-loop process: it creates natural language (NL) tests and executes the tool in a secure agent sandbox for validation. For state-changing tools, ToolSmith confirms outcomes by querying the API with parameters derived from the NL tests. If the tool fails to produce the desired output, ToolSmith generates diagnostic feedback to iteratively regenerate it. By ensuring both functional correctness and agent compatibility, ToolSmith enables reliable automation of enterprise workflows.
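
The closed-loop process described in the abstract can be illustrated with a small sketch. This is not ToolSmith's implementation: the abstract does not specify its interfaces, so every name here (FakeTicketAPI, NLTest, generate_tool, validate, build_tool) is hypothetical. A stub stands in for the LLM generator, and a fresh in-memory API instance stands in for the secure agent sandbox. The sketch shows one assumed shape of the loop: generate a tool, run it against tests, confirm the state change by querying the API, and regenerate from diagnostic feedback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class FakeTicketAPI:
    """In-memory stand-in for a proprietary enterprise API (hypothetical)."""

    def __init__(self) -> None:
        self.tickets: Dict[int, str] = {}

    def create(self, title: str) -> int:
        ticket_id = len(self.tickets) + 1
        self.tickets[ticket_id] = title
        return ticket_id

    def get(self, ticket_id: int) -> Optional[str]:
        return self.tickets.get(ticket_id)


@dataclass
class NLTest:
    description: str  # natural-language test case
    params: dict      # parameters derived from the description


Tool = Callable[..., int]


def generate_tool(feedback: List[str]) -> Tool:
    """Stub for the LLM generator: buggy at first, corrected once feedback exists."""
    if not feedback:
        def buggy(api: FakeTicketAPI, title: str) -> int:
            return api.create(title.upper())  # bug: alters the requested title
        return buggy

    def fixed(api: FakeTicketAPI, title: str) -> int:
        return api.create(title)
    return fixed


def validate(tool: Tool, tests: List[NLTest]) -> List[str]:
    """Run each test on a fresh API instance and confirm state by querying it."""
    failures: List[str] = []
    for test in tests:
        api = FakeTicketAPI()  # fresh state plays the role of the sandbox
        try:
            ticket_id = tool(api, **test.params)
        except Exception as exc:
            failures.append(f"{test.description}: raised {exc!r}")
            continue
        stored = api.get(ticket_id)  # state-changing tool: check the outcome via the API
        if stored != test.params["title"]:
            failures.append(f"{test.description}: API holds {stored!r}")
    return failures


def build_tool(tests: List[NLTest], max_attempts: int = 3) -> Tuple[Optional[Tool], int]:
    """Generate, validate, and regenerate until all tests pass or attempts run out."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        tool = generate_tool(feedback)
        feedback = validate(tool, tests)  # diagnostic feedback for the next round
        if not feedback:
            return tool, attempt
    return None, max_attempts


if __name__ == "__main__":
    tests = [NLTest("Create a ticket titled 'VPN down'", {"title": "VPN down"})]
    tool, attempts = build_tool(tests)
    print("validated" if tool else "failed", "after", attempts, "attempt(s)")
```

In this toy setup the first generated tool fails validation because the stored title differs from the requested one, and the failure message drives the second generation. A real system would replace the stub generator with an LLM call and the in-memory class with the actual API behind a sandbox.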

Published

2026-03-14

How to Cite

Vakudavathu, P. C. S., Mukherjee, K., Bandlamudi, J., Sindhgatta, R., & Mehta, S. (2026). ToolSmith: A Multi-Agent Framework for Enterprise Tool Creation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41706–41708. https://doi.org/10.1609/aaai.v40i48.42388