Uncovering Systemic and Environment Errors in Autonomous Systems Using Differential Testing

Authors

  • Yashwanthi Anand, Oregon State University
  • Rahil P Mehta, Oregon State University
  • Manish Motwani, Oregon State University
  • Sandhya Saisubramanian, Oregon State University

DOI:

https://doi.org/10.1609/aaaiss.v7i1.36877

Abstract

Deploying autonomous agents in complex environments requires distinguishing between undesirable behaviors caused by imprecision in the agent's reasoning model or its policy (i.e., systemic agent error) and those caused by inherently unsolvable tasks (environment error). We introduce AIProbe, a novel black-box differential testing framework to validate autonomous agents under varied and challenging environment configurations. AIProbe first generates diverse environment configurations and tasks for testing the agent by modifying configurable parameters using Latin Hypercube sampling. It then solves each generated task using a search-based planner that is independent of the agent. By comparing the agent's performance to the planner's solution, AIProbe identifies whether failures are due to errors in the agent's model or policy, or due to unsolvable task conditions. We then demonstrate its broad applicability to both model-free and model-based agents operating in discrete and continuous domains. Our evaluation across multiple domains shows that AIProbe significantly outperforms state-of-the-art techniques in detecting unique errors, thereby contributing to the reliable deployment of autonomous agents.
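
The three steps named in the abstract (sampling configurations, solving each task with an independent planner, and comparing outcomes) can be illustrated with a minimal Python sketch. This is not the AIProbe implementation: the grid world, the breadth-first planner, the right-or-down "agent", and every function name below are hypothetical stand-ins chosen only to show how a differential comparison could separate agent errors from environment errors.

```python
import random
from collections import deque


def latin_hypercube(n_samples, bounds, rng):
    """Sample n_samples points; per dimension, each of the n_samples
    equal-width strata is used exactly once."""
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        col = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(col)  # decouple strata across dimensions
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def make_grid(size, wall_density, rng):
    """Hypothetical environment: size x size grid, start (0, 0), goal bottom-right."""
    walls = {(r, c) for r in range(size) for c in range(size)
             if rng.random() < wall_density}
    walls -= {(0, 0), (size - 1, size - 1)}
    return size, walls


def planner_solves(size, walls):
    """Agent-independent reference: breadth-first search over four moves."""
    goal = (size - 1, size - 1)
    frontier, seen = deque([(0, 0)]), {(0, 0)}
    while frontier:
        r, c = frontier.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in walls and nxt not in seen):
                seen.add(nxt)
                frontier.append(nxt)
    return False


def greedy_agent_solves(size, walls):
    """Stand-in black-box agent with a flawed policy: it only moves right or down."""
    r, c = 0, 0
    while (r, c) != (size - 1, size - 1):
        if c + 1 < size and (r, c + 1) not in walls:
            c += 1
        elif r + 1 < size and (r + 1, c) not in walls:
            r += 1
        else:
            return False  # stuck
    return True


def classify(agent_ok, planner_ok):
    """Differential verdict: a failure is blamed on the agent only if the planner succeeds."""
    if agent_ok:
        return "pass"
    return "agent error" if planner_ok else "environment error"


def run_probe(n_samples=20, seed=0):
    rng = random.Random(seed)
    report = {"pass": 0, "agent error": 0, "environment error": 0}
    # Assumed configurable parameters: grid size and wall density.
    for size_f, density in latin_hypercube(n_samples, [(3, 9), (0.0, 0.5)], rng):
        size, walls = make_grid(int(size_f), density, rng)
        verdict = classify(greedy_agent_solves(size, walls),
                           planner_solves(size, walls))
        report[verdict] += 1
    return report


if __name__ == "__main__":
    print(run_probe())
```

In this toy setting, a grid where the planner finds a path but the agent gets stuck counts as an agent error, while a grid that the planner cannot solve either counts as an environment error.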

Published

2025-11-23

How to Cite

Anand, Y., Mehta, R. P., Motwani, M., & Saisubramanian, S. (2025). Uncovering Systemic and Environment Errors in Autonomous Systems Using Differential Testing. Proceedings of the AAAI Symposium Series, 7(1), 122–130. https://doi.org/10.1609/aaaiss.v7i1.36877

Section

AI Trustworthiness and Risk Assessment for Challenged Contexts (ATRACC)