Rating Composite AI Models for Robustness Through Probabilistic Planning

Kausik Lakkaraju; Sunandita Patra; Parisa Zehtabi; Biplav Srivastava

doi:10.1609/icaps.v36i1.42867

Authors

Kausik Lakkaraju, University of South Carolina
Sunandita Patra, J.P. Morgan AI Research
Parisa Zehtabi, J.P. Morgan AI Research
Biplav Srivastava, University of South Carolina

DOI:

https://doi.org/10.1609/icaps.v36i1.42867

Abstract

Many real-world AI systems combine several primitive component models, such as translators and sentiment analyzers, into larger composite models, such as chatbots. Understanding how these compositions behave under uncertainty, and how properties like bias or instability propagate through a composite model, is increasingly important, yet most evaluation methods still focus on primitive models. We introduce a new use of probabilistic planning to assess the robustness of composite AI models. Each component model call is represented as a stochastic action in an RDDL domain, and the reward combines robustness metrics with the cost of the components (actions). The planner runs each primitive model on randomly drawn data batches, allowing robustness to be assessed under variation in both the data and the model outputs induced by those data. Through case studies and experiments in multilingual sentiment analysis and a synthetic domain, we demonstrate that the planner consistently identifies more stable composite configurations than baseline methods, showing that probabilistic planning can serve as a practical, scalable approach for reasoning about reliability in complex, composite AI models.
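
The framing in the abstract (component calls as stochastic actions, a reward that trades robustness against component cost) can be illustrated with a minimal Python sketch. This is not the authors' RDDL domain or planner: it substitutes brute-force Monte Carlo rollouts for probabilistic planning, and the component names, instability distributions, costs, and `COST_WEIGHT` are all hypothetical values chosen only for illustration.

```python
import itertools
import random
import statistics

# Hypothetical candidates per pipeline stage:
# model name -> (mean instability, spread, cost). All numbers are invented.
CANDIDATES = {
    "translator": {"t_small": (0.30, 0.10, 1.0), "t_large": (0.10, 0.05, 3.0)},
    "sentiment": {"s_lexicon": (0.25, 0.15, 0.5), "s_neural": (0.12, 0.05, 2.0)},
}
COST_WEIGHT = 0.02  # hypothetical trade-off between robustness and cost


def call_model(params, rng):
    """Stochastic action: instability observed on one randomly drawn batch."""
    mean, spread, cost = params
    # Clip the sampled instability to the [0, 1] range.
    instability = min(1.0, max(0.0, rng.gauss(mean, spread)))
    return instability, cost


def episode_reward(config, rng):
    """Reward for one rollout: penalize instability and component cost."""
    reward = 0.0
    for stage, model in config.items():
        instability, cost = call_model(CANDIDATES[stage][model], rng)
        reward -= instability + COST_WEIGHT * cost
    return reward


def rate_configurations(n_rollouts=2000, seed=0):
    """Estimate the expected reward of every composite configuration."""
    rng = random.Random(seed)
    stages = list(CANDIDATES)
    ratings = {}
    # Enumerate one model choice per stage (exhaustive, not a planner).
    for models in itertools.product(*(CANDIDATES[s] for s in stages)):
        config = dict(zip(stages, models))
        rewards = [episode_reward(config, rng) for _ in range(n_rollouts)]
        ratings[models] = statistics.mean(rewards)
    # Highest expected reward (most robust per unit cost) first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for models, score in rate_configurations():
        print(models, round(score, 3))
```

Exhaustive enumeration only works for a handful of components; the appeal of a planner, as the abstract argues, is that it can search larger composition spaces without evaluating every configuration.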
