Bayesian Persuasion in Sequential Decision-Making

Jiarui Gan; Rupak Majumdar; Goran Radanovic; Adish Singla

doi:10.1609/aaai.v36i5.20434

Authors

Jiarui Gan Max Planck Institute for Software Systems
Rupak Majumdar Max Planck Institute for Software Systems
Goran Radanovic Max Planck Institute for Software Systems
Adish Singla Max Planck Institute for Software Systems

DOI:

https://doi.org/10.1609/aaai.v36i5.20434

Keywords:

Game Theory And Economic Paradigms (GTEP), Multiagent Systems (MAS), Planning, Routing, And Scheduling (PRS)

Abstract

We study a dynamic model of Bayesian persuasion in sequential decision-making settings. An informed principal observes an external parameter of the world and advises an uninformed agent about actions to take over time. The agent takes actions in each time step based on the current state, the principal's advice/signal, and beliefs about the external parameter. The action of the agent updates the state according to a stochastic process. The model arises naturally in many applications, e.g., an app (the principal) can advise the user (the agent) on possible choices between actions based on additional real-time information the app has. We study the problem of designing a signaling strategy from the principal's point of view. We show that the principal has an optimal strategy against a myopic agent, who only optimizes their rewards locally, and the optimal strategy can be computed in polynomial time. In contrast, it is NP-hard to approximate an optimal policy against a far-sighted agent. Further, we show that if the principal has the power to threaten the agent by not providing future signals, then we can efficiently design a threat-based strategy. This strategy guarantees the principal's payoff as if playing against an agent who is far-sighted but myopic to future signals.
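
The interaction described in the abstract can be illustrated for a single time step. The sketch below is not taken from the paper: it assumes a hypothetical two-valued external parameter (a road that is "congested" or "clear"), a hypothetical signaling strategy, and made-up agent rewards. It shows only how a myopic agent would update its belief from a signal by Bayes' rule and then pick the action with the highest expected immediate reward. State transitions and the principal's optimization are left out.

```python
from fractions import Fraction


def posterior(prior, signaling, signal):
    """Bayes update: P(theta | signal) is proportional to
    prior[theta] * signaling[theta][signal]."""
    joint = {
        theta: prior[theta] * signaling[theta].get(signal, Fraction(0))
        for theta in prior
    }
    total = sum(joint.values())
    if total == 0:
        raise ValueError("signal has zero probability under the prior")
    return {theta: p / total for theta, p in joint.items()}


def myopic_best_action(belief, reward, actions):
    """Return the action with the highest expected immediate reward.
    Ties go to the action listed first in `actions`."""
    def expected(action):
        return sum(belief[theta] * reward[(theta, action)] for theta in belief)
    return max(actions, key=expected)


# All values below are illustrative assumptions, not data from the paper.
prior = {"congested": Fraction(3, 10), "clear": Fraction(7, 10)}

# signaling[theta][signal]: probability that the principal sends `signal`
# when the external parameter is `theta`.
signaling = {
    "congested": {"detour": Fraction(1)},
    "clear": {"detour": Fraction(1, 7), "highway": Fraction(6, 7)},
}

# The agent's immediate reward for each (parameter, action) pair.
reward = {
    ("congested", "detour"): 1, ("congested", "highway"): 0,
    ("clear", "detour"): 0, ("clear", "highway"): 1,
}
actions = ["detour", "highway"]

belief = posterior(prior, signaling, "detour")
choice = myopic_best_action(belief, reward, actions)
```

In this toy instance the "detour" signal raises the agent's belief that the road is congested from 3/10 to 3/4, so following the advice is the agent's own best response. Exact fractions are used so that the belief comparison is not affected by floating-point rounding.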
