Agent Incentives: A Causal Perspective

Tom Everitt; Ryan Carey; Eric D. Langlois; Pedro A. Ortega; Shane Legg

doi:10.1609/aaai.v35i13.17368

Authors

Tom Everitt DeepMind
Ryan Carey University of Oxford
Eric D. Langlois DeepMind; University of Toronto; Vector Institute
Pedro A. Ortega DeepMind
Shane Legg DeepMind

DOI:

https://doi.org/10.1609/aaai.v35i13.17368

Keywords:

Safety, Robustness & Trustworthiness

Abstract

We present a framework for analysing agent incentives using causal influence diagrams. We establish that a well-known criterion for value of information is complete. We propose a new graphical criterion for value of control, establishing its soundness and completeness. We also introduce two new concepts for incentive analysis: response incentives indicate which changes in the environment affect an optimal decision, while instrumental control incentives establish whether an agent can influence its utility via a variable X. For both new concepts, we provide sound and complete graphical criteria. We show by example how these results can help with evaluating the safety and fairness of an AI system.

Published

2021-05-18

How to Cite

Everitt, T., Carey, R., Langlois, E. D., Ortega, P. A., & Legg, S. (2021). Agent Incentives: A Causal Perspective. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 11487-11495. https://doi.org/10.1609/aaai.v35i13.17368

Issue

Vol. 35 No. 13: AAAI-21 Technical Tracks 13

Section

AAAI Technical Track on Philosophy and Ethics of AI
