Multi-Agent Learning with Policy Prediction

Authors

  • Chongjie Zhang, University of Massachusetts Amherst
  • Victor Lesser, University of Massachusetts Amherst

DOI:

https://doi.org/10.1609/aaai.v24i1.7639

Keywords:

Multi-agent reinforcement learning, policy prediction, games, Nash equilibrium

Abstract

Learning in multi-agent systems is a challenging problem because the environment is non-stationary: each agent's payoff depends on the policies of other agents that are adapting simultaneously. This paper first introduces a new gradient-based learning algorithm that augments the basic gradient ascent approach with policy prediction. We prove that this augmentation yields a stronger notion of convergence than basic gradient ascent: within a restricted class of iterated games, the strategies themselves converge to a Nash equilibrium. Motivated by this result, we then propose a new practical multi-agent reinforcement learning (MARL) algorithm that exploits approximate policy prediction. Empirical results show that it converges faster, and in a wider variety of situations, than state-of-the-art MARL algorithms.
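
The prediction step is easy to sketch. The Python snippet below is a minimal illustration, not code from the paper: the matching-pennies payoffs, the step size ETA, and the prediction length GAMMA are illustrative assumptions. Each player forecasts its opponent's next strategy by extrapolating along the opponent's current payoff gradient, then takes its own gradient step against that forecast.

```python
import numpy as np

# Minimal sketch of gradient ascent with policy prediction on a
# two-player, two-action matrix game. The game (matching pennies),
# step size ETA, and prediction length GAMMA are illustrative
# assumptions, not values taken from the paper.

A = np.array([[1.0, -1.0], [-1.0, 1.0]])  # row player's payoffs
B = -A                                    # column player's payoffs (zero-sum)

ETA = 0.01   # gradient step size (assumed)
GAMMA = 0.5  # prediction length (assumed)

def grad_row(alpha, beta):
    # dV_row/d(alpha), where V_row = [alpha, 1-alpha] A [beta, 1-beta]^T
    q = np.array([beta, 1.0 - beta])
    return A[0] @ q - A[1] @ q

def grad_col(alpha, beta):
    # dV_col/d(beta) for the column player
    p = np.array([alpha, 1.0 - alpha])
    return p @ B[:, 0] - p @ B[:, 1]

alpha, beta = 0.9, 0.2  # initial probabilities of each player's first action
for _ in range(5000):
    # Each player extrapolates the opponent's next strategy along the
    # opponent's current gradient, then ascends its own payoff gradient
    # evaluated at that predicted strategy (projected back onto [0, 1]).
    beta_pred = np.clip(beta + GAMMA * grad_col(alpha, beta), 0.0, 1.0)
    alpha_pred = np.clip(alpha + GAMMA * grad_row(alpha, beta), 0.0, 1.0)
    alpha, beta = (np.clip(alpha + ETA * grad_row(alpha, beta_pred), 0.0, 1.0),
                   np.clip(beta + ETA * grad_col(alpha_pred, beta), 0.0, 1.0))

print(alpha, beta)  # approaches the mixed Nash equilibrium (0.5, 0.5)
```

The choice of matching pennies makes the effect of prediction visible: plain gradient ascent orbits the mixed equilibrium of this game indefinitely, whereas the prediction term damps the orbit, so the strategies themselves spiral in toward (0.5, 0.5) rather than merely having convergent average payoffs.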

Published

2010-07-04

How to Cite

Zhang, C., & Lesser, V. (2010). Multi-Agent Learning with Policy Prediction. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 927-934. https://doi.org/10.1609/aaai.v24i1.7639

Section

AAAI Technical Track: Multiagent Systems