Exploration via State Influence Modeling

Authors

  • Yongxin Kang — School of Artificial Intelligence, University of Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences
  • Enmin Zhao — Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Kai Li — Institute of Automation, Chinese Academy of Sciences
  • Junliang Xing — Institute of Automation, Chinese Academy of Sciences

Keywords:

Reinforcement Learning

Abstract

This paper studies the challenging problem of reinforcement learning (RL) in hard exploration tasks with sparse rewards. It focuses on the exploration stage before the agent receives its first positive reward, a regime in which traditional RL algorithms with simple exploration strategies often perform poorly. Unlike previous methods that use some attribute of a single state as the intrinsic reward to encourage exploration, this work leverages the social influence between different states to permit more efficient exploration. It introduces a general intrinsic reward construction method to evaluate the social influence of states dynamically. Three kinds of social influence are introduced for a state: conformity, power, and authority. By measuring a state's social influence, agents quickly find the focus state during the exploration process. The proposed RL framework with state social influence evaluation works well in hard exploration tasks. Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency.
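The abstract describes augmenting the extrinsic reward with an intrinsic bonus derived from a state's social influence. The paper's concrete conformity, power, and authority measures are not given on this page, so the sketch below is only a minimal illustration of the general shaping scheme, using a hypothetical influence proxy (the number of distinct successor states observed from a state) in place of the paper's actual measures; `beta`, `StateInfluenceBonus`, and the proxy itself are assumptions, not the authors' method.

```python
from collections import defaultdict


class StateInfluenceBonus:
    """Toy intrinsic-reward bonus built on a crude state-influence proxy.

    Stand-in for the paper's conformity/power/authority measures: here a
    state's 'influence' is approximated by how many distinct successor
    states have been observed from it (a hypothetical proxy).
    """

    def __init__(self, beta=0.1):
        self.beta = beta                    # weight of the intrinsic bonus
        self.successors = defaultdict(set)  # state -> set of observed next states

    def update(self, state, next_state):
        """Record an observed transition state -> next_state."""
        self.successors[state].add(next_state)

    def influence(self, state):
        """Proxy influence score: count of distinct observed successors."""
        return len(self.successors[state])

    def shaped_reward(self, state, extrinsic_reward):
        """Total reward = extrinsic reward + beta * influence bonus."""
        return extrinsic_reward + self.beta * self.influence(state)


# Usage: update the model from transitions, then shape the reward.
bonus = StateInfluenceBonus(beta=0.5)
bonus.update("s0", "s1")
bonus.update("s0", "s2")
print(bonus.shaped_reward("s0", 0.0))  # 0.0 extrinsic + 0.5 * 2 influence = 1.0
```

In the sparse-reward regime the extrinsic term is zero almost everywhere, so the bonus alone drives the agent toward high-influence ("focus") states until the first positive reward is found.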

Published

2021-05-18

How to Cite

Kang, Y., Zhao, E., Li, K., & Xing, J. (2021). Exploration via State Influence Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 8047-8054. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16981

Section

AAAI Technical Track on Machine Learning II