Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)


  • Wang Qisheng Southeast University
  • Wang Qichao Southeast University
  • Li Xiao Southeast University




Exploration efficiency is a challenge for multi-agent reinforcement learning (MARL), because the policy learned by confederate MARL depends on the interaction among agents. Less informative rewards also restrict the learning speed of MARL compared with the informative labels of supervised learning. This paper proposes a novel communication method that helps agents focus on different exploration subareas, guiding MARL to explore faster. We propose a predictive network that forecasts the reward of the current state-action pair, and we use the guidance learned by this network to modify the reward function. An improved prioritized experience replay is employed to help agents take better advantage of the knowledge learned by different agents. Experimental results demonstrate that the proposed algorithm outperforms existing methods in cooperative multi-agent environments.
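
The abstract does not specify how the predicted reward modifies the reward function or how the prioritized replay is improved, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a linear model in place of the predictive network, a reward bonus proportional to the absolute prediction error, and standard proportional prioritization for the replay buffer. All names and parameters (`LinearRewardPredictor`, `shaped_reward`, `beta`, `alpha`, `eps`) are hypothetical.

```python
import random


class LinearRewardPredictor:
    """Hypothetical stand-in for the predictive network: a linear model
    trained by stochastic gradient descent on (features, reward) pairs."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, x):
        # Forecast the reward for a state-action feature vector x.
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, reward):
        # One SGD step on the squared prediction error; returns the error.
        error = reward - self.predict(x)
        self.w = [wi + self.lr * error * xi for wi, xi in zip(self.w, x)]
        return error


def shaped_reward(reward, predicted, beta=0.5):
    # Assumption: poorly predicted transitions receive an exploration bonus.
    return reward + beta * abs(reward - predicted)


def sampling_probabilities(priorities, alpha=0.6, eps=1e-6):
    # Standard proportional prioritization: P(i) = p_i^alpha / sum_j p_j^alpha.
    # eps keeps zero-priority transitions sampleable and avoids division by zero.
    scaled = [(p + eps) ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, k, alpha=0.6, rng=random):
    # Draw k buffer indices (with replacement) according to their priorities.
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=k)
```

In a sketch like this, the absolute prediction error could also serve as the replay priority, so that transitions one agent predicts poorly are sampled more often when experience is shared among agents. That coupling is likewise an assumption rather than something the abstract states.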




How to Cite

Qisheng, W., Qichao, W., & Xiao, L. (2020). Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 34(10), 13949-13950. https://doi.org/10.1609/aaai.v34i10.7247



Student Abstract Track