An Enhanced Advising Model in Teacher-Student Framework using State Categorization

Authors

  • Daksh Anand, International Institute of Information Technology, Hyderabad
  • Vaibhav Gupta, International Institute of Information Technology, Hyderabad
  • Praveen Paruchuri, International Institute of Information Technology, Hyderabad
  • Balaraman Ravindran, Indian Institute of Technology, Madras

DOI:

https://doi.org/10.1609/aaai.v35i8.16823

Keywords:

Reinforcement Learning

Abstract

The teacher-student framework aims to improve the sample efficiency of RL algorithms by deploying an advising mechanism in which a teacher helps a student by guiding its exploration. Prior work in this field has considered an advising mechanism in which the teacher advises the student about the optimal action to take in a given state. However, real-world teachers can leverage domain expertise to provide more informative signals. Using this insight, we propose to extend the current advising framework so that the teacher provides not only the optimal action but also a qualitative assessment of the state. We introduce a novel architecture, namely Advice Replay Memory (ARM), to effectively reuse the advice provided by the teacher. We demonstrate the robustness of our approach through experiments on multiple Atari 2600 games using a fixed set of hyper-parameters. Additionally, we show that a student taking help even from a suboptimal teacher can achieve significant performance boosts and eventually outperform the teacher. Our approach outperforms the baselines even when provided with comparatively suboptimal teachers and an advising budget that is smaller by orders of magnitude. The contributions of our paper are four-fold: (a) effectively leveraging a teacher's knowledge through richer advising, (b) introducing ARM to effectively reuse the advice throughout learning, (c) achieving a significant performance boost even with a coarse state categorization, and (d) enabling the student to outperform the teacher.
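
The abstract describes advice that pairs an action with a qualitative state assessment, stored for reuse under a limited advising budget. The following is a minimal sketch of that idea only, not the authors' ARM implementation: the names `Advice`, `AdviceReplayMemory`, `ask`, `sample`, and `toy_teacher`, the "good"/"bad" labels, and the uniform sampling are all assumptions made for illustration.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Hashable, List, Optional, Tuple


@dataclass(frozen=True)
class Advice:
    state: Hashable   # state in which the student asked for advice
    action: int       # action the teacher recommends in that state
    category: str     # teacher's qualitative label (assumed: "good" or "bad")


class AdviceReplayMemory:
    """Hypothetical buffer that stores teacher advice so it can be reused."""

    def __init__(self, capacity: int, budget: int, seed: int = 0) -> None:
        self.buffer: Deque[Advice] = deque(maxlen=capacity)
        self.budget = budget              # advice requests still allowed
        self.rng = random.Random(seed)

    def ask(self, state: Hashable,
            teacher: Callable[[Hashable], Tuple[int, str]]) -> Optional[Advice]:
        # Once the advising budget is spent, the student gets no new advice.
        if self.budget <= 0:
            return None
        action, category = teacher(state)
        self.budget -= 1
        advice = Advice(state, action, category)
        self.buffer.append(advice)
        return advice

    def sample(self, batch_size: int) -> List[Advice]:
        # Uniform sampling is an assumption; the paper may reuse advice differently.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)


def toy_teacher(state: int) -> Tuple[int, str]:
    # Stand-in teacher: an arbitrary action plus a coarse state label.
    return state % 2, "good" if state % 3 == 0 else "bad"


arm = AdviceReplayMemory(capacity=100, budget=3)
for s in range(5):
    arm.ask(s, toy_teacher)   # only the first three requests are answered
batch = arm.sample(2)         # stored advice can be replayed after the budget is spent
```

In this sketch the budget caps how often the teacher is queried, while the buffer lets the student keep learning from advice it has already received.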

Published

2021-05-18

How to Cite

Anand, D., Gupta, V., Paruchuri, P., & Ravindran, B. (2021). An Enhanced Advising Model in Teacher-Student Framework using State Categorization. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6653-6660. https://doi.org/10.1609/aaai.v35i8.16823

Issue

Vol. 35 No. 8 (2021)

Section

AAAI Technical Track on Machine Learning I