Towered Actor Critic For Handling Multiple Action Types In Reinforcement Learning For Drug Discovery

Authors

  • Sai Krishna Gottipati 99andBeyond
  • Yashaswi Pathak International Institute of Information Technology, Hyderabad
  • Boris Sattarov 99andBeyond
  • Sahir Department of Computing Science, University of Alberta
  • Rohan Nuttall Department of Computing Science, University of Alberta
  • Mohammad Amini 99andBeyond
  • Matthew E. Taylor Department of Computing Science, University of Alberta; Alberta Machine Intelligence Institute (Amii); Canada CIFAR AI Chair
  • Sarath Chandar Mila - Quebec AI Institute; Canada CIFAR AI Chair; Ecole Polytechnique Montreal

DOI:

https://doi.org/10.1609/aaai.v35i1.16087

Keywords:

Healthcare, Medicine & Wellness, Reinforcement Learning, Other Applications

Abstract

Reinforcement learning (RL) has made significant progress in both abstract and real-world domains, but the majority of state-of-the-art algorithms deal only with monotonic actions. However, some applications require agents to reason over different types of actions. Our application simulates reaction-based molecule generation, used as part of the drug discovery pipeline, and includes both uni-molecular and bi-molecular reactions. This paper introduces a novel framework, towered actor critic (TAC), to handle multiple action types. The TAC framework is general in that it is designed to be combined with any existing RL algorithm for continuous action spaces. We combine it with TD3 and empirically obtain significantly better results than existing methods in the drug discovery setting. TAC is also applied to RL benchmarks in OpenAI Gym, and the results show that our framework can improve, or at least does not hurt, performance relative to standard TD3.

Published

2021-05-18

How to Cite

Gottipati, S. K., Pathak, Y., Sattarov, B., Sahir, Nuttall, R., Amini, M., Taylor, M. E., & Chandar, S. (2021). Towered Actor Critic For Handling Multiple Action Types In Reinforcement Learning For Drug Discovery. Proceedings of the AAAI Conference on Artificial Intelligence, 35(1), 142-150. https://doi.org/10.1609/aaai.v35i1.16087

Section

AAAI Technical Track on Application Domains