Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Authors

  • Keerthiram Murugesan, IBM Research
  • Mattia Atzeni, IBM Research; EPFL
  • Pavan Kapanipathi, IBM Research
  • Pushkar Shukla, TTI Chicago
  • Sadhana Kumaravel, IBM Research
  • Gerald Tesauro, IBM Research
  • Kartik Talamadupula, IBM Research
  • Mrinmaya Sachan, ETH Zurich
  • Murray Campbell, IBM Research

Keywords:

Reinforcement Learning, Common-Sense Reasoning, Information Extraction

Abstract

Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to act efficiently in the world by pruning implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents that track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents that incorporate commonsense knowledge in TWC perform better while acting more efficiently. We conduct user studies to estimate human performance on TWC and show that there is ample room for future improvement.

Published

2021-05-18

How to Cite

Murugesan, K., Atzeni, M., Kapanipathi, P., Shukla, P., Kumaravel, S., Tesauro, G., Talamadupula, K., Sachan, M., & Campbell, M. (2021). Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), 9018-9027. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17090

Section

AAAI Technical Track on Machine Learning III