Learning Interpretable Spatial Operations in a Rich 3D Blocks World

Yonatan Bisk; Kevin Shih; Yejin Choi; Daniel Marcu

doi:10.1609/aaai.v32i1.12026

Authors

Yonatan Bisk, University of Washington
Kevin Shih, University of Illinois at Urbana-Champaign
Yejin Choi, University of Washington
Daniel Marcu, Amazon Inc.

DOI:

https://doi.org/10.1609/aaai.v32i1.12026

Keywords:

grounding, natural language, spatial, actions

Abstract

In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world. We first introduce a new dataset that pairs complex 3D spatial operations with rich natural language descriptions that require complex spatial and pragmatic interpretations, such as “mirroring”, “twisting”, and “balancing”. This dataset, built on the simulation environment of Bisk, Yuret, and Marcu (2016), contains language that is significantly richer and more complex, while also doubling the size of the original dataset in the 2D environment with 100 new world configurations and 250,000 tokens. In addition, we propose a new neural architecture that achieves competitive results while automatically discovering an inventory of interpretable spatial operations (Figure 5).

Published

2018-04-27

How to Cite

Bisk, Y., Shih, K., Choi, Y., & Marcu, D. (2018). Learning Interpretable Spatial Operations in a Rich 3D Blocks World. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12026

Issue

Vol. 32 No. 1 (2018): Thirty-Second AAAI Conference on Artificial Intelligence

Section

Main Track: NLP and Machine Learning
