Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement