Shiang, S.-R., Rosenthal, S., Gershman, A., Carbonell, J., & Oh, J. (2017). Vision-Language Fusion for Object Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.11187