Multi-Modal Multi-Task Learning for Automatic Dietary Assessment

Authors

  • Qi Liu, Singapore University of Technology and Design
  • Yue Zhang, Singapore University of Technology and Design
  • Zhenguang Liu, Zhejiang Gongshang University
  • Ye Yuan, Singapore University of Technology and Design
  • Li Cheng, A*STAR
  • Roger Zimmermann, National University of Singapore

DOI:

https://doi.org/10.1609/aaai.v32i1.11848

Keywords:

Dietary Assessment, Multi-modal Learning, Memory Network

Abstract

We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, the goal is to automatically rate the meals and deliver advisory comments for improving users' diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, we propose an end-to-end neural model. In particular, comprehensive meal representations are obtained from images, descriptions, and user information. We further introduce a novel memory network architecture that stores meal representations and reasons over them to support predictions. Results on a real-world dataset show that our method significantly outperforms two strong image captioning baselines.
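
The abstract describes the architecture only at a high level. The sketch below is a minimal, hypothetical PyTorch illustration of such a multi-modal, multi-task model: image, description, and user features are fused into a meal representation, an attention-based read over a memory of stored meal representations supports reasoning, and two task heads produce a meal rating and advisory-comment token logits. All module names, dimensions, the fusion scheme, and the task heads are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a multi-modal, multi-task dietary assessment model.
# Layer sizes, fusion, and the memory/attention scheme are assumptions.
import torch
import torch.nn as nn


class DietaryAssessmentSketch(nn.Module):
    def __init__(self, vocab_size, img_dim=2048, user_dim=16,
                 hidden=256, memory_slots=32):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)      # pooled CNN image features
        self.text_emb = nn.Embedding(vocab_size, hidden)
        self.text_enc = nn.GRU(hidden, hidden, batch_first=True)
        self.user_proj = nn.Linear(user_dim, hidden)
        self.fuse = nn.Linear(3 * hidden, hidden)       # fused meal representation
        # Memory of stored meal representations (randomly initialized here).
        self.memory = nn.Parameter(torch.randn(memory_slots, hidden))
        self.attn = nn.Linear(hidden, hidden)
        # Multi-task heads: scalar meal rating and comment-token logits.
        self.rating_head = nn.Linear(2 * hidden, 1)
        self.comment_head = nn.Linear(2 * hidden, vocab_size)

    def forward(self, img_feat, desc_tokens, user_feat):
        img = torch.relu(self.img_proj(img_feat))                     # (B, H)
        _, txt = self.text_enc(self.text_emb(desc_tokens))            # (1, B, H)
        txt = txt.squeeze(0)                                          # (B, H)
        usr = torch.relu(self.user_proj(user_feat))                   # (B, H)
        meal = torch.tanh(self.fuse(torch.cat([img, txt, usr], -1)))  # (B, H)
        # Soft attention over memory slots, i.e. reasoning over stored meals.
        scores = self.attn(meal) @ self.memory.t()                    # (B, slots)
        read = torch.softmax(scores, -1) @ self.memory                # (B, H)
        joint = torch.cat([meal, read], dim=-1)                       # (B, 2H)
        return self.rating_head(joint).squeeze(-1), self.comment_head(joint)


if __name__ == "__main__":
    model = DietaryAssessmentSketch(vocab_size=1000)
    rating, comment_logits = model(
        torch.randn(4, 2048),             # pooled image features
        torch.randint(0, 1000, (4, 12)),  # description token ids
        torch.randn(4, 16),               # user profile features
    )
    print(rating.shape, comment_logits.shape)  # torch.Size([4]) torch.Size([4, 1000])
```

In this sketch the memory read is a single soft-attention step over fixed slots, in the spirit of end-to-end memory networks; the paper's architecture may write user-specific meal histories into memory and decode full comment sequences rather than single-step token logits.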

Published

2018-04-26

How to Cite

Liu, Q., Zhang, Y., Liu, Z., Yuan, Y., Cheng, L., & Zimmermann, R. (2018). Multi-Modal Multi-Task Learning for Automatic Dietary Assessment. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11848

Issue

Vol. 32 No. 1 (2018)

Section

Main Track: Machine Learning Applications