MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification

Te-Lin Wu; Shikhar Singh; Sayan Paul; Gully Burns; Nanyun Peng

doi:10.1609/aaai.v35i16.17657

Authors

Te-Lin Wu, University of California, Los Angeles (UCLA)
Shikhar Singh, University of Southern California
Sayan Paul, Intuit
Gully Burns, Chan Zuckerberg Initiative
Nanyun Peng, University of California, Los Angeles (UCLA)

DOI:

https://doi.org/10.1609/aaai.v35i16.17657

Keywords:

Language Grounding & Multi-modal NLP, Bioinformatics, Biology & Cell microscopy

Abstract

We introduce MELINDA, a new dataset for Multimodal biomEdicaL experImeNt methoD clAssification. The dataset is collected in a fully automated, distantly supervised manner: labels are obtained from an existing curated database, and the actual contents are extracted from the papers associated with each record in that database. We benchmark various state-of-the-art NLP and computer vision models, including unimodal models that take only caption text or only images as input, as well as multimodal models. Extensive experiments and analysis show that multimodal models, despite outperforming unimodal ones, still need improvement, especially in grounding visual concepts in language with less supervision and in transferring to low-resource domains. We release our dataset and benchmarks to facilitate future research in multimodal learning, and in particular to motivate targeted improvements for applications in scientific domains.
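
The distant supervision idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the record types, field names, and the `build_examples` helper are all hypothetical, and the sketch only assumes that each curated record can be matched to an extracted figure through a shared paper and figure identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuratedRecord:
    """Hypothetical row from a curated database: it carries the label."""
    paper_id: str
    figure_id: str
    method_label: str


@dataclass(frozen=True)
class ExtractedFigure:
    """Hypothetical figure extracted from a paper: it carries the contents."""
    paper_id: str
    figure_id: str
    caption: str
    image_path: str


def build_examples(records, figures):
    """Pair each curated label with the caption and image it refers to.

    No manual annotation happens here: the label comes from the database
    record, the inputs come from the paper. Records whose figure was not
    extracted are skipped.
    """
    # Index extracted figures by (paper, figure) for constant-time lookup.
    index = {(f.paper_id, f.figure_id): f for f in figures}
    examples = []
    for record in records:
        figure = index.get((record.paper_id, record.figure_id))
        if figure is None:
            continue  # No extracted content for this record.
        examples.append(
            {
                "caption": figure.caption,
                "image_path": figure.image_path,
                "label": record.method_label,
            }
        )
    return examples
```

Each resulting example holds a caption, an image path, and a label, so the same records can feed a text-only, an image-only, or a multimodal classifier.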
