Knowledge Enhanced Representation Learning for Drug Discovery

Authors

  • Thanh Lam Hoang IBM Research, Dublin, Ireland
  • Marco Luca Sbodio IBM Research, Dublin, Ireland
  • Marcos Martinez Galindo IBM Research, Dublin, Ireland
  • Mykhaylo Zayats IBM Research, Dublin, Ireland
  • Raul Fernandez-Diaz IBM Research, Dublin, Ireland; University College Dublin, Ireland
  • Victor Valls IBM Research, Dublin, Ireland
  • Gabriele Picco IBM Research, Dublin, Ireland
  • Cesar Berrospi IBM Research, Zurich, Switzerland
  • Vanessa Lopez IBM Research, Dublin, Ireland

DOI:

https://doi.org/10.1609/aaai.v38i9.28924

Keywords:

KRR: Other Foundations of Knowledge Representation & Reasoning, ML: Multimodal Learning

Abstract

Recent research on predicting the binding affinity between drug molecules and proteins uses representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about existing relations among molecules and proteins. Our study reveals that enhanced representations, derived from multimodal knowledge graphs describing relations among molecules and proteins, lead to state-of-the-art results in well-established benchmarks (first place in the leaderboard for the Therapeutics Data Commons benchmark "Drug-Target Interaction Domain Generalization Benchmark", with an improvement of 8 points with respect to the previous best result). Moreover, our results significantly surpass those achieved in standard benchmarks by using conventional pre-trained representations that rely only on sequence or SMILES data. We release our multimodal knowledge graphs, which integrate data from seven public data sources and contain over 30 million triples. Pretrained models from our proposed graphs and benchmark task source code are also released.

Published

2024-03-24

How to Cite

Hoang, T. L., Sbodio, M. L., Martinez Galindo, M., Zayats, M., Fernandez-Diaz, R., Valls, V., Picco, G., Berrospi, C., & Lopez, V. (2024). Knowledge Enhanced Representation Learning for Drug Discovery. Proceedings of the AAAI Conference on Artificial Intelligence, 38(9), 10544-10552. https://doi.org/10.1609/aaai.v38i9.28924

Section

AAAI Technical Track on Knowledge Representation and Reasoning