Tensorized Attention for Understanding Multi-Object Relationships

Authors

  • Makoto Nakatsuji, NTT Human Informatics Laboratories
  • Yasuhiro Fujiwara, NTT Communication Science Laboratories
  • Atsushi Otsuka, NTT Human Informatics Laboratories
  • Narichika Nomoto, NTT Human Informatics Laboratories
  • Yoshihide Sato, NTT Human Informatics Laboratories

DOI:

https://doi.org/10.1609/aaai.v39i23.34675

Abstract

Attention mechanisms have played a crucial role in the success of Transformer models, as seen in systems like ChatGPT. However, because they compute attention from relationships between only one or two object types, they fail to effectively capture multi-object relationships in real-world scenarios, resulting in low prediction accuracy. In particular, they cannot calculate attention weights among diverse object types, such as the 'comments', 'replies', and 'subjects' that naturally constitute conversations on platforms like Reddit or X and whose relationships are observed simultaneously in real-world contexts. To overcome this limitation, we introduce the Tensorized Attention Model (TAM), which uses the Tucker decomposition to calculate attention weights across various object types and seamlessly integrates them into Transformer models. Evaluations show that TAM significantly outperforms existing encoder methods, and its integration into the LoRA adapter for Llama2 improves fine-tuning accuracy.
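To illustrate the idea behind the abstract, the following is a minimal NumPy sketch of Tucker-style attention over three object types. It is not the authors' TAM implementation; all names, dimensions, and the core rank are assumptions, and the three-way score tensor is simply a core tensor contracted with per-type projections of the object embeddings, followed by a softmax over all triples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings for three object types (sizes are assumptions):
# comments, replies, and subjects, all in a shared d-dim space.
n_c, n_r, n_s, d = 4, 5, 3, 8
C = rng.standard_normal((n_c, d))
R = rng.standard_normal((n_r, d))
S = rng.standard_normal((n_s, d))

# Tucker factors: one projection per object type plus a small core tensor.
k = 6  # core rank (hypothetical)
W_c = rng.standard_normal((d, k))
W_r = rng.standard_normal((d, k))
W_s = rng.standard_normal((d, k))
G = rng.standard_normal((k, k, k))

# Three-way attention scores: contract the core with the projected objects.
# scores[i, j, l] scores the (comment i, reply j, subject l) triple jointly,
# which pairwise query-key attention cannot express.
scores = np.einsum('pqr,ip,jq,lr->ijl', G, C @ W_c, R @ W_r, S @ W_s)

# Softmax over all triples yields a joint attention distribution.
w = np.exp(scores - scores.max())
attn = w / w.sum()
print(attn.shape)  # (4, 5, 3)
```

The resulting tensor assigns one weight to every (comment, reply, subject) triple, so the weights sum to 1 over all triples rather than over a single key axis.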

Published

2025-04-11

How to Cite

Nakatsuji, M., Fujiwara, Y., Otsuka, A., Nomoto, N., & Sato, Y. (2025). Tensorized Attention for Understanding Multi-Object Relationships. Proceedings of the AAAI Conference on Artificial Intelligence, 39(23), 24921–24929. https://doi.org/10.1609/aaai.v39i23.34675

Section

AAAI Technical Track on Natural Language Processing II