BraSTORM: A Dual-Branch Self-Supervised Framework for EEG Representation Learning via Input-Level Spatio-Temporal Decomposition

Authors

  • Yifan Wang, College of Computer Science and Technology, Zhejiang University; ZJU-UIUC Institute, Zhejiang University
  • Der-Horng Lee, ZJU-UIUC Institute, Zhejiang University
  • Bruce X.B. Yu, ZJU-UIUC Institute, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v40i21.38838

Abstract

Prevalent pre-training strategies for Brain-Computer Interfaces (BCIs) are often constrained by spatio-temporal entanglement. This critical issue arises from processing multi-channel Electroencephalography (EEG) signals as monolithic sequences, which intertwines the signal's temporal dynamics with its spatial topography and hinders the learning of robust and generalizable representations. To address this, we introduce BraSTORM, a framework that explicitly disentangles EEG data into separate temporal and spatial streams at the input level. The two streams are processed by parallel encoders trained with a composite dual objective: a masked signal reconstruction loss captures fine-grained, intra-modal details, while a cross-modal contrastive loss enforces high-level semantic alignment. Extensive fine-tuning experiments on six benchmarks covering three major BCI downstream tasks—Emotion Recognition, Sleep Staging, and Motor Imagery—demonstrate that BraSTORM achieves state-of-the-art performance. Our findings validate that resolving spatio-temporal entanglement at the input level yields a competitive pre-training framework for the BCI field.
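The composite dual objective described above can be illustrated with a minimal numpy sketch: a masked MSE reconstruction term plus a symmetric InfoNCE-style contrastive term aligning temporal and spatial embeddings. All function names, the temperature `tau`, and the weighting `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def masked_reconstruction_loss(signal, recon, mask):
    """MSE computed only at masked positions (fine-grained, intra-modal detail)."""
    diff = (signal - recon) ** 2
    return diff[mask].mean()

def cross_modal_contrastive_loss(z_t, z_s, tau=0.1):
    """Symmetric InfoNCE aligning temporal (z_t) and spatial (z_s) embeddings.

    Positive pairs sit on the diagonal of the (B, B) similarity matrix.
    """
    z_t = z_t / np.linalg.norm(z_t, axis=1, keepdims=True)
    z_s = z_s / np.linalg.norm(z_s, axis=1, keepdims=True)
    logits = z_t @ z_s.T / tau
    idx = np.arange(len(z_t))

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()    # diagonal = matched pairs

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

def composite_loss(signal, recon, mask, z_t, z_s, lam=1.0):
    """Weighted sum of the two pre-training objectives (lam is hypothetical)."""
    return (masked_reconstruction_loss(signal, recon, mask)
            + lam * cross_modal_contrastive_loss(z_t, z_s))
```

Perfectly aligned embedding pairs drive the contrastive term toward zero, while a perfect reconstruction zeroes the masked term, so the composite loss rewards both detailed signal recovery and cross-stream semantic agreement.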

Published

2026-03-14

How to Cite

Wang, Y., Lee, D.-H., & Yu, B. X. (2026). BraSTORM: A Dual-Branch Self-Supervised Framework for EEG Representation Learning via Input-Level Spatio-Temporal Decomposition. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17805-17813. https://doi.org/10.1609/aaai.v40i21.38838

Section

AAAI Technical Track on Humans and AI