BraSTORM: A Dual-Branch Self-Supervised Framework for EEG Representation Learning via Input-Level Spatio-Temporal Decomposition

Authors

  • Yifan Wang, College of Computer Science and Technology, Zhejiang University; ZJU-UIUC Institute, Zhejiang University
  • Der-Horng Lee, ZJU-UIUC Institute, Zhejiang University
  • Bruce X.B. Yu, ZJU-UIUC Institute, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v40i21.38838

Abstract

Prevalent pre-training strategies for Brain-Computer Interfaces (BCIs) are often constrained by spatio-temporal entanglement. This critical issue arises from processing multi-channel Electroencephalography (EEG) signals as monolithic sequences, which intertwines the signal's temporal dynamics with its spatial topography and hinders the learning of robust and generalizable representations. To address this, we introduce BraSTORM, a framework that explicitly disentangles EEG data into separate temporal and spatial streams at the input level. The two streams are processed by parallel encoders trained with a composite dual objective: a masked signal reconstruction loss captures fine-grained, intra-modal details, while a cross-modal contrastive loss enforces high-level semantic alignment. Extensive fine-tuning experiments on six benchmarks covering three major BCI downstream tasks—Emotion Recognition, Sleep Staging, and Motor Imagery—demonstrate that BraSTORM achieves state-of-the-art performance. Our findings validate that resolving spatio-temporal entanglement at the input level yields a competitive pre-training framework for the BCI field.
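The composite dual objective described above can be illustrated with a minimal numpy sketch: a masked MSE reconstruction term plus a symmetric InfoNCE-style contrastive term aligning temporal and spatial embeddings. All function names, the temperature `tau`, and the weighting `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def masked_reconstruction_loss(signal, recon, mask):
    """MSE computed only at masked positions (fine-grained, intra-modal detail)."""
    diff = (signal - recon) ** 2
    return diff[mask].mean()

def cross_modal_contrastive_loss(z_t, z_s, tau=0.1):
    """Symmetric InfoNCE aligning temporal (z_t) and spatial (z_s) embeddings.

    Positive pairs sit on the diagonal of the (B, B) similarity matrix.
    """
    z_t = z_t / np.linalg.norm(z_t, axis=1, keepdims=True)
    z_s = z_s / np.linalg.norm(z_s, axis=1, keepdims=True)
    logits = z_t @ z_s.T / tau
    idx = np.arange(len(z_t))

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()    # diagonal = matched pairs

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

def composite_loss(signal, recon, mask, z_t, z_s, lam=1.0):
    """Weighted sum of the two pre-training objectives (lam is hypothetical)."""
    return (masked_reconstruction_loss(signal, recon, mask)
            + lam * cross_modal_contrastive_loss(z_t, z_s))
```

Perfectly aligned embedding pairs drive the contrastive term toward zero, while a perfect reconstruction zeroes the masked term, so the composite loss rewards both detailed signal recovery and cross-stream semantic agreement.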

Published

2026-03-14

How to Cite

Wang, Y., Lee, D.-H., & Yu, B. X. (2026). BraSTORM: A Dual-Branch Self-Supervised Framework for EEG Representation Learning via Input-Level Spatio-Temporal Decomposition. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17805-17813. https://doi.org/10.1609/aaai.v40i21.38838

Section

AAAI Technical Track on Humans and AI