CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information
DOI:
https://doi.org/10.1609/aaai.v39i13.33587
Abstract
Electroencephalogram (EEG) signals have attracted significant attention from researchers due to their non-invasive nature and high temporal sensitivity in decoding visual stimuli. However, most recent studies have focused solely on the relationship between EEG and image data pairs, neglecting the valuable "beyond-image-modality" information embedded in EEG signals. This results in the loss of critical multimodal information in EEG. To address this limitation, this paper proposes CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains a modality expert encoder for each modality to extract cross-modal information from the EEG modality. It then introduces a diffusion prior that maps the EEG embedding space to the CLIP embedding space; using a pretrained generative model, the proposed framework can then reconstruct visual stimuli with high semantic and structural fidelity. Notably, the framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively.
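The pipeline sketched in the abstract — per-modality expert encoders feeding a diffusion prior that maps EEG embeddings into the CLIP space, whose output conditions a frozen generative model — can be illustrated at a high level. The following is a minimal NumPy sketch of the data flow only; all dimensions, variable names, and the linear stand-ins for the trained encoders and the diffusion prior are hypothetical and do not reflect the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened EEG trial, shared embedding, CLIP embedding.
EEG_DIM, EMB_DIM, CLIP_DIM = 63 * 250, 512, 768

# Stand-ins for the trained modality expert encoders (e.g. image, text):
# each extracts one modality's information from the same EEG input.
expert_encoders = {
    "image": rng.standard_normal((EEG_DIM, EMB_DIM)) * 0.01,
    "text": rng.standard_normal((EEG_DIM, EMB_DIM)) * 0.01,
}

# Stand-in for the diffusion prior: here just a linear map from the
# EEG embedding space to the CLIP embedding space.
prior = rng.standard_normal((EMB_DIM, CLIP_DIM)) * 0.01

def decode(eeg: np.ndarray) -> np.ndarray:
    """Fuse per-modality EEG embeddings and map them into CLIP space."""
    embeddings = [eeg @ W for W in expert_encoders.values()]
    fused = np.mean(embeddings, axis=0)  # simple fusion stand-in
    clip_embedding = fused @ prior       # "diffusion prior" stand-in
    # In the real framework, this CLIP-space embedding conditions a
    # frozen pretrained generative model that reconstructs the stimulus.
    return clip_embedding

eeg_trial = rng.standard_normal(EEG_DIM)
print(decode(eeg_trial).shape)  # (768,)
```

Because the generative model is conditioned only through the CLIP embedding space, it stays frozen, and supporting an additional modality amounts to training one more expert encoder.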
Published
2025-04-11
How to Cite
Zhang, K., He, L., Jiang, X., Lu, W., Wang, D., & Gao, X. (2025). CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information. Proceedings of the AAAI Conference on Artificial Intelligence, 39(13), 14486-14493. https://doi.org/10.1609/aaai.v39i13.33587
Issue
Section
AAAI Technical Track on Humans and AI