BrainLMM: A Label-Free Framework for Mapping Multi-Semantic Representation in the Human Visual Cortex

Authors

  • Tan Gao School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  • Mufan Xue School of Interdisciplinary Science, Beijing Institute of Technology, Beijing, China
  • Haofang Zheng School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China
  • Shuo Lv School of Medical Technology, Beijing Institute of Technology, Beijing, China
  • Jia Xu School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  • Dabin Sheng School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  • Ziming Mao School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  • Xinyu Wu School of Medical Technology, Beijing Institute of Technology, Beijing, China
  • Andrew Luo Institute of Data Science, University of Hong Kong, Hong Kong, China
  • Guoyuan Yang School of Interdisciplinary Science, Beijing Institute of Technology, Beijing, China School of Medical Technology, Beijing Institute of Technology, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v40i6.42413

Abstract

Previous studies have leveraged artificial neural networks to investigate semantic coding in the human visual cortex. However, building an interpretable, label-free framework that can effectively map brain responses to multiple coexisting semantic concepts remains largely unexplored. Here, we propose BrainLMM, a label-free framework for multi-semantic mapping of voxel responses that combines diverse vision encoders with the Describe-and-Dissect strategy, enabling a hypothesis-free analysis of the human high-level visual cortex. First, we construct voxel-wise encoding models leveraging diverse vision encoders to predict visual cortical responses to natural scene images. Then, we use BrainLMM to map individual brain voxels to multiple semantics without requiring any predefined labels. To evaluate the effectiveness of our method, we compute Pearson correlation coefficients to compare the multi-semantic mappings produced by BrainLMM and CLIP-MSM with ground-truth voxel responses within selective cortical areas. Our findings indicate that BrainLMM achieves more accurate predictions of visual responses than CLIP-MSM. Finally, to demonstrate the multi-semantic mapping capability of our method, we project multiple representative semantic concepts onto the cortical surface for visualization. Our method enables the discovery of voxels that exhibit strong activation in response to previously undefined semantic concepts across two independent datasets: the Natural Scenes Dataset (NSD) and the Natural Object Dataset (NOD).
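The evaluation step described above — scoring voxel-wise encoding models by the Pearson correlation between predicted and measured voxel responses — can be sketched as follows. This is a minimal illustration with synthetic data; the function name, array shapes, and noise model are assumptions for demonstration, not the authors' code.

```python
import numpy as np

def voxelwise_pearson(pred, actual):
    """Per-voxel Pearson r between predicted and measured responses.

    pred, actual: arrays of shape (n_stimuli, n_voxels).
    Returns an array of shape (n_voxels,), one correlation per voxel.
    """
    pred_c = pred - pred.mean(axis=0)      # center each voxel's predictions
    act_c = actual - actual.mean(axis=0)   # center each voxel's responses
    num = (pred_c * act_c).sum(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (act_c ** 2).sum(axis=0))
    return num / denom

# Toy example: 100 stimuli, 5 voxels, predictions = responses + noise
rng = np.random.default_rng(0)
actual = rng.standard_normal((100, 5))
pred = actual + 0.5 * rng.standard_normal((100, 5))
r = voxelwise_pearson(pred, actual)
print(r.shape)  # one correlation score per voxel
```

In practice, such scores would be computed separately within each category-selective region of interest and compared across frameworks (e.g., BrainLMM vs. CLIP-MSM).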

Published

2026-03-14

How to Cite

Gao, T., Xue, M., Zheng, H., Lv, S., Xu, J., Sheng, D., … Yang, G. (2026). BrainLMM: A Label-Free Framework for Mapping Multi-Semantic Representation in the Human Visual Cortex. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4176–4184. https://doi.org/10.1609/aaai.v40i6.42413

Section

AAAI Technical Track on Computer Vision III