When Genes Speak: A Semantic-Guided Framework for Spatially Resolved Transcriptomics Data Clustering

Authors

  • Jiangkai Long, China University of Geosciences, Wuhan, China
  • Yanran Zhu, China University of Geosciences, Wuhan, China
  • Chang Tang, Huazhong University of Science and Technology, Wuhan, China
  • Kun Sun, China University of Geosciences, Wuhan, China
  • Yuanyuan Liu, China University of Geosciences, Wuhan, China
  • Xuesong Yan, China University of Geosciences, Wuhan, China

DOI:

https://doi.org/10.1609/aaai.v40i1.37047

Abstract

Spatial transcriptomics enables gene expression profiling with spatial context, offering unprecedented insights into the tissue microenvironment. However, most computational models treat genes as isolated numerical features, ignoring the rich biological semantics encoded in their symbols, which limits how deeply they can capture critical biological characteristics. To overcome this limitation, we present SemST, a semantic-guided deep learning framework for spatial transcriptomics data clustering. SemST leverages Large Language Models (LLMs) to enable genes to "speak" through their symbolic meanings, transforming the gene set within each tissue spot into a biologically informed embedding. These embeddings are then fused with the spatial neighborhood relationships captured by Graph Neural Networks (GNNs), achieving a coherent integration of biological function and spatial structure. We further introduce the Fine-grained Semantic Modulation (FSM) module to better exploit these biological priors. The FSM module learns spot-specific affine transformations that let the semantic embeddings perform an element-wise calibration of the spatial features, thus dynamically injecting high-order biological knowledge into the spatial context. Extensive experiments on public spatial transcriptomics datasets show that SemST achieves state-of-the-art clustering performance. Crucially, the FSM module exhibits plug-and-play versatility, consistently improving performance when integrated into other baseline methods.
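
The abstract describes the FSM module only at a high level: spot-specific affine transformations, derived from semantic embeddings, that calibrate spatial features element by element. One way to read this is as a FiLM-style scale-and-shift, sketched below in NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function name `semantic_modulation`, the single linear projections, the array shapes, and the random inputs standing in for LLM and GNN outputs are all hypothetical.

```python
import numpy as np


def semantic_modulation(spatial, semantic, w_gamma, b_gamma, w_beta, b_beta):
    """Apply a spot-specific affine transform to spatial features.

    Hypothetical FiLM-style sketch (not the paper's code):
      spatial  : (n_spots, d_spatial) features, e.g. from a GNN encoder
      semantic : (n_spots, d_sem) embeddings, e.g. from an LLM
    Each spot gets its own scale (gamma) and shift (beta), both predicted
    from that spot's semantic embedding by a linear projection.
    """
    gamma = semantic @ w_gamma + b_gamma  # per-spot, per-feature scale
    beta = semantic @ w_beta + b_beta     # per-spot, per-feature shift
    return gamma * spatial + beta         # element-wise calibration


# Assumed toy dimensions; random values stand in for real embeddings.
rng = np.random.default_rng(0)
n_spots, d_sem, d_spatial = 5, 8, 4
spatial = rng.normal(size=(n_spots, d_spatial))
semantic = rng.normal(size=(n_spots, d_sem))

# Projection parameters would normally be learned; here they are random.
w_gamma = rng.normal(scale=0.1, size=(d_sem, d_spatial))
w_beta = rng.normal(scale=0.1, size=(d_sem, d_spatial))
b_gamma = np.ones(d_spatial)   # scale starts near 1 (near-identity)
b_beta = np.zeros(d_spatial)   # shift starts near 0

modulated = semantic_modulation(
    spatial, semantic, w_gamma, b_gamma, w_beta, b_beta
)

# With zero projection weights the transform reduces to the identity,
# which is one reason such a layer can be dropped into an existing model.
zeros = np.zeros((d_sem, d_spatial))
unchanged = semantic_modulation(spatial, semantic, zeros, b_gamma, zeros, b_beta)
```

Because the scale and shift are computed per spot and per feature, two spots with identical spatial features but different gene semantics end up with different modulated representations, which is the behavior the abstract attributes to FSM.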

Published

2026-03-14

How to Cite

Long, J., Zhu, Y., Tang, C., Sun, K., Liu, Y., & Yan, X. (2026). When Genes Speak: A Semantic-Guided Framework for Spatially Resolved Transcriptomics Data Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 800–808. https://doi.org/10.1609/aaai.v40i1.37047

Section

AAAI Technical Track on Application Domains I