Spatial transcriptomics (ST) technologies have transformed how we map the organization and function of complex tissues. However, existing methods often focus too narrowly on specific data characteristics, which hinders the comprehensive capture of multi-layered biological heterogeneity. STMSC is proposed as a multi-slice joint analysis framework with a pre-correction mechanism that enables precise identification of complex spatial domains and advances insight into disease pathology. STMSC posits that precise three-dimensional (3D) reconstruction is a prerequisite for in-depth investigation of tissue components and mechanisms. By incorporating hematoxylin and eosin (H&E) imaging data, STMSC improves slice alignment accuracy in 3D reconstruction. By deconstructing microenvironments, it reconstructs fine-grained cellular landscapes and emphasizes collective cellular behavior in defining spatial domains. Its graph attention autoencoder with pre-correction balances biological information at different levels, improving the accuracy and interpretability of spatial transcriptomic analyses. Applied to consecutive tissue slices and pathological datasets, STMSC accurately reconstructs 3D structures and improves interpretability in complex cancer environments. Specifically, STMSC captures intra- and inter-stage heterogeneity in cancer development, offering insight into the complexity of pathological tissue structures.
Figure 1. Overview of STMSC. ① Slice alignment: iterative closest point (ICP) is used to align multiple slices, establishing the three-dimensional positions of spots and constructing a global 3D structure. ② Microenvironmental deconstruction: a cell-to-spot mapping matrix is trained from spatial transcriptomics gene expression data and single-cell data via a spatially informed contrastive learning model. In this model, the similarity of positive pairs (i.e., spatially adjacent spots) is maximized, while the similarity of negative pairs (i.e., spatially non-adjacent spots) is minimized. ③ Construction and correction of the 3D neighborhood graph. ④ Joint modeling: STMSC jointly models multiple slices and employs a graph attention autoencoder to learn latent spot representations that carry 3D spatial information. ⑤ Downstream analysis: the learned latent representations serve as inputs for downstream tasks, including 3D spatial domain identification and spatial trajectory inference.
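To make step ① more concrete, here is a minimal sketch of 2D ICP between two slices, followed by stacking the aligned slices into 3D spot coordinates. It is an illustration under stated assumptions, not the STMSC implementation: it uses spot coordinates only (no H&E image features), brute-force nearest-neighbor matching, and a rigid rotation plus translation. The function names (`icp_2d`, `stack_3d`, and the helpers) and the `spacing` parameter are hypothetical.

```python
import math

def nearest(p, pts):
    """Return the point in pts closest to p (brute-force search)."""
    return min(pts, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src -> dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy   # centered source point
        bx, by = qx - dx, qy - dy   # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # closed-form optimal 2D rotation
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)     # translation maps rotated source centroid
    ty = dy - (s * sx + c * sy)     # onto the target centroid
    return theta, tx, ty

def apply_rigid(pts, theta, tx, ty):
    """Rotate points by theta (radians) about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def icp_2d(moving, fixed, iters=30, tol=1e-10):
    """Align `moving` spots onto `fixed` spots; returns the moved coordinates."""
    cur = list(moving)
    prev_err = float("inf")
    for _ in range(iters):
        matched = [nearest(p, fixed) for p in cur]        # correspondence step
        theta, tx, ty = best_rigid_2d(cur, matched)       # transform step
        cur = apply_rigid(cur, theta, tx, ty)
        err = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                  for p, q in zip(cur, matched)) / len(cur)
        if abs(prev_err - err) < tol:                     # stop once error stalls
            break
        prev_err = err
    return cur

def stack_3d(slices, spacing=1.0):
    """Assumption: slice i sits at z = i * spacing (hypothetical convention)."""
    return [[(x, y, i * spacing) for x, y in pts] for i, pts in enumerate(slices)]
```

A typical use would align each slice to its predecessor with `icp_2d` and then call `stack_3d` on the aligned list. Like any ICP variant, this sketch only converges to the correct pose when the slices start reasonably close to each other.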
The spatial transcriptomics and scRNA-seq datasets used in this study are publicly accessible. The DLPFC dataset can be obtained from http://research.libd.org/spatialLIBD/ [32]. The mouse coronal brain dataset from the 10x Visium platform is available at https://squidpy.readthedocs.io/en/latest/notebooks/tutorials/tutorial_visium_hne.html [39], while the human breast cancer dataset from the 10x Visium platform can be accessed at https://support.10xgenomics.com/spatial-gene-expression/datasets/1.1.0. For the scRNA-seq data, the human dorsolateral prefrontal cortex dataset, generated on the 10x Genomics Chromium platform, is available under accession number GSE144136 [70]. The mouse brain dataset, also generated on the 10x Genomics Chromium platform, can be found under E-MTAB-11115 [52]. Additionally, the single-cell transcriptomic data for human breast cancer is accessible at https://singlecell.broadinstitute.org/single_cell/study/SCP1039 [71].
If you use STMSC, please cite:
· Daijun Zhang, Ren Qi, Xun Lan, Bin Liu. STMSC: A Novel Multi-Slice Framework for Precision 3D Spatial Domain Reconstruction and Disease Pathology Analysis (Submitted)