The datasets used to develop and evaluate DECANT were curated from single-cell chemical perturbation resources and external mechanism-annotation databases. The primary perturbation dataset was Sci-Plex 3, which contains single-cell transcriptomes annotated with compound identity, cell line, dose, treatment time and control status. After feature selection, condition filtering, control matching and minimum cell-count filtering, the final DECANT modeling dataset contained 2,438 matched perturbation bags from 188 drugs across the A549, K562 and MCF7 cell lines. In the matched-bag formulation, each perturbation condition is represented by its treated cells together with matched control cells; this is the input format that DECANT inference requires.
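The matched-bag formulation described above can be illustrated with a small, dependency-free sketch. It is not the DECANT preprocessing code: the metadata keys (`cell_id`, `drug`, `cell_line`, `dose`, `time`, `is_control`), the rule that controls are matched on cell line and treatment time, and the `min_cells` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def build_matched_bags(cells, min_cells=2):
    """Group cells into treated/control bags, one bag per perturbation condition.

    `cells` is a list of dicts with hypothetical keys:
    cell_id, drug, cell_line, dose, time, is_control.
    Controls are matched on (cell_line, time), which is an assumption.
    """
    treated = defaultdict(list)
    controls = defaultdict(list)
    for c in cells:
        if c["is_control"]:
            controls[(c["cell_line"], c["time"])].append(c["cell_id"])
        else:
            key = (c["drug"], c["cell_line"], c["dose"], c["time"])
            treated[key].append(c["cell_id"])

    bags = []
    for (drug, cell_line, dose, time), treated_ids in sorted(treated.items()):
        control_ids = controls.get((cell_line, time), [])
        # Minimum cell-count filter: drop conditions with too few cells on either side.
        if len(treated_ids) < min_cells or len(control_ids) < min_cells:
            continue
        bags.append({
            "drug": drug,
            "cell_line": cell_line,
            "dose": dose,
            "time": time,
            "treated": treated_ids,
            "control": control_ids,
        })
    return bags


# Toy metadata: drugB has a single treated cell and is filtered out.
cells = [
    {"cell_id": "c1", "drug": "drugA", "cell_line": "A549", "dose": 10.0, "time": 24, "is_control": False},
    {"cell_id": "c2", "drug": "drugA", "cell_line": "A549", "dose": 10.0, "time": 24, "is_control": False},
    {"cell_id": "c3", "drug": "drugB", "cell_line": "K562", "dose": 10.0, "time": 24, "is_control": False},
    {"cell_id": "c4", "drug": "vehicle", "cell_line": "A549", "dose": 0.0, "time": 24, "is_control": True},
    {"cell_id": "c5", "drug": "vehicle", "cell_line": "A549", "dose": 0.0, "time": 24, "is_control": True},
    {"cell_id": "c6", "drug": "vehicle", "cell_line": "K562", "dose": 0.0, "time": 24, "is_control": True},
    {"cell_id": "c7", "drug": "vehicle", "cell_line": "K562", "dose": 0.0, "time": 24, "is_control": True},
]
bags = build_matched_bags(cells, min_cells=2)
```

In this toy input only the drugA/A549 condition survives, with treated cells `c1`, `c2` and control cells `c4`, `c5`. A real pipeline would apply the same grouping to the per-cell metadata table of the expression matrix rather than to a list of dicts.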
External LINCS/CMap transcriptional-response signatures were used to construct fixed teacher prototypes for the teacher-guided training variant of DECANT. These teacher prototypes served only as an optional refinement signal during training; they were not used for validation or test inference, nor for held-out-drug lookup. ChEMBL-derived target or mechanism-family annotations and MSigDB-derived consequence modules were reserved for post hoc biological evaluation and interpretation, and played no role in model fitting, model selection or mechanism-space shaping.
Sci-Plex 3 / single-cell chemical perturbation resource: https://www.science.org/doi/10.1126/science.aax6234
LINCS Project: https://lincsproject.org/
Connectivity Map / CLUE: https://clue.io/
ChEMBL: https://www.ebi.ac.uk/chembl/
MSigDB: https://www.gsea-msigdb.org/gsea/msigdb/
The DECANT framework is implemented in Python and uses PyTorch for model construction, checkpoint loading and neural-network inference. NumPy and pandas are used for array manipulation, metadata processing and output formatting. scikit-learn is used for preprocessing and program-level feature construction. AnnData/Scanpy-compatible input is supported for single-cell expression matrices when users upload perturbation data in .h5ad format.
The DECANT web server loads the pretrained DECANT checkpoint and exported prototype library to perform mechanism-oriented analysis of user-submitted matched treated-control perturbation profiles. Given treated and matched-control single-cell inputs with cell-line, dose and treatment-time metadata, the server computes the context-suppressed mechanism embedding, predicts gene-level and program-level perturbation responses, retrieves nearest DECANT drug prototypes, and summarizes downstream consequence programs.
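The prototype-retrieval step mentioned above can be sketched as a nearest-neighbour lookup in embedding space. This is only an illustration under stated assumptions: the source does not specify the similarity measure, so cosine similarity is assumed here, and the function names, the prototype names and the three-dimensional vectors are hypothetical stand-ins for the exported prototype library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_prototypes(embedding, prototypes, k=2):
    """Return the k prototypes most similar to `embedding`, best first.

    `prototypes` maps a drug name to its prototype vector (hypothetical layout).
    """
    scored = [(name, cosine(embedding, vec)) for name, vec in prototypes.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy prototype library and query embedding (made-up values).
prototypes = {
    "drug_A": [1.0, 0.0, 0.0],
    "drug_B": [0.0, 1.0, 0.0],
    "drug_C": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]
hits = nearest_prototypes(query, prototypes, k=2)
```

Here `hits` ranks `drug_A` first and `drug_C` second. With the NumPy stack the framework already uses, the same ranking could be computed as a single normalized matrix-vector product.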
PyTorch: https://pytorch.org/
NumPy: https://numpy.org/
pandas: https://pandas.pydata.org/
scikit-learn: https://scikit-learn.org/
Scanpy: https://scanpy.readthedocs.io/
AnnData: https://anndata.readthedocs.io/
[1] Srivatsan, S. R., McFaline-Figueroa, J. L., Ramani, V., Saunders, L., Cao, J., Packer, J., et al. "Massively multiplex chemical transcriptomics at single-cell resolution." Science 367 (2020): 45-51.
[2] Subramanian, A., Narayan, R., Corsello, S. M., Peck, D. D., Natoli, T. E., Lu, X., et al. "A next generation Connectivity Map: L1000 platform and the first 1,000,000 profiles." Cell 171 (2017): 1437-1452.e17.
[3] Mendez, D., Gaulton, A., Bento, A. P., Chambers, J., De Veij, M., Felix, E., et al. "ChEMBL: towards direct deposition of bioassay data." Nucleic Acids Research 47 (2019): D930-D940.
[4] Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., et al. "Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles." Proceedings of the National Academy of Sciences 102 (2005): 15545-15550.
[5] Wolf, F. A., Angerer, P., and Theis, F. J. "SCANPY: large-scale single-cell gene expression data analysis." Genome Biology 19 (2018): 15.
[6] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., et al. "PyTorch: an imperative style, high-performance deep learning library." Advances in Neural Information Processing Systems 32 (2019).
[7] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. "Scikit-learn: machine learning in Python." Journal of Machine Learning Research 12 (2011): 2825-2830.
Ren Qi, Wenjie Teng, Xin Yang, Yue Cheng, Bin Liu*
DECANT: Decoupling mechanism from context in single-cell drug perturbation representation.
Submitted.