1. Datasets
The datasets used in this study were curated from public single-cell RNA sequencing (scRNA-seq) repositories and pharmacogenomic databases. We integrated scRNA-seq drug-response cohorts from the NCBI Gene Expression Omnibus (GEO)[1], spanning diverse cancer types and therapeutic agents (e.g., GSE111014, GSE149214, GSE131984). Drug mechanism features were constructed from target annotations in the Genomics of Drug Sensitivity in Cancer (GDSC)[2] and from transcriptional perturbation signatures in LINCS L1000[3]. These resources can be accessed via the following links:
GEO Repository: https://www.ncbi.nlm.nih.gov/geo/
GDSC Portal: https://www.cancerrxgene.org/
LINCS L1000: https://lincsproject.org/
2. Tools
The scRADAR framework is implemented in Python and uses the scverse ecosystem for single-cell data handling. The pipeline integrates dedicated tools for pathway activity inference and downstream machine learning analysis: PROGENy[4] for signaling pathway activity estimation, GSVA[5] (using its ssGSEA method) for metabolic pathway enrichment, and scikit-learn[6] for model evaluation. Documentation for these libraries is available at the following links:
scverse (Scanpy/AnnData): https://scverse.org/
PROGENy: https://saezlab.github.io/progeny/
GSVA: https://www.bioconductor.org/packages/release/bioc/html/GSVA.html
scikit-learn: https://scikit-learn.org/
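As an illustration of the kind of model evaluation scikit-learn supports, the following is a minimal sketch and not the scRADAR evaluation protocol itself. It assumes a cells-by-pathways matrix of activity scores and binary drug-response labels; the data are synthetic, and the helper name `evaluate_response_classifier`, the logistic-regression baseline, the five-fold split, and the ROC AUC metric are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_response_classifier(features, labels, n_splits=5, seed=0):
    """Return per-fold ROC AUC for a baseline classifier (hypothetical helper)."""
    # Scaling lives inside the pipeline so it is refit on each training fold only.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    # Stratified folds keep the responder/non-responder ratio similar across folds.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(model, features, labels, cv=cv, scoring="roc_auc")


# Synthetic stand-in data (assumption): 200 cells, 10 pathway activity scores.
rng = np.random.default_rng(0)
n_cells, n_pathways = 200, 10
pathway_scores = rng.normal(size=(n_cells, n_pathways))
# The response label depends on the first pathway plus noise, so there is signal to learn.
response = (pathway_scores[:, 0] + 0.5 * rng.normal(size=n_cells) > 0).astype(int)

fold_auc = evaluate_response_classifier(pathway_scores, response)
print(f"ROC AUC per fold: {np.round(fold_auc, 3)}")
print(f"Mean ROC AUC: {fold_auc.mean():.3f}")
```

In practice the synthetic matrix would be replaced by pathway scores computed from real cells. When cells from the same sample or patient could land in both training and test folds, a group-aware splitter such as scikit-learn's `GroupKFold` may be the more appropriate choice.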
3. References
[1] Barrett, Tanya, et al. "NCBI GEO: archive for functional genomics data sets—update." Nucleic Acids Research 41.D1 (2012): D991-D995.
[2] Yang, Wanjuan, et al. "Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells." Nucleic Acids Research 41.D1 (2012): D955-D961.
[3] Subramanian, Aravind, et al. "A next generation connectivity map: L1000 platform and the first 1,000,000 profiles." Cell 171.6 (2017): 1437-1452.
[4] Schubert, Michael, et al. "Perturbation-response genes reveal signaling footprints in cancer gene expression." Nature Communications 9.1 (2018): 20.
[5] Hänzelmann, Sonja, Robert Castelo, and Justin Guinney. "GSVA: gene set variation analysis for microarray and RNA-seq data." BMC Bioinformatics 14.1 (2013): 1-15.
[6] Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of Machine Learning Research 12 (2011): 2825-2830.