PVADRP

Document


1. Datasets and Code

We constructed two independent datasets, one from GDSC1 and one from GDSC2. The GDSC1-based dataset comprises 116,966 response pairs (448 cell lines, 284 drugs) with an 8.0% missing-value rate. The GDSC2-based dataset comprises 91,536 response pairs (449 cell lines, 222 drugs) with an 8.1% missing-value rate. Our case studies are mainly based on drugs from GDSC2, which contains more recent drug response data. Both datasets can be downloaded from Hugging Face.
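As a quick sanity check on the figures above, the sketch below recomputes the missing-value rate from the pair counts and matrix dimensions. It assumes the rate is defined as one minus the fraction of observed cell line/drug pairs in the full cell line × drug matrix; that definition and the helper name `missing_rate` are illustrative assumptions, not taken from the project code.

```python
def missing_rate(n_pairs: int, n_cell_lines: int, n_drugs: int) -> float:
    """Fraction of cell line/drug combinations with no recorded response.

    Assumed definition: 1 - observed_pairs / (cell_lines * drugs).
    """
    total = n_cell_lines * n_drugs
    return 1.0 - n_pairs / total


# (observed pairs, cell lines, drugs) as stated in the text above.
datasets = {
    "GDSC1": (116_966, 448, 284),
    "GDSC2": (91_536, 449, 222),
}

for name, (pairs, cells, drugs) in datasets.items():
    print(f"{name}: {missing_rate(pairs, cells, drugs):.2%} missing")
```

Under this assumed definition the script gives roughly 8.07% for GDSC1 and 8.17% for GDSC2, close to the reported 8.0% and 8.1%.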

Source code and data: GitHub link


Cite

If you use this work, please cite:

Ren Qi, Shujia Liu, Tianhong Quan and Bin Liu*.
Drug Response Prediction Using core aware perturbation Augmentation and Virtual Omics with Attention Mechanisms