HGT-PepPI

Document


If you use HGT-PepPI for research, please cite this paper:

Ke Yan, Tianyi Liu, Shutao Chen, Meijing Li, and Bin Liu*. HGT-PepPI: A Heterogeneous Graph-Based Framework Leveraging Pragmatic Analysis for Protein-Peptide Interaction Prediction. (Submitted)


Download:


1. The source code of HGT-PepPI: HGT-PepPI.zip

2. The datasets used in HGT-PepPI:

The datasets used in this study are derived from the RCSB PDB[1]. Pairs showing over 80% similarity between the training dataset and the independent test datasets were removed using CD-HIT[2]. The processed datasets can be downloaded from the following links:

Train-Sequences.fasta   Test167.fasta   Test251.fasta   Test1440.fasta

3. Readme: readme.md
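
For readers who want to inspect the downloaded dataset files before running the tool, the following is a minimal sketch of a FASTA reader. It assumes the files follow the standard FASTA layout (a header line starting with ">" followed by one or more sequence lines); the exact header convention used in the HGT-PepPI files is not described here, so the function name `read_fasta` and the example records are illustrative only.

```python
from typing import Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted lines into (header, sequence) tuples.

    Assumes standard FASTA: a '>' header line followed by sequence lines.
    Multi-line sequences are joined; blank lines are ignored.
    """
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the previous record before starting a new one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Flush the final record.
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up records for illustration; these are not taken from the datasets.
example = ">example_1\nMKTAYIAK\nQRQISFVK\n>example_2\nGSHMLE\n"
for name, seq in read_fasta(example.splitlines()):
    print(name, len(seq))
```

To read one of the downloaded files, pass an open file handle instead of the example string, for instance `read_fasta(open("Test167.fasta"))`.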


Tool:


HGT-PepPI utilizes ProtT5[3] for peptide and protein feature extraction. To run HGT-PepPI locally, this tool must be properly configured. Detailed instructions for installation and configuration are provided in the following links:
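
Separately from the installation instructions, the feature-extraction step can be sketched as follows. This is a rough illustration rather than the HGT-PepPI pipeline itself: it assumes the Hugging Face `transformers` and `torch` packages and the publicly available `Rostlab/prot_t5_xl_half_uniref50-enc` checkpoint. Which checkpoint HGT-PepPI uses, and whether it works with per-residue or pooled embeddings, is not stated here, so the helper names `prepare_sequence` and `embed_sequences` are hypothetical.

```python
import re
from typing import List


def prepare_sequence(seq: str) -> str:
    """Format a raw amino-acid string for the ProtT5 tokenizer.

    ProtT5 expects upper-case residues separated by spaces, with the
    rare or ambiguous residues U, Z, O and B mapped to X.
    """
    seq = re.sub(r"\s+", "", seq).upper()
    seq = re.sub(r"[UZOB]", "X", seq)
    return " ".join(seq)


def embed_sequences(
    sequences: List[str],
    model_name: str = "Rostlab/prot_t5_xl_half_uniref50-enc",  # assumed checkpoint
    device: str = "cpu",
):
    """Return one per-residue embedding tensor (length x hidden size) per sequence."""
    # Imported lazily so the preprocessing helper works without these packages.
    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name, do_lower_case=False)
    model = T5EncoderModel.from_pretrained(model_name).to(device).eval()
    if device == "cpu":
        model = model.float()  # make sure weights are float32 on CPU

    prepared = [prepare_sequence(s) for s in sequences]
    batch = tokenizer(prepared, add_special_tokens=True,
                      padding="longest", return_tensors="pt")
    with torch.no_grad():
        hidden = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
        ).last_hidden_state

    # Keep only the residue positions, dropping padding and the end token.
    lengths = [len(re.sub(r"\s+", "", s)) for s in sequences]
    return [hidden[i, :n].cpu() for i, n in enumerate(lengths)]
```

Calling `embed_sequences(["MKTAYIAK"])` downloads the model weights on first use, which requires several gigabytes of disk space; `prepare_sequence` can be used on its own to check the input formatting.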

References


[1] Burley, S. K. et al. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res. 51, D488-D508 (2023).

[2] Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150-3152 (2012).

[3] Elnaggar, A. et al. ProtTrans: Toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112-7127 (2021).