KGIPA

Standalone Package Download


1. Download

You can obtain the KGIPA standalone package either by downloading the ZIP file or cloning the GitHub repository: https://github.com/ShutaoChen97/KGIPA

To clone the repository, run the following command:

git clone git@github.com:ShutaoChen97/KGIPA.git
cd KGIPA/

2. Installation

2.1 Create Conda Environment

conda create -n kgipa python=3.10
conda activate kgipa

2.2 Requirements

We recommend installing the environment using the provided environment.yaml file to ensure compatibility:

conda env update -f environment.yaml --prune

If this approach fails or Conda is not available, you can manually install the main dependencies as listed below:

python 3.10
biopython 1.84
huggingface-hub 0.26.1
numpy 2.1.2
transformers 4.46.0
tokenizers 0.20.1
sentencepiece 0.2.0
torch 2.5.0+cpu
torchaudio 2.5.0+cpu
torchvision 0.20.0+cpu
torch-geometric 2.6.1
shap 0.48.0
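
To confirm that a manually installed environment matches the list above, a quick check along these lines may help. This is a minimal sketch and not part of KGIPA: the helper name check_versions and the PINNED table are illustrative, with the pins copied from the list above. Local build tags such as "+cpu" are stripped before comparison, so CUDA builds of the torch packages also pass.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list above; torch packages are listed
# without the "+cpu" local suffix so that CUDA builds match as well.
PINNED = {
    "biopython": "1.84",
    "huggingface-hub": "0.26.1",
    "numpy": "2.1.2",
    "transformers": "4.46.0",
    "tokenizers": "0.20.1",
    "sentencepiece": "0.2.0",
    "torch": "2.5.0",
    "torchaudio": "2.5.0",
    "torchvision": "0.20.0",
    "torch-geometric": "2.6.1",
    "shap": "0.48.0",
}


def check_versions(pinned):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Ignore any local build tag such as "+cpu" or "+cu118".
        base = installed.split("+")[0] if installed else None
        report[name] = (expected, installed, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:18s} expected {expected:8s} found {installed} [{status}]")
```

A mismatch is not necessarily fatal, but the pinned versions are the ones the environment.yaml file is meant to reproduce.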

Note: If a GPU is available, you can install a CUDA-enabled build of PyTorch so that KGIPA runs with GPU acceleration. Change the URL below to match your CUDA toolkit version (cu118 for CUDA 11.6/11.8, cu121 for CUDA 12.1), and do not choose a version newer than your installed CUDA toolkit. For other CUDA versions, see the PyTorch installation documentation.

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
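
After installing a CUDA build, it can be worth confirming that PyTorch actually sees the GPU. The sketch below relies only on torch.cuda.is_available() and torch.version.cuda from PyTorch; the function name describe_torch_backend is illustrative and not part of KGIPA.

```python
def describe_torch_backend():
    """Summarize whether the installed PyTorch build can use a GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."

    # torch.version.cuda is None for builds without CUDA, such as 2.5.0+cpu.
    cuda = torch.version.cuda
    if cuda is None:
        return f"torch {torch.__version__}: build without CUDA support."
    if torch.cuda.is_available():
        return f"torch {torch.__version__}: GPU available (built for CUDA {cuda})."
    return f"torch {torch.__version__}: built for CUDA {cuda}, but no usable GPU was detected."


if __name__ == "__main__":
    print(describe_torch_backend())
```

If the last message appears, the usual suspects are a driver that is older than the CUDA version of the installed wheel, or a wheel tag that does not match the local toolkit.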

2.3 Tools

KGIPA relies on the following feature extraction tools and databases. For details on installation and usage, please refer to the KGIPA GitHub repository.

SCRATCH-1D 1.2
IUPred2A
ncbi-blast 2.13.0
ProtT5
trRosetta

Databases and model:

Database / Model        Description                                Download
nrdb90                  NCBI BLAST sequence database               Download
uniclust30_2018_08      HHsuite sequence database                  Download
model_res2net_202108    Pre-trained network models of trRosetta    Download

2.4 Install KGIPA

Finally, configure the default paths of the tools and databases in conf.py.
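
Before running the predictor, it may help to check that every configured location actually exists. The sketch below is only an illustration: the source does not list the variable names used in conf.py, so the labels and paths in EXAMPLE_PATHS are hypothetical placeholders to be replaced with the values you set there.

```python
from pathlib import Path

# Hypothetical labels and locations; substitute the paths you set in conf.py.
EXAMPLE_PATHS = {
    "SCRATCH-1D": "/opt/tools/SCRATCH-1D_1.2",
    "IUPred2A": "/opt/tools/iupred2a",
    "ncbi-blast": "/opt/tools/ncbi-blast-2.13.0+",
    "nrdb90": "/data/db/nrdb90",
    "uniclust30_2018_08": "/data/db/uniclust30_2018_08",
    "model_res2net_202108": "/data/models/model_res2net_202108",
}


def find_missing(paths):
    """Return the labels whose configured path does not exist on disk."""
    return [label for label, location in paths.items()
            if not Path(location).expanduser().exists()]


if __name__ == "__main__":
    missing = find_missing(EXAMPLE_PATHS)
    if missing:
        print("Missing or misconfigured:", ", ".join(missing))
    else:
        print("All configured paths exist.")
```

This only checks that the paths exist; it does not verify that a tool runs or that a database is complete.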


3. Usage

To predict peptide-protein binary interaction and peptide-protein-specific binding residues, follow these steps:

  1. Replace the default peptide sequence in example/Peptide_Seq.fasta and protein sequence in example/Protein_Seq.fasta with your own sequences (FASTA format).
  2. Run the predictor:
conda activate kgipa
python run_predictor.py -uip example
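
Step 1 can also be scripted. The following is a minimal sketch using only the Python standard library; the helper write_fasta, the record IDs, and the two sequences are illustrative placeholders. It assumes each input file holds a single record and that only the 20 standard amino acids are accepted, neither of which the source states explicitly.

```python
from pathlib import Path

# The 20 standard amino acids; a deliberately strict, assumed alphabet.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def write_fasta(path, record_id, sequence, width=60):
    """Validate a protein sequence and write it as a one-record FASTA file."""
    seq = "".join(sequence.split()).upper()
    unknown = set(seq) - AMINO_ACIDS
    if not seq or unknown:
        raise ValueError(f"invalid sequence for {record_id}: {sorted(unknown)}")
    # Wrap the sequence to a fixed line width, as is conventional for FASTA.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f">{record_id}\n" + "\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    # Placeholder sequences for illustration only; these overwrite the examples.
    write_fasta("example/Peptide_Seq.fasta", "my_peptide", "GSHMKLVRT")
    write_fasta("example/Protein_Seq.fasta", "my_protein", "MKTAYIAKQRQISFVKSHFSRQ")
```

Once both files are written, the predictor is run exactly as in step 2.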

If you want to retrain KGIPA on your own dataset, the original KGIPA model is defined in model.py. KGIPA is implemented in PyTorch, so the model can be imported from that file and instantiated in your own training script.


4. Problem Feedback

If you have questions about using KGIPA, feel free to raise them in the discussions section. If you identify any potential bugs, please report them in the issue tracker. For anything else, you can contact us directly at stchen@bliulab.net.

Cite

If you use KGIPA, please cite:

Shutao Chen, Ke Yan, Jiangyi Shao, Xiangxiang Zeng, and Bin Liu*.
Pragmatic analysis with knowledge-guided for unraveling peptide-protein pairwise non-covalent mechanisms. (Submitted)