IIDL-PepPI

Introduction


Protein complex structural data are growing at an unprecedented pace, but their complexity and diversity pose significant challenges for protein function research. Although deep learning models have been widely used to capture the syntactic structure, word semantics, or semantic meanings of polypeptide and protein sequences, these models often overlook the complex contextual information of sequences. Here, we propose IIDL-PepPI, a deep learning model designed to tackle these challenges by employing data-driven and interpretable pragmatic analysis to profile peptide-protein interactions (PepPIs). IIDL-PepPI constructs bidirectional attention modules to represent the contextual information of peptides and proteins, enabling pragmatic analysis. It then adopts a progressive transfer learning framework to both predict PepPIs and identify binding residues in specific interactions, providing a solution for multi-level in-depth profiling. We validate the performance and robustness of IIDL-PepPI against state-of-the-art methods in predicting peptide-protein binary interactions and identifying binding residues. We further demonstrate the capability of IIDL-PepPI in peptide virtual drug screening, binding affinity assessment, and complex residue flexibility prediction, which is expected to advance artificial intelligence-based peptide drug discovery and protein function elucidation.
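To make the idea of a bidirectional attention module concrete, the following is a minimal sketch, not the IIDL-PepPI implementation. It assumes plain scaled dot-product cross-attention with no learned projections, and the names `cross_attend` and `bi_attention` as well as the toy residue features are hypothetical. Each peptide residue attends over all protein residues, and each protein residue attends over all peptide residues.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attend(queries, keys_values):
    """Scaled dot-product attention: every query attends over keys_values.

    Assumption: keys and values are the same vectors (no learned projections).
    """
    scale = math.sqrt(len(queries[0]))
    weights = [softmax([dot(q, k) / scale for k in keys_values]) for q in queries]
    dim = len(keys_values[0])
    context = [
        [sum(w * kv[d] for w, kv in zip(row, keys_values)) for d in range(dim)]
        for row in weights
    ]
    return context, weights


def bi_attention(peptide, protein):
    """Attend in both directions: peptide -> protein and protein -> peptide."""
    pep_ctx, pep_to_prot = cross_attend(peptide, protein)
    prot_ctx, prot_to_pep = cross_attend(protein, peptide)
    return pep_ctx, prot_ctx, pep_to_prot, prot_to_pep


# Toy per-residue features (hypothetical): 2 peptide residues, 3 protein residues.
peptide = [[1.0, 0.0], [0.0, 1.0]]
protein = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
pep_ctx, prot_ctx, pep_to_prot, prot_to_pep = bi_attention(peptide, protein)
```

In this sketch, `pep_to_prot[i][j]` is the weight peptide residue `i` places on protein residue `j`. Weights of this kind are what make an attention-based model inspectable at the residue level.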

Figure 1. Data preparation workflow and network architecture of IIDL-PepPI. a Data preparation workflow of IIDL-PepPI, in which the public databases used include RCSB PDB, PDBe, and UniProt. b Network architecture of IIDL-PepPI for peptide-protein binary interaction prediction and binding residue recognition, including sequence representation, feature encoding, bi-attention module, and decoding. Based on the biological sequence pragmatic analysis, the bi-attention module explicitly integrates features from the peptide and protein sides to distinguish different peptide-protein-specific interactions. c The progressive transfer learning architecture. In the first stage, IIDL-PepPI is pre-trained on peptide-protein binary interaction prediction using sequence-level datasets, which provides coarse-grained learning of the basic network parameters. In the second stage, we transfer the parameters of the basic network, replace the decoder, and fine-tune the model on a residue-level dataset for precise prediction of peptide- and protein-binding residues in specific peptide-protein pairs.
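The two-stage scheme in panel c can be illustrated with a small sketch of the parameter hand-off. This is an assumption-laden illustration, not the authors' code: parameters are modeled as plain dictionaries, the block names `encoder`, `bi_attention`, and `decoder`, the helper `build_stage2_model`, and all numeric values are hypothetical.

```python
import copy

# Hypothetical names for the blocks of the basic network shared by both stages.
SHARED_BLOCKS = ("encoder", "bi_attention")


def build_stage2_model(stage1_params, new_decoder_params):
    """Copy the pre-trained shared blocks and attach a fresh decoder.

    The stage-1 decoder (pair-level output) is dropped; the stage-2 decoder
    (residue-level output) starts from its own initial parameters.
    """
    stage2 = {name: copy.deepcopy(stage1_params[name]) for name in SHARED_BLOCKS}
    stage2["decoder"] = copy.deepcopy(new_decoder_params)
    return stage2


# Stand-in for parameters obtained from stage-1 pre-training (made-up values).
stage1 = {
    "encoder": {"w": [0.12, -0.40]},
    "bi_attention": {"w": [0.33, 0.08]},
    "decoder": {"kind": "pair_level", "w": [0.9]},
}

# Freshly initialised decoder for residue-level prediction (made-up values).
residue_decoder = {"kind": "residue_level", "w": [0.0, 0.0]}

stage2 = build_stage2_model(stage1, residue_decoder)
```

The deep copy keeps the pre-trained stage-1 parameters untouched, so fine-tuning the stage-2 model on residue-level data does not alter the binary interaction predictor.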


References

If you use IIDL-PepPI, please cite the following:

Shutao Chen, Ke Yan, Xuelong Li, and Bin Liu*.
Protein language pragmatic analysis and progressive transfer learning for profiling peptide-protein interactions. (Submitted)




Introduction

In this study, we present IIDL-PepPI, a deep learning model that uses protein language pragmatic analysis and progressive transfer learning for peptide-protein binary interaction prediction and pair-specific binding residue identification.

NOTE

If you are interested in this research area or have any questions, please contact us; we will do our best to answer them. If you use our research results, please cite this article.

