PreHom-PCLM: Protein Remote Homology Detection by Combing Motifs and Protein Cubic Language Model



The training, validation and test datasets can be downloaded from the following links:

training.fa, validation.fa, test.fa
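
As a quick sanity check after downloading, the files can be parsed and their records counted. The following is a minimal sketch in Python, assuming the three files are in standard FASTA format (a `>` header line followed by one or more sequence lines). The helper names `read_fasta` and `count_records` are hypothetical and are not part of the PreHom-PCLM server.

```python
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file.

    Assumption: standard FASTA layout, where each record starts with a
    '>' header line and the sequence may span several lines.
    """
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


def count_records(path: str) -> int:
    """Return the number of sequences in a FASTA file."""
    return sum(1 for _ in read_fasta(path))
```

For example, `count_records("training.fa")` would return the number of proteins in the training set, assuming the file has been saved to the working directory under that name.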

For the independent test set, we extracted the proteins added to the SCOPe database between July 2020 and December 2021, and then reduced redundancy using MMseqs (0.95 sequence identity and an E-value of 10e-4) [1]. The independent test set contains 5990 proteins covering 839 superfamilies and can be downloaded from the following links:



1. Zhu, J., et al., Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts. Bioinformatics (Oxford, England), 2017. 33(23): p. 3749-3757.


Bin Liu
School of Computer Science and Technology, Beijing Institute of Technology, China.

Copyright © Liu Lab, Beijing Institute of Technology.

Website ICP registration number: 粤ICP备19041859号-1