Tutorial

--This tutorial describes each file in the repository:

Source

--The source code of the FoldRec-C2C predictor:

--Three re-ranking models:

seq-to-seq model (S2S): [seq-to-seq-V1] (a)
seq-to-cluster model (S2C): [seq-to-cluster-V1]
cluster-to-cluster model (C2C): [cluster-to-cluster-V1]
a: The S2S model follows Fold-LTR-TCP [1].

Features

--The benchmark dataset:

a: This dataset was proposed by Lindahl E and Elofsson A [2].

--The extracted features

a: HHSearch is contained in HHsuite [4];

b: Top-1-gram feature is generated by Pse-in-one2.0 [3];

c: Top-2-grams feature is generated by Pse-in-one2.0 [3];

d: DeepFR comes from the study in [5];

e: 84-features comes from the study in [6].

--The features for LTR training

a: Feature processing follows Fold-LTR-TCP [1].
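RankLib reads training data in the LETOR text format, one line per query-template pair: a relevance label, a query id, then 1-based `index:value` features and an optional comment. The sketch below shows how such a line could be written. The function name, the feature values, and the template name are hypothetical, and the actual feature layout used by FoldRec-C2C is the one described in Fold-LTR-TCP [1].

```python
from typing import Sequence


def to_letor_line(label: int, query_id: int,
                  features: Sequence[float], comment: str = "") -> str:
    """Format one query-template pair as a LETOR-style line.

    label    : relevance label (e.g. 1 if query and template share a fold, else 0)
    query_id : integer id that groups all templates ranked for the same query
    features : feature values; indices in the output start at 1
    comment  : optional free text (hypothetical use: the template name)
    """
    feats = " ".join(f"{i}:{v:.6f}" for i, v in enumerate(features, start=1))
    line = f"{label} qid:{query_id} {feats}"
    if comment:
        line += f" # {comment}"
    return line


# Hypothetical example: two templates scored against query 1.
example_lines = [
    to_letor_line(1, 1, [0.5, 0.25], "templateA"),
    to_letor_line(0, 1, [0.125, 0.75], "templateB"),
]
```

All lines that share a `qid` are ranked together, so every template scored against the same query protein should carry the same query id.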

--The trained Learning to Rank model

a: Learning to Rank (LTR) is an information retrieval approach; see Burges CJ [7].

--The software dependencies of FoldRec-C2C

a: HHsuite is an open-source software package that contains HHSearch and HHblits [4];

b: Pse-in-one2.0 was proposed by Liu B, Wu H, Chou K-C [3];

c: RankLib is a library of learning to rank algorithms [7].
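As a rough illustration of how RankLib is usually driven from the command line, the sketch below builds the argument lists for training a model and for re-ranking with a saved model. The jar path, the file names, the choice of ranker 6 (LambdaMART in RankLib) and the NDCG@10 metric are assumptions made for illustration; the settings actually used by FoldRec-C2C are not stated here.

```python
from typing import List


def ranklib_train_cmd(jar: str, train_file: str, model_file: str,
                      ranker: int = 6, metric: str = "NDCG@10") -> List[str]:
    """Build the command that trains a RankLib model and saves it.

    ranker=6 selects LambdaMART in RankLib (assumed choice, not confirmed
    by this tutorial).
    """
    return ["java", "-jar", jar,
            "-train", train_file,
            "-ranker", str(ranker),
            "-metric2t", metric,
            "-save", model_file]


def ranklib_rank_cmd(jar: str, model_file: str,
                     test_file: str, score_file: str) -> List[str]:
    """Build the command that scores a LETOR file with a saved model."""
    return ["java", "-jar", jar,
            "-load", model_file,
            "-rank", test_file,
            "-score", score_file]


# Hypothetical paths; run with subprocess.run(cmd, check=True) once Java
# and the RankLib jar are available.
train_cmd = ranklib_train_cmd("RankLib.jar", "train.txt", "model.txt")
rank_cmd = ranklib_rank_cmd("RankLib.jar", "model.txt", "test.txt", "scores.txt")
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.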

Results

--The ranking results of the three re-ranking models

a: FoldRec-C2C based on S2S;

b: FoldRec-C2C based on S2C;

c: FoldRec-C2C based on C2C.

Evaluation

--The specificity-sensitivity data of all the methods used for comparison

References

1. Liu B, Zhu Y, Yan K. Fold-LTR-TCP: protein fold recognition based on triadic closure principle, Brief Bioinform 2019.

2. Lindahl E, Elofsson A. Identification of related proteins on family, superfamily and fold level, J Mol Biol 2000;295:613-625.

3. Liu B, Wu H, Chou K-C. Pse-in-One 2.0: An Improved Package of Web Servers for Generating Various Modes of Pseudo Components of DNA, RNA, and Protein Sequences, Natural Science 2017;9(4):23.

4. Remmert M, Biegert A, Hauser A et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment, Nat Methods 2011;9:173-175.

5. Zhu J, Zhang H, Li SC et al. Improving protein fold recognition by extracting fold-specific features from predicted residue–residue contacts, Bioinformatics 2017;33:3749-3757.

6. Jo T, Cheng J. Improving protein fold recognition by random forest, BMC Bioinformatics 2014;15 Suppl 11:S14.

7. Burges CJ. From ranknet to lambdarank to lambdamart: An overview, Learning 2010;11:81.