FoldRec-C2C web server

Tutorial

--Information about each file is provided in this tutorial:

Source

--The source code of FoldRec-C2C predictor:

[FoldRec-C2C-V1]

--Three re-ranking models:

seq-to-seq model (S2S): [seq-to-seq-V1]^a

seq-to-cluster model (S2C): [seq-to-cluster-V1]

cluster-to-cluster model (C2C): [cluster-to-cluster-V1]

^a: The S2S model corresponds to Fold-LTR-TCP [1].

Features

--The benchmark dataset:

[LINDAHL dataset]^a

^a: This dataset was proposed by Lindahl E and Elofsson A [2].

--The extracted features

[Top-2-grams]^c

[84-features]^e

^a: HHSearch is contained in HHsuite [4];

^b: Top-1-gram feature is generated by Pse-in-one2.0 [3];

^c: Top-2-grams feature is generated by Pse-in-one2.0 [3];

^d: The DeepFR feature is generated by the method of Zhu J et al. [5];

^e: The 84-features set is generated by the method of Jo T and Cheng J [6].

--The features for LTR training

[all the features]^a

^a: The features are processed as described for Fold-LTR-TCP [1].

--The trained Learning to Rank model

[LTR model]^a

^a: Learning to Rank (LTR) is a powerful information retrieval algorithm described by Burges CJ [7].

--The software dependencies of FoldRec-C2C

[Pse-in-one-2.0]^b

[RankLib-2.13]^c

^a: HHsuite is an open-source software package that contains HHSearch and HHblits [4];

^b: Pse-in-one2.0 was proposed by Liu B, Wu H and Chou K-C [3];

^c: RankLib is a library of learning to rank algorithms [7].
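
RankLib reads training and test data as plain-text lines of the form `<label> qid:<id> <feature>:<value> ... # <comment>`. The sketch below shows one way such lines could be written for query-template pairs. The function name, the 0/1 label coding, the three feature values and the template names are hypothetical illustrations, not the actual FoldRec-C2C feature files.

```python
def format_letor_line(label, qid, features, comment=None):
    """Return one line in the LETOR-style text format that RankLib reads.

    label    -- integer relevance (assumed coding: 1 = same fold as the query, 0 = not)
    qid      -- identifier of the query protein; rows sharing a qid are ranked together
    features -- list of floats; feature ids in the output are 1-based
    comment  -- optional free text appended after '#', e.g. a template name
    """
    parts = [str(label), f"qid:{qid}"]
    parts += [f"{i}:{value:.6f}" for i, value in enumerate(features, start=1)]
    line = " ".join(parts)
    if comment:
        line += f" # {comment}"
    return line


# Hypothetical example: one query (qid 1) scored against two templates.
rows = [
    (1, 1, [0.82, 0.10, 0.33], "template_A"),
    (0, 1, [0.15, 0.47, 0.05], "template_B"),
]
lines = [format_letor_line(*row) for row in rows]
print("\n".join(lines))
```

A file built this way can be passed to RankLib with its `-train` option. Which ranker and metric were used for the trained model above is not stated here.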

Results

--The ranking results of the three re-ranking models

[The result of S2S]^a

[The result of S2C]^b

[The result of C2C]^c

^a: FoldRec-C2C based on S2S;

^b: FoldRec-C2C based on S2C;

^c: FoldRec-C2C based on C2C.

Evaluation

--The specificity-sensitivity data of all the methods used for comparison

[The specificity-sensitivity data]
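
As a rough illustration of how specificity-sensitivity points can be derived from ranked predictions, the sketch below sweeps a score cutoff over the top-ranked hit of each query. It assumes that specificity means the fraction of accepted hits that are correct and sensitivity means the fraction of findable correct hits that are accepted. These definitions, the function name and the example scores are assumptions for illustration; the exact procedure behind the downloadable data may differ.

```python
def specificity_sensitivity_curve(top_hits, n_possible):
    """Compute (specificity, sensitivity) points over decreasing score cutoffs.

    top_hits   -- list of (score, is_correct) pairs, one per query's top-ranked template
    n_possible -- number of queries that have at least one correct template
    """
    points = []
    correct = 0
    # Accept hits one at a time, from the highest score downwards.
    ranked = sorted(top_hits, key=lambda hit: hit[0], reverse=True)
    for accepted, (_score, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        specificity = correct / accepted    # correct hits among accepted hits
        sensitivity = correct / n_possible  # correct hits among all findable ones
        points.append((specificity, sensitivity))
    return points


# Hypothetical scores for four queries.
hits = [(0.9, True), (0.8, True), (0.6, False), (0.4, True)]
curve = specificity_sensitivity_curve(hits, n_possible=4)
for spec, sens in curve:
    print(f"{spec:.3f}\t{sens:.3f}")
```

Tied scores are processed in input order in this sketch; a stricter version would emit one point per distinct cutoff.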

References

1. Liu B, Zhu Y, Yan K. Fold-LTR-TCP: protein fold recognition based on triadic closure principle, Brief Bioinform 2019.

2. Lindahl E, Elofsson A. Identification of related proteins on family, superfamily and fold level, J Mol Biol 2000;295:613-625.

3. Liu B, Wu H, Chou K-C. Pse-in-One 2.0: An Improved Package of Web Servers for Generating Various Modes of Pseudo Components of DNA, RNA, and Protein Sequences, Natural Science 2017;9(4):23.

4. Remmert M, Biegert A, Hauser A et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment, Nat Methods 2011;9:173-175.

5. Zhu J, Zhang H, Li SC et al. Improving protein fold recognition by extracting fold-specific features from predicted residue–residue contacts, Bioinformatics 2017;33:3749-3757.

6. Jo T, Cheng J. Improving protein fold recognition by random forest, BMC Bioinformatics 2014;15 Suppl 11:S14.

7. Burges CJ. From ranknet to lambdarank to lambdamart: An overview, Learning 2010;11:81.