PreHom-PCLM: Protein Remote Homology Detection by Combing Motifs and Protein Cubic Language Model
The PreHom-PCLM web server predicts the superfamily of a query protein sequence. It is compatible with most major browsers and uses parallel processing to speed up prediction. It offers an open and interactive web service that accepts query sequences in FASTA format and returns the search results in a user-friendly manner. For the convenience of experimental scientists, a step-by-step guide to using the PreHom-PCLM web server is given below.
Visit the web server by clicking the link at http://bliulab.net/PreHom-PCLM/server and you will see the page as shown in Fig. 1. The Microsoft Edge and Google Chrome browsers are recommended.
STEP 1: Input the query protein sequences
You can enter or paste the query protein sequences directly into the input box, or upload them as a file by clicking the Choose File button. All input sequences must be in FASTA format. A sequence in FASTA format consists of a single header line beginning with the symbol ">" followed by one or more lines of amino acid sequence data. You can click the Examples button to automatically load the built-in example sequences, as shown in Fig. 2, and click the Reset button to clear all input sequences. The Email address is optional; if you provide a valid Email address, the results will be sent to you.
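Before submitting, it can be useful to check locally that a file is well-formed FASTA. The following is a minimal sketch, not the server's own validation code: the function name parse_fasta and the accepted residue alphabet (the 20 standard amino acids plus the codes B, Z, X, U and O) are assumptions made for illustration, and the server may accept or reject a different set of characters.

```python
from typing import Dict, List, Optional

# Assumption: 20 standard amino acids plus the ambiguity/rare codes B, Z, X, U, O.
# The alphabet actually accepted by the web server is not stated in this guide.
ALLOWED_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBZXUO")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}; raise ValueError if malformed."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []

    def flush() -> None:
        # Store the record collected so far, if there is one.
        if header is None:
            return
        if not chunks:
            raise ValueError(f"record '{header}' has no sequence data")
        records[header] = "".join(chunks)

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            flush()
            header = line[1:].strip()
            if not header:
                raise ValueError("empty header line")
            if header in records:
                raise ValueError(f"duplicate header '{header}'")
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' line")
            residues = line.upper()
            unexpected = set(residues) - ALLOWED_RESIDUES
            if unexpected:
                raise ValueError(
                    f"unexpected characters in '{header}': {sorted(unexpected)}"
                )
            chunks.append(residues)

    flush()
    return records


if __name__ == "__main__":
    # Toy input with made-up sequences, only to show the expected layout.
    demo = ">query1\nMKTAYIAK\nQRQISFVK\n>query2\nGSHMLEDP\n"
    for name, seq in parse_fasta(demo).items():
        print(name, len(seq))
```

Sequences split over several lines are joined into one string per header, which matches the FASTA layout described above.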
The predicting probability threshold parameter filters the results of PreHom-PCLM so that only higher-confidence predictions are kept. Specifically, given a query protein sequence, PreHom-PCLM generates a probability distribution over the 1960 superfamilies (the label space of the training, validation and test datasets), where a greater probability indicates a more confident prediction. By using the threshold to keep only predictions with higher probabilities, users can check the most trustworthy results for their query proteins, which gives an intuitive view of the performance of PreHom-PCLM.
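The effect of the threshold can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the function name filter_predictions, the dictionary layout, the superfamily labels sf_A, sf_B and sf_C, and the choice of "greater than or equal to" for the comparison are all hypothetical, and the toy distributions cover three classes rather than 1960.

```python
from typing import Dict, List, Tuple


def filter_predictions(
    distributions: Dict[str, Dict[str, float]],
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Keep each query's top superfamily only if its probability reaches the threshold.

    Assumption: the comparison is inclusive (>=); the guide does not say
    whether the server uses > or >=.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")

    kept: List[Tuple[str, str, float]] = []
    for query_id, dist in distributions.items():
        if not dist:
            continue  # no prediction available for this query
        # The predicted superfamily is the class with the highest probability.
        superfamily, probability = max(dist.items(), key=lambda item: item[1])
        if probability >= threshold:
            kept.append((query_id, superfamily, probability))
    return kept


if __name__ == "__main__":
    # Made-up distributions over three hypothetical superfamily labels.
    demo = {
        "query1": {"sf_A": 0.92, "sf_B": 0.05, "sf_C": 0.03},
        "query2": {"sf_A": 0.40, "sf_B": 0.35, "sf_C": 0.25},
    }
    for row in filter_predictions(demo, threshold=0.8):
        print(row)
```

With a threshold of 0.8, only query1 is reported in this toy example, because the top probability for query2 is 0.40. Raising the threshold therefore trades the number of reported predictions for higher confidence in each one.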
STEP 2: Submit your queries
After entering the query protein sequences, click the Submit button. You will see a processing page, as shown in Fig. 3, while your job is being processed. The results will be shown on your screen when the job is finished. You can also close the browser window and reload the results later by using the link.
STEP 3: View results
The results will be shown on your screen when the job is finished, as shown in Fig. 4. The results are stored for 7 days, and you can click the Download link to download them. You can view the protein primary sequence by clicking the protein button in the Visualization column of the table. The corresponding prediction results can be downloaded by clicking the corresponding download buttons in the Visualization column of the table.
Here, the predicting score is the probability that PreHom-PCLM assigns to the predicted superfamily.
School of Computer Science and Technology, Beijing Institute of Technology, China.
Copyright@ By Liu Lab, Beijing Institute of Technology.