BioSeq-BLM: a platform for analyzing DNA, RNA, and protein sequences based on biological language models

The source code for BioSeq-BLM:

The stand-alone package and manual for BioSeq-BLM:

Version 1.0 (created on 2021-08-22):

Quick Start:

The tutorial for BioSeq-BLM webserver:

The datasets used in BioSeq-BLM:

1. Identification of DNase I hypersensitive sites

2. Identification of real microRNA precursors

3. Identification of DNA-binding proteins

4. Identification of intrinsically disordered regions in proteins

5. Identification of RNA-binding proteins

6. RNA secondary structure prediction

Supporting Information for the physicochemical indices:


1. Chen, W., Zhang, X.T., Brooker, J., Lin, H., Zhang, L.Q. and Chou, K.C. (2015) PseKNC-General: a cross-platform package for generating various modes of pseudo nucleotide compositions. Bioinformatics, 31, 119.

2. Friedel, M., Nikolajewa, S., Suhnel, J. and Wilhelm, T. (2009) DiProDB: a database for dinucleotide properties. Nucleic Acids Research, 37, D37-D40.

3. Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T. and Kanehisa, M. (2008) AAindex: amino acid index database, progress report 2008. Nucleic Acids Research, 36, D202-D205.