2 Quickstart

This section gives simple examples of how to use the executables in the PDB-BRE package.

2.1 Data collection

The PDB-BRE package performs its analysis based on PDB IDs from the RCSB PDB database (https://www.rcsb.org/). Complexes that may contain protein-protein, peptide-protein, DNA-protein, RNA-protein, NA hybrid-protein or ligand-protein interactions can be screened with the advanced search in the RCSB PDB database. After an advanced search, the ‘Download All’ button provides the matching PDB IDs as a comma-separated list, which is the input format PDB-BRE expects.

Sample datasets covering the six interaction types above are provided in the ./doc directory:

File Name                Included PDB IDs
protein-protein.txt      1A6A,1AFQ,1AIS,1AL2
peptide-protein.txt      1A07,1A08,148L,1G32
DNA-protein.txt          10MH,173D,185D,193D
RNA-protein.txt          1A1T,1A34,1A4T,1A9N
NA_hybrid-protein.txt    1D9D,1D9F,1NH3,1SI2
ligand-protein.txt       101M,102L,102M,103L
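
Before passing a downloaded list to PDB-BRE, it can be worth checking that the file really contains comma-separated PDB IDs. The following is a minimal sketch and not part of PDB-BRE: validate_pdb_ids is a hypothetical helper name, and the pattern assumes the classic four-character PDB ID form (a digit followed by three alphanumeric characters).

```shell
# Hypothetical helper (not shipped with PDB-BRE): check that a
# comma-separated string contains only well-formed PDB IDs.
# Assumption: an ID is a digit followed by three alphanumeric characters.
validate_pdb_ids() {
    # Strip whitespace, put one ID per line, then keep the lines that
    # do NOT match the expected pattern.
    bad=$(printf '%s' "$1" | tr -d ' \t\r\n' | tr ',' '\n' |
          grep -Ev '^[0-9][A-Za-z0-9]{3}$' || true)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad" | sed 's/^/invalid PDB ID: /' >&2
        return 1
    fi
    return 0
}
```

For example, validate_pdb_ids "$(cat ./doc/protein-protein.txt)" returns success when every entry in that file is well formed, and otherwise prints the offending entries to standard error.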

2.2 PDB-BRE-InterPair

The executable PDB-BRE-InterPair extracts specific interaction pairs and binding residues from the PDB file of a complex. The supported interaction types between entities in a complex are protein-protein, peptide-protein, DNA-protein, RNA-protein, NA hybrid-protein and ligand-protein.

2.2.1 Example description of PDB-BRE-InterPair

The command line examples for extracting the above six interaction pairs are as follows:

cd <INSTALL_DIR>

# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the commands below.

# For protein-protein interaction (the following two commands have the same effect):
./PDB-BRE-InterPair -f ./doc/protein-protein.txt -t 'protein' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A6A,1AFQ,1AIS,1AL2' -t 'protein' -c 50 -n 4

# For peptide-protein interaction:
./PDB-BRE-InterPair -f ./doc/peptide-protein.txt -t 'peptide' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A07,1A08,148L,1G32' -t 'peptide' -c 50 -n 4

# For DNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/DNA-protein.txt -t 'DNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '10MH,173D,185D,193D' -t 'DNA' -c 0 -n 4

# For RNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/RNA-protein.txt -t 'RNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1A1T,1A34,1A4T,1A9N' -t 'RNA' -c 0 -n 4

# For NA hybrid-protein interaction:
./PDB-BRE-InterPair -f ./doc/NA_hybrid-protein.txt -t 'DNA&RNA hybrid' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1D9D,1D9F,1NH3,1SI2' -t 'DNA&RNA hybrid' -c 0 -n 4

# For small-molecule ligand-protein interaction:
./PDB-BRE-InterPair -f ./doc/ligand-protein.txt -t 'ligand' -c 0 -n 4
# ./PDB-BRE-InterPair -i '101M,102L,102M,103L' -t 'ligand' -c 0 -n 4
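
The advice about core counts in the comments above can be automated. This is a sketch under stated assumptions and not part of PDB-BRE: pick_threads is a hypothetical helper, and getconf _NPROCESSORS_ONLN is assumed to be available for detecting the number of cores (it is on typical Linux and macOS systems). The final line only prints the command it would run.

```shell
# Hypothetical helper: choose the value for '-n' following the rule in
# the comments above (4 when at least 4 cores are available, else 1).
pick_threads() {
    if [ "$1" -ge 4 ] 2>/dev/null; then
        echo 4
    else
        echo 1
    fi
}

# Assumption: getconf reports the number of online cores; fall back to 1.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
threads=$(pick_threads "$cores")

# Dry run: show the command instead of executing it.
echo "./PDB-BRE-InterPair -f ./doc/protein-protein.txt -t 'protein' -c 50 -n $threads"
```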

2.2.2 Output description of PDB-BRE-InterPair

The output files produced by the commands above are stored in the ./Result directory:

  • InterPair_<type>_5.0_Any.csv: all interaction pairs found by the analysis.

  • InterPairDeRedun_<type>_5.0_Any.csv: all de-redundant interaction pairs found by the analysis, except for ligand-protein interaction.

In addition, up to four text files recording exceptions raised during PDB-BRE-InterPair analysis may be stored in the ./Exception/InterPairException directory:

  • PDBFile_Exception_5.0.txt: No PDB format file or PDB ID exception.

  • WebCon_Exception_5.0.txt: Web page connection exception.

  • NoInterPair_Exception_5.0.txt: No interaction pair in a given PDB ID.

  • NoSpecType_Exception_5.0.txt: No analysis type specified in a given PDB ID.

In these file names, 5.0 is the distance threshold used to judge whether residues interact, Any is the method used to measure the distance between residues, and <type> is one of protein, peptide, DNA, RNA, DNA&RNA hybrid or ligand.
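
After a run, it is useful to see at a glance which of the exception files described above actually recorded something. The following is a minimal sketch and not part of PDB-BRE: report_exceptions is a hypothetical helper, and it assumes only the directory and file naming pattern given above.

```shell
# Hypothetical helper: print the names of non-empty exception files.
# The directory defaults to the location described above.
report_exceptions() {
    dir=${1:-./Exception/InterPairException}
    for f in "$dir"/*_Exception_*.txt; do
        # -s is true only for files that exist and are not empty.
        if [ -s "$f" ]; then
            printf '%s\n' "${f##*/}"
        fi
    done
    return 0
}
```

Calling report_exceptions with no argument from <INSTALL_DIR> prints nothing when no exception was recorded.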

2.3 PDB-BRE-DonSeqLabel

The executable PDB-BRE-DonSeqLabel extracts the donor sequence labels from the output file of PDB-BRE-InterPair. The supported donor types are peptide, DNA, RNA and DNA&RNA hybrid.

The command examples for extracting the donor sequence labels in interaction pairs are as follows:

cd <INSTALL_DIR>

# For peptide-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv

# For DNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv

# For RNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv

# For NA hybrid-protein interaction:
# The path is quoted because it contains '&' and a space.
./PDB-BRE-DonSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv'

# General format:
# Note that <distance method> applies only when <type> is 'peptide'; its value is 'Any' or 'CA'.
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_<type>_<distance threshold>_<distance method>.csv

The output file produced by the commands above is DonSeqLabel_<type>_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to judge whether residues interact, Any is the method used to measure the distance between residues, and <type> is one of peptide, DNA, RNA or DNA&RNA hybrid.
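
Because PDB-BRE-DonSeqLabel reads a file written by PDB-BRE-InterPair, a small guard can catch a missing or misspelled input path before the executable is started. This is a sketch, not part of PDB-BRE: run_if_present is a hypothetical wrapper that runs a command only when the named input file exists.

```shell
# Hypothetical wrapper: run a command only if its input file exists.
# Usage: run_if_present INPUT COMMAND [ARGS...]
run_if_present() {
    input=$1
    shift
    if [ ! -f "$input" ]; then
        echo "missing input file: $input" >&2
        return 1
    fi
    # Run the remaining arguments as the command, preserving quoting.
    "$@"
}
```

For example: run_if_present './Result/InterPairDeRedun_peptide_5.0_Any.csv' ./PDB-BRE-DonSeqLabel -i './Result/InterPairDeRedun_peptide_5.0_Any.csv'. Quoting the path matters most for the DNA&RNA hybrid file, whose name contains '&' and a space.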

2.4 PDB-BRE-ProSeqLabel

The executable PDB-BRE-ProSeqLabel extracts the protein sequence labels from the output file of PDB-BRE-InterPair. The donor type in the PDB-BRE-InterPair output file must be one of peptide, DNA, RNA or DNA&RNA hybrid.

The command examples for extracting the protein sequence labels in interaction pairs are as follows:

cd <INSTALL_DIR>

# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the commands below.

# For peptide-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv -n 4

# For DNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv -n 4

# For RNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv -n 4

# For NA hybrid-protein interaction:
# The path is quoted because it contains '&' and a space.
./PDB-BRE-ProSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv' -n 4

# General format:
# Note that <distance method> applies only when <type> is 'peptide'; its value is 'Any' or 'CA'.
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_<type>_<distance threshold>_<distance method>.csv -n 4

The output file produced by the commands above is ProSeqLabel_<type>_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to judge whether residues interact, Any is the method used to measure the distance between residues, and <type> is one of peptide, DNA, RNA or DNA&RNA hybrid.

2.5 PDB-BRE-PPISeqLabel

The executable PDB-BRE-PPISeqLabel extracts the protein sequence labels from the output file of PDB-BRE-InterPair. The donor type in the PDB-BRE-InterPair output file must be protein.

The command examples for extracting the protein sequence labels in interaction pairs are as follows:

cd <INSTALL_DIR>

# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the command below.
./PDB-BRE-PPISeqLabel -i ./Result/InterPairDeRedun_protein_5.0_Any.csv -n 4

The output files produced by the command above are ProSeqLabel_1_protein_5.0_Any.csv and ProSeqLabel_2_protein_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to judge whether residues interact, and Any is the method used to measure the distance between residues.
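
Since this step writes two files, a quick check that both are present can confirm the run completed. The following is a minimal sketch and not part of PDB-BRE: ppi_outputs_present is a hypothetical helper, and the file names are taken from the description above with the 5.0 threshold and Any method used in the example.

```shell
# Hypothetical helper: verify that both PPI label files exist.
# The directory defaults to ./Result as described above.
ppi_outputs_present() {
    dir=${1:-./Result}
    for n in 1 2; do
        f="$dir/ProSeqLabel_${n}_protein_5.0_Any.csv"
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
    done
    return 0
}
```

Running ppi_outputs_present from <INSTALL_DIR> returns success only when both files are found, and otherwise names the first missing one on standard error.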