2 Quickstart
This section gives simple examples of how to use the executables contained in the PDB-BRE package.
2.1 Data collection
The PDB-BRE package performs its analysis based on PDB IDs from the RCSB PDB database (https://www.rcsb.org/). Complexes involving protein-protein, peptide-protein, DNA-protein, RNA-protein, NA hybrid-protein and ligand-protein interactions can be screened with the advanced search in the RCSB PDB database. After an advanced search, the ‘Download All’ button provides the matching PDB IDs as a comma-separated list, which is the input format PDB-BRE expects.
Sample datasets containing the above six interactions are collected under the ./doc path:
| File Name | Included PDB IDs |
|---|---|
| protein-protein.txt | 1A6A,1AFQ,1AIS,1AL2 |
| peptide-protein.txt | 1A07,1A08,148L,1G32 |
| DNA-protein.txt | 10MH,173D,185D,193D |
| RNA-protein.txt | 1A1T,1A34,1A4T,1A9N |
| NA_hybrid-protein.txt | 1D9D,1D9F,1NH3,1SI2 |
| ligand-protein.txt | 101M,102L,102M,103L |
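Before running the tools, it can help to check that an ID list is well formed. The following is a minimal sketch and not part of PDB-BRE: the function names `parse_pdb_ids` and `read_pdb_ids` are hypothetical, and the check assumes classic four-character PDB IDs (a digit followed by three letters or digits) separated by commas.

```python
import re
from pathlib import Path

# Assumption: classic PDB IDs are four characters, a digit followed by
# three letters or digits (for example 1A6A, 148L, 10MH).
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def parse_pdb_ids(text):
    """Split a comma-separated string into upper-case PDB IDs and validate them."""
    ids = [token.strip().upper() for token in text.split(",") if token.strip()]
    invalid = [pdb_id for pdb_id in ids if not PDB_ID_PATTERN.fullmatch(pdb_id)]
    if invalid:
        raise ValueError(f"Not valid four-character PDB IDs: {invalid}")
    return ids


def read_pdb_ids(path):
    """Read a comma-separated ID file such as ./doc/protein-protein.txt."""
    return parse_pdb_ids(Path(path).read_text())
```

Calling `read_pdb_ids("./doc/protein-protein.txt")` would return the IDs as a list, or raise `ValueError` if an entry does not look like a PDB ID.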
2.2 PDB-BRE-InterPair
The executable file PDB-BRE-InterPair is used to extract specific interaction pairs and binding residues from the PDB file of a complex. The interaction types supported by PDB-BRE are protein-protein, peptide-protein, DNA-protein, RNA-protein, NA hybrid-protein and ligand-protein interactions.
2.2.1 Example description of PDB-BRE-InterPair
The command line examples for extracting the above six interaction pairs are as follows:
cd <INSTALL_DIR>
# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the commands below.
# For protein-protein interaction (the following two commands have the same effect):
./PDB-BRE-InterPair -f ./doc/protein-protein.txt -t 'protein' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A6A,1AFQ,1AIS,1AL2' -t 'protein' -c 50 -n 4
# For peptide-protein interaction:
./PDB-BRE-InterPair -f ./doc/peptide-protein.txt -t 'peptide' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A07,1A08,148L,1G32' -t 'peptide' -c 50 -n 4
# For DNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/DNA-protein.txt -t 'DNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '10MH,173D,185D,193D' -t 'DNA' -c 0 -n 4
# For RNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/RNA-protein.txt -t 'RNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1A1T,1A34,1A4T,1A9N' -t 'RNA' -c 0 -n 4
# For NA hybrid-protein interaction:
./PDB-BRE-InterPair -f ./doc/NA_hybrid-protein.txt -t 'DNA&RNA hybrid' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1D9D,1D9F,1NH3,1SI2' -t 'DNA&RNA hybrid' -c 0 -n 4
# For small-molecule ligand-protein interaction:
./PDB-BRE-InterPair -f ./doc/ligand-protein.txt -t 'ligand' -c 0 -n 4
# ./PDB-BRE-InterPair -i '101M,102L,102M,103L' -t 'ligand' -c 0 -n 4
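When the runs are scripted rather than typed, the same commands can be assembled as argument lists. The sketch below is illustrative only: `build_interpair_command` is a hypothetical helper, it uses only the flags shown in the examples above (`-f`, `-i`, `-t`, `-c`, `-n`), and it passes the `-c` value through unchanged because its meaning is not described in this section. The `cores` parameter exists only so the core-count rule can be exercised without depending on the local machine.

```python
import os


def build_interpair_command(interaction_type, c_value, id_file=None, ids=None,
                            executable="./PDB-BRE-InterPair", workers=4,
                            cores=None):
    """Return the argument list for one PDB-BRE-InterPair run."""
    if (id_file is None) == (ids is None):
        raise ValueError("Give exactly one of id_file or ids")
    # Follow the advice in the comments above: fall back to 1 process
    # when the machine has fewer cores than the requested worker count.
    if cores is None:
        cores = os.cpu_count() or 1
    n_value = workers if cores >= workers else 1
    source = ["-f", id_file] if id_file is not None else ["-i", ",".join(ids)]
    return [executable, *source, "-t", interaction_type,
            "-c", str(c_value), "-n", str(n_value)]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from within `<INSTALL_DIR>`. Because no shell is involved, a type such as `DNA&RNA hybrid` needs no extra quoting.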
2.2.2 Output description of PDB-BRE-InterPair
The output files obtained by executing the above command line are stored in the ./Result directory:
InterPair_<type>_5.0_Any.csv: Stores all interaction pairs obtained by the analysis.
InterPairDeRedun_<type>_5.0_Any.csv: Stores the de-redundant interaction pairs obtained by the analysis. This file is not produced for ligand-protein interaction.
In addition, up to four txt files recording exceptions raised during the PDB-BRE-InterPair analysis may be stored in the ./Exception/InterPairException directory:
PDBFile_Exception_5.0.txt: No PDB format file or PDB ID exception.
WebCon_Exception_5.0.txt: Web page connection exception.
NoInterPair_Exception_5.0.txt: No interaction pair in a given PDB ID.
NoSpecType_Exception_5.0.txt: No analysis type specified in a given PDB ID.
In these file names, 5.0 is the distance threshold used to decide whether two residues interact, Any is the method used to measure the distance between residues, and <type> is one of protein, peptide, DNA, RNA, DNA&RNA hybrid or ligand.
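After a run, a quick look at the exception files shows whether any input needs attention. The following is a rough sketch under stated assumptions: `summarize_exceptions` is a hypothetical helper, the file names are taken from the list above, and the function only counts non-empty lines because the internal layout of these files is not described here.

```python
from pathlib import Path

# Exception file prefixes as listed above; the threshold (5.0) is a parameter.
EXCEPTION_KINDS = ("PDBFile", "WebCon", "NoInterPair", "NoSpecType")


def summarize_exceptions(exception_dir="./Exception/InterPairException",
                         threshold="5.0"):
    """Map each exception file that exists to its number of non-empty lines."""
    summary = {}
    for kind in EXCEPTION_KINDS:
        path = Path(exception_dir) / f"{kind}_Exception_{threshold}.txt"
        if path.is_file():
            lines = path.read_text().splitlines()
            summary[path.name] = sum(1 for line in lines if line.strip())
    return summary
```

An empty dictionary means that none of the four exception files was found in the directory.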
2.3 PDB-BRE-DonSeqLabel
The executable file PDB-BRE-DonSeqLabel is used to extract the donor sequence labels from the output file of PDB-BRE-InterPair. The types of donors supported by the executable file include: peptides, DNA, RNA and DNA&RNA hybrid.
The command examples for extracting the donor sequence labels in interaction pairs are as follows:
cd <INSTALL_DIR>
# For peptide-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv
# For DNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv
# For RNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv
# For NA hybrid-protein interaction (the path is quoted because it contains '&' and a space):
./PDB-BRE-DonSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv'
# General format:
# Note that <distance method> applies only when <type> is 'peptide'; its value is either 'Any' or 'CA'.
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_<type>_<distance threshold>_<distance method>.csv
The output file obtained by executing the above commands is DonSeqLabel_<type>_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to decide whether two residues interact, Any is the method used to measure the distance between residues, and <type> is one of peptide, DNA, RNA or DNA&RNA hybrid.
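File names that contain `&` or a space, such as those for the DNA&RNA hybrid type, must be quoted when they are passed on a shell command line. The sketch below shows one way to build such a path from the general format and quote it with Python's standard `shlex.quote`. Both helper names (`deredun_path`, `shell_command`) are hypothetical, and the method part is appended only when it is given, mirroring the note in the general format above.

```python
import shlex


def deredun_path(interaction_type, threshold="5.0", method=None,
                 result_dir="./Result"):
    """Build an InterPairDeRedun path following the general format above.

    The method part ('Any' or 'CA') is appended only when it is given.
    """
    parts = ["InterPairDeRedun", interaction_type, threshold]
    if method is not None:
        parts.append(method)
    return f"{result_dir}/{'_'.join(parts)}.csv"


def shell_command(executable, input_path):
    """Return a command string that is safe to paste into a POSIX shell."""
    return f"{shlex.quote(executable)} -i {shlex.quote(input_path)}"
```

For example, `shell_command("./PDB-BRE-DonSeqLabel", deredun_path("DNA&RNA hybrid"))` wraps the path in single quotes, so the shell does not treat `&` as a control operator or split the argument at the space.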
2.4 PDB-BRE-ProSeqLabel
The executable file PDB-BRE-ProSeqLabel is used to extract the protein sequence labels from the output file of PDB-BRE-InterPair. When using the executable file to extract protein sequence labels, the donor type in the PDB-BRE-InterPair output file should be one of peptides, DNA, RNA or DNA&RNA hybrid.
The command examples for extracting the protein sequence labels in interaction pairs are as follows:
cd <INSTALL_DIR>
# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the commands below.
# For peptide-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv -n 4
# For DNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv -n 4
# For RNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv -n 4
# For NA hybrid-protein interaction (the path is quoted because it contains '&' and a space):
./PDB-BRE-ProSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv' -n 4
# General format:
# Note that <distance method> applies only when <type> is 'peptide'; its value is either 'Any' or 'CA'.
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_<type>_<distance threshold>_<distance method>.csv -n 4
The output file obtained by executing the above commands is ProSeqLabel_<type>_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to decide whether two residues interact, Any is the method used to measure the distance between residues, and <type> is one of peptide, DNA, RNA or DNA&RNA hybrid.
2.5 PDB-BRE-PPISeqLabel
The executable file PDB-BRE-PPISeqLabel is used to extract the protein sequence labels from the output file of PDB-BRE-InterPair. When using this executable to extract protein sequence labels, the donor type in the PDB-BRE-InterPair output file should be protein.
The command examples for extracting the protein sequence labels in interaction pairs are as follows:
cd <INSTALL_DIR>
# If your computer has fewer than 4 cores, replace the 4 after '-n' with 1 in the command below.
./PDB-BRE-PPISeqLabel -i ./Result/InterPairDeRedun_protein_5.0_Any.csv -n 4
The output files obtained by executing the above command are ProSeqLabel_1_protein_5.0_Any.csv and ProSeqLabel_2_protein_5.0_Any.csv in the ./Result directory. Here, 5.0 is the distance threshold used to decide whether two residues interact, and Any is the method used to measure the distance between residues.