# **2 Quickstart**

Some simple examples of how to use the executables contained in the PDB-BRE package.

## **2.1 Data collection**

The PDB-BRE package performs its analysis based on PDB IDs from the **RCSB PDB database** ([https://www.rcsb.org/](https://www.rcsb.org/)). Specifically, complexes that may include **protein-protein interactions, peptide-protein interactions, DNA-protein interactions, RNA-protein interactions, NA hybrid-protein interactions and ligand-protein interactions** can be screened with the advanced search in the RCSB PDB database. Comma-separated PDB IDs are available through the 'Download All' button after an advanced search, which matches the input format of PDB-BRE. Sample datasets covering the above six interaction types are collected under the **./doc** path:

| File Name | Included PDB IDs |
| :----- | :----- |
| protein-protein.txt | 1A6A,1AFQ,1AIS,1AL2 |
| peptide-protein.txt | 1A07,1A08,148L,1G32 |
| DNA-protein.txt | 10MH,173D,185D,193D |
| RNA-protein.txt | 1A1T,1A34,1A4T,1A9N |
| NA_hybrid-protein.txt | 1D9D,1D9F,1NH3,1SI2 |
| ligand-protein.txt | 101M,102L,102M,103L |

## **2.2 PDB-BRE-InterPair**

The executable file PDB-BRE-InterPair is used to extract specific interaction pairs and binding residues from the PDB file of a complex. The interaction types supported by PDB-BRE are protein-protein, peptide-protein, DNA-protein, RNA-protein, NA hybrid-protein and ligand-protein interactions.

### 2.2.1 Example description of PDB-BRE-InterPair

The command line examples for extracting the above six interaction pairs are as follows:

```
cd
# If your computer has fewer than 4 cores, replace 4 after '-n' by 1 in the commands below.

# For protein-protein interaction (the following two commands have the same effect):
./PDB-BRE-InterPair -f ./doc/protein-protein.txt -t 'protein' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A6A,1AFQ,1AIS,1AL2' -t 'protein' -c 50 -n 4

# For peptide-protein interaction:
./PDB-BRE-InterPair -f ./doc/peptide-protein.txt -t 'peptide' -c 50 -n 4
# ./PDB-BRE-InterPair -i '1A07,1A08,148L,1G32' -t 'peptide' -c 50 -n 4

# For DNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/DNA-protein.txt -t 'DNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '10MH,173D,185D,193D' -t 'DNA' -c 0 -n 4

# For RNA-protein interaction:
./PDB-BRE-InterPair -f ./doc/RNA-protein.txt -t 'RNA' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1A1T,1A34,1A4T,1A9N' -t 'RNA' -c 0 -n 4

# For NA hybrid-protein interaction:
./PDB-BRE-InterPair -f ./doc/NA_hybrid-protein.txt -t 'DNA&RNA hybrid' -c 0 -n 4
# ./PDB-BRE-InterPair -i '1D9D,1D9F,1NH3,1SI2' -t 'DNA&RNA hybrid' -c 0 -n 4

# For small-molecule ligand-protein interaction:
./PDB-BRE-InterPair -f ./doc/ligand-protein.txt -t 'ligand' -c 0 -n 4
# ./PDB-BRE-InterPair -i '101M,102L,102M,103L' -t 'ligand' -c 0 -n 4
```

### 2.2.2 Output description of PDB-BRE-InterPair

The output files obtained by executing the above commands are stored in the **./Result** directory:

- **InterPair_\<type\>_5.0_Any.csv:** Stores all the interaction pairs obtained by the analysis.
- **InterPairDeRedun_\<type\>_5.0_Any.csv:** Stores all the de-redundant interaction pairs obtained by the analysis, except for ligand-protein interaction.

In addition, up to four txt files that record exceptions raised during PDB-BRE-InterPair analysis may be stored in the **./Exception/InterPairException** directory:

- **PDBFile_Exception_5.0.txt:** No PDB format file or PDB ID exception.
- **WebCon_Exception_5.0.txt:** Web page connection exception.
- **NoInterPair_Exception_5.0.txt:** No interaction pair in a given PDB ID.
- **NoSpecType_Exception_5.0.txt:** No analysis type specified in a given PDB ID.

Among them, _**5.0**_ represents the threshold for judging whether there is an interaction between residues, _**Any**_ represents the way the distance between residues is judged, and _**\<type\>**_ should be one of _**protein**_, _**peptide**_, _**DNA**_, _**RNA**_, _**DNA&RNA hybrid**_ or _**ligand**_.

## **2.3 PDB-BRE-DonSeqLabel**

The executable file PDB-BRE-DonSeqLabel is used to extract the donor sequence labels from the output file of PDB-BRE-InterPair. The donor types supported by the executable file are _**peptide**_, _**DNA**_, _**RNA**_ and _**DNA&RNA hybrid**_. The command examples for extracting the donor sequence labels in interaction pairs are as follows:

```
cd

# For peptide-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv

# For DNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv

# For RNA-protein interaction:
./PDB-BRE-DonSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv

# For NA hybrid-protein interaction (quoted because the file name contains '&' and a space):
./PDB-BRE-DonSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv'

# General format:
# Note that <mode> is valid only if type is 'peptide', optional 'Any' or 'CA'.
./PDB-BRE-DonSeqLabel -i './Result/InterPairDeRedun_<type>_<threshold>_<mode>.csv'
```

The output file obtained by executing the above commands is **DonSeqLabel_\<type\>_5.0_Any.csv** in the **./Result** directory. Among them, _**5.0**_ represents the threshold for judging whether there is an interaction between residues, _**Any**_ represents the way the distance between residues is judged, and _**\<type\>**_ should be one of _**peptide**_, _**DNA**_, _**RNA**_ or _**DNA&RNA hybrid**_.

## **2.4 PDB-BRE-ProSeqLabel**

The executable file PDB-BRE-ProSeqLabel is used to extract the protein sequence labels from the output file of PDB-BRE-InterPair. When using the executable file to extract protein sequence labels, the donor type in the PDB-BRE-InterPair output file should be one of _**peptide**_, _**DNA**_, _**RNA**_ or _**DNA&RNA hybrid**_.
The command examples for extracting the protein sequence labels in interaction pairs are as follows:

```
cd
# If your computer has fewer than 4 cores, replace 4 after '-n' by 1 in the commands below.

# For peptide-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_peptide_5.0_Any.csv -n 4

# For DNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_DNA_5.0.csv -n 4

# For RNA-protein interaction:
./PDB-BRE-ProSeqLabel -i ./Result/InterPairDeRedun_RNA_5.0.csv -n 4

# For NA hybrid-protein interaction (quoted because the file name contains '&' and a space):
./PDB-BRE-ProSeqLabel -i './Result/InterPairDeRedun_DNA&RNA hybrid_5.0.csv' -n 4

# General format:
# Note that <mode> is valid only if type is 'peptide', optional 'Any' or 'CA'.
./PDB-BRE-ProSeqLabel -i './Result/InterPairDeRedun_<type>_<threshold>_<mode>.csv' -n 4
```

The output file obtained by executing the above commands is **ProSeqLabel_\<type\>_5.0_Any.csv** in the **./Result** directory. Among them, _**5.0**_ represents the threshold for judging whether there is an interaction between residues, _**Any**_ represents the way the distance between residues is judged, and _**\<type\>**_ should be one of _**peptide**_, _**DNA**_, _**RNA**_ or _**DNA&RNA hybrid**_.

## **2.5 PDB-BRE-PPISeqLabel**

The executable file PDB-BRE-PPISeqLabel is used to extract the protein sequence labels from the output file of PDB-BRE-InterPair. When using the executable file to extract protein sequence labels, the donor type in the PDB-BRE-InterPair output file should be _**protein**_. The command example for extracting the protein sequence labels in interaction pairs is as follows:

```
cd
# If your computer has fewer than 4 cores, replace 4 after '-n' by 1 in the command below.
./PDB-BRE-PPISeqLabel -i ./Result/InterPairDeRedun_protein_5.0_Any.csv -n 4
```

The output files obtained by executing the above command are **ProSeqLabel_1_protein_5.0_Any.csv** and **ProSeqLabel_2_protein_5.0_Any.csv** in the **./Result** directory.
Among them, _**5.0**_ represents the threshold for judging whether there is an interaction between residues, and _**Any**_ represents the way the distance between residues is judged.
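
Two practical points from the examples above, the choice of the `-n` value and the comma-separated PDB ID input format, can be checked with small shell helpers before running the executables. The following is only a sketch and is not part of PDB-BRE: the function names `pick_cores` and `is_pdb_id_list` are hypothetical, and the ID pattern assumes the conventional 4-character PDB ID (a digit followed by three alphanumeric characters), which matches all sample IDs in **./doc**.

```shell
#!/bin/sh
# Hypothetical helpers for the quickstart commands; not shipped with PDB-BRE.

# Print the value to pass after '-n': 4 on machines with at least 4 cores,
# otherwise 1, following the advice in the command comments.
# An explicit core count may be given as $1; otherwise it is detected
# with getconf, falling back to 1 if detection fails.
pick_cores() {
    cores="${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
    if [ "$cores" -ge 4 ]; then
        echo 4
    else
        echo 1
    fi
}

# Succeed only if $1 is a comma-separated list of 4-character PDB IDs
# with no spaces and no trailing comma, e.g. '1A6A,1AFQ,1AIS,1AL2'.
# Assumption: an ID is one digit followed by three alphanumeric characters.
is_pdb_id_list() {
    printf '%s\n' "$1" | grep -Eq '^[0-9][A-Za-z0-9]{3}(,[0-9][A-Za-z0-9]{3})*$'
}
```

With these helpers defined, a call such as `./PDB-BRE-InterPair -f ./doc/protein-protein.txt -t 'protein' -c 50 -n "$(pick_cores)"` uses 4 workers where available and 1 otherwise, and `is_pdb_id_list "$(cat ./doc/protein-protein.txt)"` gives a quick format check on an input file, assuming the file holds a single line of IDs.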