Protein-Protein Interaction Identifier

About

The Protein-Protein Interaction Identifier (PPI-ID) is a structural bioinformatics tool that aids the prediction of domain-domain interactions (DDIs) and domain-motif interactions (DMIs) between proteins. Predictions can be run from the 'Predict from Accession' tab, which is best suited to predicting DDIs/DMIs between entire proteins that have an associated UniProt accession number. To use this tab, simply submit the protein accession numbers. Predictions can also be run from the 'Predict from Sequence' tab, which is best suited to DDIs/DMIs that involve protein fragments, synthetic proteins, or other amino acid sequences that do not have an associated UniProt accession number. To use this tab, provide protein domain information generated by InterProScan and/or short linear motif (SLiM) information generated by PPI-ID. Note that PPI-ID is not capable of predicting motif-motif interactions.

After running structure predictions with AlphaFold, you can upload the resulting PDB file and interact with the data frame of predicted interactions. Please note that for the data frame and the 3D molecular model to interact properly, the accession numbers/sequences submitted to PPI-ID must correspond exactly to the sequences that were folded by AlphaFold. The order of protein sequences folded in AlphaFold must also match the order of the Protein 1 and Protein 2 information uploaded to PPI-ID. For example, if accession numbers for proteins DifA (P98149) and Cactus (Q03017) are submitted to PPI-ID as Protein 1 and Protein 2 respectively, the chains within the AlphaFold-produced PDB file must follow the same order, with DifA first and Cactus second. The order of chains is determined by the order in which amino acid sequences were added to the multimeric FASTA file.
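As a minimal sketch of the ordering requirement above, the following Python snippet assembles a two-record FASTA text whose record order fixes the intended chain order. The function name `build_multimer_fasta`, the header labels, and the short sequences are illustrative placeholders (they are not the real DifA or Cactus sequences), and the exact input format expected by a given AlphaFold installation may differ.

```python
def build_multimer_fasta(records, width=60):
    """Return FASTA text for (header, sequence) pairs, preserving their order.

    The first record is intended to become the first chain (Protein 1 in
    PPI-ID) and the second record the second chain (Protein 2).
    """
    lines = []
    for header, sequence in records:
        # Remove whitespace and normalise case before wrapping.
        sequence = "".join(sequence.split()).upper()
        lines.append(f">{header}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Placeholder sequences for illustration only; substitute the real ones.
records = [
    ("P98149_DifA", "MKTAYIAKQR"),    # Protein 1 in PPI-ID -> first chain
    ("Q03017_Cactus", "GSHMLEDPVA"),  # Protein 2 in PPI-ID -> second chain
]
fasta_text = build_multimer_fasta(records)
print(fasta_text)
```

Keeping the record list in the same order as the Protein 1 and Protein 2 inputs makes it harder for the two orders to drift apart.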

PPI-ID contains a number of features that facilitate structural analysis of PDB files produced by AlphaFold. After a PDB file is uploaded, a 'Contact Distance Filter' can be applied so that only predicted interactions that come within the user-specified contact distance populate the table of predicted DDIs/DMIs. After filtering has been applied, the specific residues that satisfy the contact distance filter can be labeled on the molecular structure. When labels are added, residues of interest are depicted in ball-and-stick form for improved clarity. Additional toggles allow labels to be removed while maintaining the ball-and-stick view of residues of interest, and allow the molecular model to be reverted to its original state.
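To illustrate the idea behind a contact distance filter, here is a rough Python sketch that reports which residue pairs from two regions come within a cutoff distance. It is not PPI-ID's implementation: the function names, the chain IDs "A" and "B", the use of any-atom distances, and the toy coordinates are all assumptions made for the example.

```python
import math


def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format one fixed-width PDB ATOM record (used here to build toy input)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Return (chain, residue number, (x, y, z)) for every ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), xyz))
    return atoms


def residues_in_contact(atoms, range_1, range_2, cutoff, chain_1="A", chain_2="B"):
    """Return sorted (residue in chain_1, residue in chain_2) pairs that have
    at least one pair of atoms within `cutoff` angstroms. Only residues inside
    the inclusive ranges range_1 and range_2 are considered."""
    side_1 = [a for a in atoms if a[0] == chain_1 and range_1[0] <= a[1] <= range_1[1]]
    side_2 = [a for a in atoms if a[0] == chain_2 and range_2[0] <= a[1] <= range_2[1]]
    pairs = set()
    for _, res_1, xyz_1 in side_1:
        for _, res_2, xyz_2 in side_2:
            if math.dist(xyz_1, xyz_2) <= cutoff:
                pairs.add((res_1, res_2))
    return sorted(pairs)


# Toy two-chain structure; coordinates are invented for illustration.
toy_pdb = "\n".join([
    atom_line(1, "CA", "MET", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "LYS", "A", 2, 3.8, 0.0, 0.0),
    atom_line(3, "CA", "GLY", "B", 1, 0.0, 4.0, 0.0),
    atom_line(4, "CA", "SER", "B", 2, 30.0, 0.0, 0.0),
])
atoms = parse_atoms(toy_pdb)

# A predicted interaction would be kept only if this list is non-empty.
contacts = residues_in_contact(atoms, (1, 2), (1, 2), cutoff=5.0)
print(contacts)
```

A predicted DDI/DMI whose two regions yield an empty list at the chosen cutoff would be dropped from the table, and the residues in the returned pairs are the ones a viewer would label.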

Click here to access a video demonstration on how to use PPI-ID.

This tool takes advantage of a compiled dataset of domain-domain interactions from the 3did (2022 release) and DOMINE databases. Consequently, PPI-ID identifies domains by their Pfam ID. Domain-SLiM interactions are provided by the Eukaryotic Linear Motif (ELM) database. These databases all document PPI information based on experimental crystal structure data. As a result, PPI-ID may be less accurate in predicting PPIs that have been identified through other methods (e.g. FRET, Y2H). Furthermore, PPI-ID is not capable of predicting novel DDIs/DMIs: it only predicts known interactions as found in crystal structures and reported in the 3did, DOMINE, and ELM databases. Potential interactions are determined according to the appropriate algorithm implemented in the R script.
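The lookup-based nature of these predictions can be sketched in a few lines of Python. This is only an illustration of matching features against a table of known pairs, not the algorithm in PPI-ID's R script; the identifiers (`PFAM_X`, `ELM_CLASS_1`, and so on), the table contents, and the function name `predict_known_interactions` are hypothetical placeholders.

```python
from itertools import product

# Hypothetical reference tables standing in for the compiled databases.
# Domain-domain pairs are unordered; a one-element set is a self-interaction.
KNOWN_DDI = {frozenset(("PFAM_X", "PFAM_Y")), frozenset(("PFAM_Z",))}
# Domain-motif pairs are stored as (domain, motif class).
KNOWN_DMI = {("PFAM_Y", "ELM_CLASS_1")}


def predict_known_interactions(protein_1, protein_2):
    """Return (kind, feature in protein 1, feature in protein 2) tuples for
    every cross-protein feature pair present in the reference tables.
    Motif-motif pairs are never examined."""
    hits = []
    for d1, d2 in product(protein_1["domains"], protein_2["domains"]):
        if frozenset((d1, d2)) in KNOWN_DDI:
            hits.append(("DDI", d1, d2))
    for d1, m2 in product(protein_1["domains"], protein_2["motifs"]):
        if (d1, m2) in KNOWN_DMI:
            hits.append(("DMI", d1, m2))
    for m1, d2 in product(protein_1["motifs"], protein_2["domains"]):
        if (d2, m1) in KNOWN_DMI:
            hits.append(("DMI", m1, d2))
    return hits


# Placeholder annotations for two proteins.
protein_1 = {"domains": ["PFAM_X"], "motifs": ["ELM_CLASS_1"]}
protein_2 = {"domains": ["PFAM_Y"], "motifs": ["ELM_CLASS_1"]}
predictions = predict_known_interactions(protein_1, protein_2)
print(predictions)
```

Because every hit must already exist in the reference tables, a pair that has never been catalogued cannot be returned, which mirrors the limitation on novel DDIs/DMIs described above.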

Additional resources for AlphaFold output analysis (PAE plot generation, ipTM/pTM score fetching, etc.) can be accessed at PPI-ID's GitHub Page. We also highly recommend using PAE plotting and analysis tools that are provided by PAE Viewer.

Tips

To ensure that tables of predicted DDIs/DMIs can properly interact with an uploaded PDB file, please make sure that the order of proteins (i.e. which accession number or file upload you assigned to Protein 1 and Protein 2) matches the order of chains in the PDB file. Chain order in the multimeric PDB file corresponds to the order of proteins in the FASTA file provided to AlphaFold (the first protein in the FASTA file will be Chain 1, and the second will be Chain 2). Verify that the accession number or uploaded information for Protein 1 matches Chain 1, then complete the same verification for Protein 2.
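One way to carry out this verification outside of PPI-ID is to read the chain sequences back from the PDB file and compare them, in order, with the sequences you intended as Protein 1 and Protein 2. The Python sketch below assumes standard fixed-width ATOM records and the 20 canonical residues; the helper names and the toy structure are made up for illustration.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def atom_line(serial, name, res_name, chain, res_seq):
    """Format a toy PDB ATOM record with all coordinates set to zero."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            + "   0.000" * 3)


def chain_sequences(pdb_text):
    """Return [(chain ID, one-letter sequence)] in order of first appearance."""
    chains = {}   # dicts keep insertion order, so chain order is preserved
    seen = set()  # (chain, residue number + insertion code) already counted
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        key = (chain, line[22:27])
        if key in seen:
            continue  # another atom of a residue that was already recorded
        seen.add(key)
        residue = THREE_TO_ONE.get(line[17:20].strip(), "X")
        chains.setdefault(chain, []).append(residue)
    return [(chain, "".join(residues)) for chain, residues in chains.items()]


def order_matches(pdb_text, expected_sequences):
    """True if the chains, in file order, equal the expected sequences."""
    observed = [sequence for _, sequence in chain_sequences(pdb_text)]
    return observed == list(expected_sequences)


# Toy structure: chain A = "MK", chain B = "GSA" (invented for illustration).
toy_pdb = "\n".join([
    atom_line(1, "N", "MET", "A", 1),
    atom_line(2, "CA", "MET", "A", 1),
    atom_line(3, "CA", "LYS", "A", 2),
    atom_line(4, "CA", "GLY", "B", 1),
    atom_line(5, "CA", "SER", "B", 2),
    atom_line(6, "CA", "ALA", "B", 3),
])
print(order_matches(toy_pdb, ["MK", "GSA"]))   # Protein 1, then Protein 2
print(order_matches(toy_pdb, ["GSA", "MK"]))   # swapped order
```

If the check fails with the sequences in one order and passes with them swapped, the Protein 1 and Protein 2 assignments in PPI-ID are most likely reversed relative to the chains.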

The 'Predict from Accession' tab is most convenient if you seek to predict/detect DDIs/DMIs between two whole proteins that each have their own UniProt accession number. The 'Predict from Sequence' tab is most convenient if you seek to predict/detect DDIs/DMIs involving protein fragments, synthetic proteins, or proteins that do not have an associated UniProt accession number.

Help

If you encounter any issues, please submit a pull request to PPI-ID's GitHub Page. If you are running this app from RStudio, you may first have to open the tool in a browser before you are properly directed to the GitHub page.