Protein–protein interactions (PPIs) are physical contacts of high specificity established between two or more protein molecules as a result of biochemical events steered by interactions that include electrostatic forces, hydrogen bonding, and the hydrophobic effect. Many are physical contacts with molecular associations between chains that occur in a cell or in a living organism in a specific biomolecular context.
Proteins rarely act alone, as their functions tend to be regulated. Many molecular processes within a cell are carried out by molecular machines built from numerous protein components organized by their protein–protein interactions. These interactions make up what is commonly referred to as the interactome of the organism. Aberrant PPIs are the basis of many aggregation-related diseases, including Alzheimer’s disease and Creutzfeldt–Jakob disease, and may contribute to cancer.
Protein–protein interactions have been studied from various perspectives, including signal transduction, molecular dynamics, quantum chemistry, and biochemistry, among others. Together, this information allows the construction of large protein interaction networks, comparable to metabolic or genetic/epigenetic networks, that advance existing knowledge of the molecular etiology of disease.
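A protein interaction network of the kind described above can be represented as an undirected graph in which proteins are nodes and interactions are edges. The following is a minimal sketch under stated assumptions: the protein names (P1 to P5) and the interaction pairs are hypothetical placeholders rather than real data, and `build_network` and `degrees` are illustrative helper names.

```python
from collections import defaultdict

def build_network(pairs):
    """Build an undirected interaction network as an adjacency map."""
    network = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions in this sketch
        network[a].add(b)
        network[b].add(a)
    return dict(network)

def degrees(network):
    """Count the distinct interaction partners of each protein."""
    return {protein: len(partners) for protein, partners in network.items()}

# Hypothetical interactions between placeholder proteins (not real data).
toy_pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
toy_network = build_network(toy_pairs)
toy_degrees = degrees(toy_network)

# The protein with the most partners is a candidate "hub" of the network.
hub = max(toy_degrees, key=toy_degrees.get)
print(hub, toy_degrees[hub])
```

Counting partners per protein is only one simple way to summarize such a network; real analyses typically draw on curated interaction databases and richer graph measures.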
PPIs are important to almost every process in a cell. Therefore, understanding PPIs is essential for understanding cell physiology. It is also crucial in drug development, because drugs can affect PPIs.
Proteomics is a field of molecular biology concerned with the large-scale study of proteins. The word proteomics was first used in the late 1990s, by analogy with genomics. The term proteome is a blend of protein and genome, and was coined in 1994 by Marc Wilkins when he was a PhD student at Macquarie University.
The entire set of proteins produced or modified by an organism or system is called the proteome. The proteome varies over time and with the distinct requirements that a cell or organism faces. As an interdisciplinary domain, proteomics has benefited greatly from the genetic data of the Human Genome Project.
Since its first use, the meaning and scope of the term “proteome” have narrowed. Post-translational modifications, proteins intractable to conventional separation techniques, and alternative splice products have each made the conventional definition of the word difficult to realize.
Today, proteomics covers numerous areas of study, among them protein function, protein–protein interaction, protein localization, and protein modification. The fundamental goal of proteomics is not only to identify all the proteins in a particular cell but also to generate a three-dimensional map of the cell showing their exact locations.
The Basic Local Alignment Search Tool (BLAST) is an algorithm for comparing primary biological sequence information, such as the amino acid sequences of proteins or the nucleotide sequences of DNA. A researcher uses BLAST to compare a query sequence against a library or database of sequences and to identify library sequences that resemble the query above a specified threshold. There are different types of BLAST, depending on the type of query sequence.
BLAST was developed by Stephen Altschul, Warren Gish, Webb Miller, Eugene Myers, and David J. Lipman. Input sequences are supplied in FASTA format, a text format that takes its name from the FASTA DNA and protein sequence alignment software package. Output is delivered in formats such as HTML, plain text, and XML. BLAST uses a heuristic method to locate similar sequences by finding short matches between the two sequences, a process called seeding.
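To make the FASTA input mentioned above concrete: a FASTA record consists of a header line beginning with ">" followed by one or more lines of sequence. The parser below is a minimal sketch, not the reader used by BLAST itself; the function name `parse_fasta` and the two example records are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Hypothetical example input (made-up sequences for illustration only).
example = """>query1 hypothetical protein fragment
MKTAYIAKQR
QISFVKSHFS
>query2 hypothetical DNA fragment
ATGCGTACGT
"""
records = parse_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

The sketch joins wrapped sequence lines into a single string per record, which is the form an alignment routine would normally consume.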
Once the first match is found, BLAST begins making local alignments. Short runs of shared letters, referred to as words, are central to finding similarity between sequences. BLAT (BLAST-like Alignment Tool) is an alternative to BLAST; it is much faster than BLAST but less sensitive.
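The seeding idea described above can be illustrated with a simplified sketch. It indexes the fixed-length words of one sequence, looks up each word of the query to find exact seed matches, and then extends a seed in both directions while the letters continue to match. This is an illustration only, under loud assumptions: BLAST itself scores matches with substitution matrices, accepts similar (not only identical) words for proteins, and stops extension with a score-drop rule, none of which is modeled here. The function names, the word length `k`, and the two sequences are hypothetical.

```python
from collections import defaultdict

def index_words(sequence, k):
    """Map every length-k word in `sequence` to its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index

def find_seeds(query, subject, k):
    """Return (query_pos, subject_pos) pairs where a length-k word matches exactly."""
    index = index_words(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds

def extend_seed(query, subject, q, s, k):
    """Extend an exact seed left and right while characters keep matching."""
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = 0
    while (q + k + right < len(query) and s + k + right < len(subject)
           and query[q + k + right] == subject[s + k + right]):
        right += 1
    return query[q - left:q + k + right]

# Made-up sequences for illustration.
query = "GATTACAGG"
subject = "CCATTACATT"
seeds = find_seeds(query, subject, k=4)
first_q, first_s = seeds[0]
print(seeds, extend_seed(query, subject, first_q, first_s, k=4))
```

Indexing words first means each query word is looked up in constant time rather than compared against every position of the subject, which is the reason a seeding heuristic is fast compared with exhaustive alignment.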
Protein function prediction methods are techniques that computational biologists use to assign biological or biochemical roles to proteins. These proteins are usually ones that are poorly studied, or that have only been predicted from genomic sequence data. The predictions are driven by data-intensive computational techniques. The information may come from nucleic acid sequence homology, gene expression profiles, protein domain structures, text mining of publications, protein–protein interactions, phenotypic profiles, and phylogenetic profiles.
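As one illustration of how a listed information source might be used, protein–protein interaction data can support a simple "guilt by association" prediction: an unannotated protein is assigned the function most common among its annotated interaction partners. The sketch below is a hedged toy example, not a published method; the function name `predict_function`, the protein identifiers, and the annotation labels are all hypothetical.

```python
from collections import Counter

def predict_function(protein, interactions, annotations):
    """Predict a label for `protein` by majority vote among annotated partners."""
    votes = Counter()
    for a, b in interactions:
        if protein == a:
            partner = b
        elif protein == b:
            partner = a
        else:
            continue  # this interaction does not involve the protein
        label = annotations.get(partner)
        if label is not None:
            votes[label] += 1
    if not votes:
        return None  # no annotated partners, so no prediction
    # Highest count wins; ties are broken alphabetically for determinism.
    best_label, _ = min(votes.items(), key=lambda item: (-item[1], item[0]))
    return best_label

# Hypothetical toy data: X is unannotated, D has no known function.
interactions = [("X", "A"), ("X", "B"), ("C", "X"), ("X", "D")]
annotations = {"A": "kinase activity", "B": "kinase activity", "C": "transport"}
prediction = predict_function("X", interactions, annotations)
print(prediction)
```

In practice, predictions of this kind are usually combined with the other evidence types listed above rather than used alone.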
Broadly, a function can be anything that happens to or through a protein. Methods such as RNA interference, the yeast two-hybrid system, and microarray analysis can be used to demonstrate a protein’s function experimentally, but advances in sequencing technologies have made new sequences available far faster than proteins can be experimentally characterized. Annotation is therefore often done computationally, since this can be done quickly and for many proteins at once.
The development of structure-based and context-based methods has expanded what can be predicted from data. A combination of methods can now be used to build a picture of cellular pathways from sequence data. The prevalence of computational prediction of protein function can be gauged by analyzing the ‘evidence codes’ used by the Gene Ontology (GO) database.
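An analysis of evidence codes can be sketched as a simple tally. GO annotations are commonly distributed as tab-separated GAF files in which the seventh column holds the evidence code, with IEA denoting an annotation inferred electronically and codes such as IDA denoting experimental evidence. The example below assumes that layout; the annotation rows are made-up placeholders rather than real GO data, and the function names are hypothetical.

```python
from collections import Counter

# Subset of GO evidence codes that denote experimental evidence.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

def count_evidence_codes(gaf_lines):
    """Tally the evidence code (7th tab-separated column) of each annotation line."""
    counts = Counter()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and "!" comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6:
            counts[fields[6]] += 1
    return counts

def electronic_fraction(counts):
    """Fraction of annotations carrying the IEA (electronic) code."""
    total = sum(counts.values())
    return counts["IEA"] / total if total else 0.0

# Placeholder rows (not real annotations); only the first 7 columns are filled.
toy_gaf = [
    "!gaf-version: 2.2",
    "\t".join(["DB", "ID1", "SYM1", "", "GO:XXXXXXX", "REF:1", "IEA"]),
    "\t".join(["DB", "ID2", "SYM2", "", "GO:XXXXXXX", "REF:2", "IEA"]),
    "\t".join(["DB", "ID3", "SYM3", "", "GO:XXXXXXX", "REF:3", "IDA"]),
    "\t".join(["DB", "ID4", "SYM4", "", "GO:XXXXXXX", "REF:4", "IEA"]),
]
counts = count_evidence_codes(toy_gaf)
experimental = sum(counts[code] for code in EXPERIMENTAL_CODES)
print(counts, electronic_fraction(counts), experimental)
```

Run against a real annotation file, the same tally would show what share of annotations rests on computational prediction rather than experiment.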