Protein subcellular localization prediction

Protein subcellular localization prediction (or just protein localization prediction) is the prediction of where in a cell a protein resides, that is, its subcellular localization.

In general, prediction tools take information about a protein as input, such as its amino acid sequence, and output a predicted location within the cell, such as the nucleus, endoplasmic reticulum, Golgi apparatus, extracellular space, or another organelle. The aim is to build tools that can accurately predict the outcome of protein targeting in cells.
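The sequence-in, location-out interface described above can be sketched as follows. This is a toy illustration, not the method of any published predictor: the rule table, the function name `predict_location`, and the example sequences are all hypothetical. The three rules encode well-known targeting signals (a C-terminal KDEL/HDEL retention motif, a C-terminal SKL peroxisomal signal, and a short basic cluster resembling a nuclear localization signal), and real tools weigh far more evidence than this.

```python
import re

# Hypothetical, heavily simplified rule table for illustration only.
# Each entry pairs a location label with a pattern searched in the sequence.
# Rules are checked in order; the first match wins.
TOY_RULES = [
    # C-terminal KDEL or HDEL: classic ER retention signal.
    ("endoplasmic reticulum", re.compile(r"[KH]DEL$")),
    # C-terminal SKL: canonical peroxisomal targeting signal (PTS1).
    ("peroxisome", re.compile(r"SKL$")),
    # Short basic cluster resembling a monopartite nuclear localization signal.
    ("nucleus", re.compile(r"K[KR].[KR]")),
]


def predict_location(sequence: str) -> str:
    """Map a one-letter amino acid sequence to a location label."""
    seq = sequence.strip().upper()
    for location, pattern in TOY_RULES:
        if pattern.search(seq):
            return location
    return "unknown"


if __name__ == "__main__":
    # Made-up example sequences, not real proteins.
    for seq in ["MAAAKDEL", "MAAASKL", "MPKKKRKVAA", "MAAAA"]:
        print(seq, "->", predict_location(seq))
```

A rule list like this is easy to read but brittle, which is one reason later tools moved to learned models.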

Prediction of protein subcellular localization is an important component of bioinformatics-based prediction of protein function and genome annotation, and it can aid the identification of drug targets.

Experimentally determining the subcellular localization of a protein can be a laborious and time-consuming task. Immunolabeling or tagging (for example, with green fluorescent protein), followed by fluorescence microscopy to view the localization, is often used. Computational prediction is a high-throughput alternative.

Through the development of new approaches in computer science, coupled with a growing dataset of proteins of known localization, computational tools can now provide fast and accurate localization predictions for many organisms. As a result, subcellular localization prediction has become one of the challenges successfully addressed with the help of bioinformatics and machine learning.

Many prediction methods now exceed the accuracy of some high-throughput laboratory methods for identifying protein subcellular localization. In particular, some predictors have been developed to handle proteins that may simultaneously exist in, or move between, two or more subcellular locations. Experimental validation is typically required to confirm the predicted localizations.
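One common way to let a predictor report more than one location is to score every location independently and keep all scores above a cutoff. The sketch below assumes this thresholding approach; the function name `assign_locations`, the default cutoff of 0.5, and the example scores are hypothetical and not taken from any specific tool.

```python
def assign_locations(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every location whose score reaches the threshold, highest first.

    If no score reaches the threshold, fall back to the single best-scoring
    location so that a non-empty input always yields a prediction.
    """
    if not scores:
        return []
    # Sort by descending score; ties keep their original insertion order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected = [loc for loc in ranked if scores[loc] >= threshold]
    return selected or ranked[:1]


if __name__ == "__main__":
    # Made-up per-location scores for a protein that shuttles between
    # the nucleus and the cytoplasm.
    example = {"nucleus": 0.8, "cytoplasm": 0.6, "mitochondrion": 0.1}
    print(assign_locations(example))
```

The fallback to the top-scoring location is a design choice; a tool could equally return an empty list to signal low confidence.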

In 1999 PSORT was the first published program to predict subcellular localization. Subsequent tools and websites have been released using techniques such as artificial neural networks, support vector machines, and protein motifs. Predictors can be specialized for proteins from different organisms: some are specialized for eukaryotic proteins, some for human proteins, and some for plant proteins. Methods for predicting bacterial protein localization, and their accuracy, have been reviewed. In 2021, SCLpred-MEM, a membrane protein prediction tool powered by artificial neural networks, was published. SCLpred-EMS is another tool powered by artificial neural networks; it classifies proteins into the endomembrane system and secretory pathway (EMS) versus all others. Similarly, Light-Attention uses machine learning methods to predict ten common subcellular locations.
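Machine learning predictors such as support vector machines and neural networks need a fixed-length numeric representation of each sequence. One simple and long-used representation is amino acid composition, the fraction of each of the 20 standard residues. The sketch below computes it; it is an illustration of the idea only, the function name `amino_acid_composition` is hypothetical, and the tools named above may use different and richer inputs.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code, in a fixed order so that
# every sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-element vector of residue fractions for a sequence.

    Characters outside the 20 standard residues (for example X or gaps)
    are ignored. An input with no standard residues yields all zeros.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Made-up sequence, not a real protein.
    features = amino_acid_composition("MKKLAAX")
    print(len(features), round(sum(features), 6))
```

A vector like this could then be passed to any standard classifier; the composition discards residue order, so sequence-aware models generally need other features.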

The first model to generalize protein subcellular localization prediction to all cell lines does so by leveraging images of subcellular landmark stains (i.e., nuclear, plasma membrane, and endoplasmic reticulum markers) across multiple cell lines. By coupling multimodal landmark-stain data with a pre-trained protein language model, the Prediction of Unseen Proteins' Subcellular Localization (PUPS) model can generatively predict the subcellular localization of any protein in any cell line, given the protein's amino acid sequence and reference stains of the cell line.
