Protein targeting
from Wikipedia

Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations within or outside the cell.[1][2][note 1] Proteins can be targeted to the inner space of an organelle, different intracellular membranes, the plasma membrane, or to the exterior of the cell via secretion.[1][2] Information contained in the protein itself directs this delivery process.[2][3] Correct sorting is crucial for the cell; errors or dysfunction in sorting have been linked to multiple diseases.[2][4][5]

History

Günter Blobel, awarded the 1999 Nobel Prize in Physiology or Medicine for his discovery that proteins contain intrinsic signal sequences.

In 1970, Günter Blobel conducted experiments on protein translocation across membranes. Blobel, then an assistant professor at Rockefeller University, built upon the work of his colleague George Palade.[6] Palade had previously demonstrated that non-secreted proteins were translated by free ribosomes in the cytosol, while secreted proteins (and targeted proteins in general) were translated by ribosomes bound to the endoplasmic reticulum (ER).[6] Candidate explanations at the time postulated a processing difference between free and ER-bound ribosomes, but Blobel hypothesized that protein targeting relied on characteristics inherent to the proteins rather than on a difference in ribosomes. Supporting his hypothesis, Blobel discovered that many proteins have a short amino acid sequence at one end that functions like a postal code specifying an intracellular or extracellular destination.[3] He described these short sequences (generally 13 to 36 amino acid residues)[1] as signal peptides or signal sequences and was awarded the 1999 Nobel Prize in Physiology or Medicine for this discovery.[7]

Signal peptides


Signal peptides serve as targeting signals, enabling cellular transport machinery to direct proteins to specific intracellular or extracellular locations. While no consensus sequence has been identified for signal peptides, many nonetheless possess a characteristic tripartite structure:[1]

  1. A positively charged, hydrophilic region near the N-terminus.
  2. A span of 10 to 15 hydrophobic amino acids near the middle of the signal peptide.
  3. A slightly polar region near the C-terminus, typically favoring amino acids with smaller side chains at positions approaching the cleavage site.

After a protein has reached its destination, the signal peptide is generally cleaved by a signal peptidase.[1] Consequently, most mature proteins do not contain signal peptides. While most signal peptides are found at the N-terminus, the targeting sequence of many peroxisomal proteins is located at the C-terminus.[8] Unlike signal peptides, signal patches are composed of amino acid residues that are discontinuous in the primary sequence but become functional when folding brings them together on the protein surface.[9] Unlike most signal sequences, signal patches are not cleaved after sorting is complete.[10] In addition to intrinsic signal sequences, protein modifications such as glycosylation can also induce targeting to specific intracellular or extracellular regions.
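The tripartite layout described above can be illustrated with a minimal Python sketch. The residue groupings, the fixed region lengths, and the function name `describe_signal_peptide` are assumptions made for illustration only and do not correspond to any published predictor; the example peptide is invented.

```python
# Hypothetical residue groupings chosen for illustration; real predictors use trained models.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")
SMALL = set("AGSCT")


def describe_signal_peptide(peptide: str, n_len: int = 5, c_len: int = 5) -> dict:
    """Split a candidate signal peptide into n-, h- and c-regions and summarize each."""
    peptide = peptide.upper()
    if len(peptide) <= n_len + c_len:
        raise ValueError("peptide too short to split into three regions")
    n_region = peptide[:n_len]
    h_region = peptide[n_len:-c_len]
    c_region = peptide[-c_len:]
    return {
        # Count of positively charged residues near the N-terminus.
        "n_positive": sum(aa in POSITIVE for aa in n_region),
        "h_length": len(h_region),
        "h_hydrophobic_fraction": sum(aa in HYDROPHOBIC for aa in h_region) / len(h_region),
        # Assumption: "positions approaching the cleavage site" is read here as
        # the last and third-from-last residues of the peptide.
        "c_small_at_minus1_minus3": c_region[-1] in SMALL and c_region[-3] in SMALL,
    }


if __name__ == "__main__":
    # Invented peptide: basic start, hydrophobic core, small residues at the end.
    print(describe_signal_peptide("MKKRSLLLVLLALLAGSASA"))
```

Because no consensus sequence exists, a summary of this kind can only flag candidates; it cannot confirm that a sequence functions as a signal peptide.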

Protein translocation


Since the translation of mRNA into protein by a ribosome takes place within the cytosol, proteins destined for secretion or a specific organelle must be translocated.[11] This process can occur during translation, known as co-translational translocation, or after translation is complete, known as post-translational translocation.[12]

Co-translational translocation

A generalized overview of protein targeting that illustrates co-translational translocation to the endoplasmic reticulum and post-translational translocation to their specified locations. If no targeting sequence is present, then the synthesized protein will remain in the cytosol.

Most secretory and membrane-bound proteins are co-translationally translocated. Proteins that reside in the endoplasmic reticulum (ER), Golgi or endosomes also use the co-translational translocation pathway. This process begins while the protein is being synthesized on the ribosome, when a signal recognition particle (SRP) recognizes an N-terminal signal peptide of the nascent protein.[13] Binding of the SRP temporarily pauses synthesis while the ribosome-protein complex is transferred to an SRP receptor on the ER in eukaryotes, or on the plasma membrane in prokaryotes.[14] There, the nascent protein is inserted into the translocon, a membrane-bound protein-conducting channel composed of the Sec61 translocation complex in eukaryotes and the homologous SecYEG complex in prokaryotes.[15] In secretory proteins and type I transmembrane proteins, signal peptidase cleaves the signal sequence from the nascent polypeptide as soon as it has been translocated into the membrane of the ER (eukaryotes) or the plasma membrane (prokaryotes). The signal sequences of type II membrane proteins and some polytopic membrane proteins are not cleaved off and are therefore referred to as signal anchor sequences. Within the ER, the protein is first covered by a chaperone protein to protect it from the high concentration of other proteins in the ER, giving it time to fold correctly.[citation needed] Once folded, the protein is modified as needed (for example, by glycosylation), and is then either transported to the Golgi for further processing and on to its target organelle, or retained in the ER by various ER retention mechanisms.

The amino acid chain of transmembrane proteins, which are often transmembrane receptors, passes through a membrane one or several times. These proteins are inserted into the membrane by translocation until the process is interrupted by a stop-transfer sequence, also called a membrane anchor or signal-anchor sequence.[16] These complex membrane proteins are currently characterized using the same model of targeting that has been developed for secretory proteins. However, many complex multi-transmembrane proteins contain structural aspects that do not fit this model. Seven-transmembrane G protein-coupled receptors (which represent about 5% of the genes in humans) mostly do not have an amino-terminal signal sequence. In contrast to secretory proteins, their first transmembrane domain acts as the first signal sequence, which targets them to the ER membrane. This also results in the translocation of the amino terminus of the protein into the ER lumen. This translocation, which has been demonstrated for opsin in in vitro experiments,[17][18] breaks the usual pattern of "co-translational" translocation that has otherwise held for mammalian proteins targeted to the ER. A great deal of the mechanics of transmembrane topology and folding remains to be elucidated.

Post-translational translocation


Even though most secretory proteins are co-translationally translocated, some are translated in the cytosol and later transported to the ER or plasma membrane by a post-translational system. In prokaryotes this process requires cofactors such as SecA and SecB; in eukaryotes it is facilitated by Sec62 and Sec63, two membrane-bound proteins.[19] The Sec63 complex, which is embedded in the ER membrane, causes hydrolysis of ATP, allowing chaperone proteins to bind to an exposed peptide chain and slide the polypeptide into the ER lumen. Once in the lumen, the polypeptide chain can fold properly. This process occurs only for unfolded proteins located in the cytosol.[20]

In addition, proteins targeted to other cellular destinations, such as mitochondria, chloroplasts, or peroxisomes, use specialized post-translational pathways. Proteins targeted to the nucleus are also translocated post-translationally; they carry a nuclear localization sequence (NLS) that promotes passage through the nuclear envelope via nuclear pores.[21]
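As a rough summary of the two translocation modes, the typical assignments described in this section can be written as a small lookup. This is a simplified sketch: the destination labels and the function name `translocation_mode` are hypothetical, and the table ignores exceptions such as secretory proteins that reach the ER post-translationally.

```python
# Typical (not universal) assignments drawn from the text above.
CO_TRANSLATIONAL = {"ER", "Golgi", "endosome", "plasma membrane", "secreted"}
POST_TRANSLATIONAL = {"mitochondrion", "chloroplast", "peroxisome", "nucleus"}


def translocation_mode(destination: str) -> str:
    """Return the typical translocation mode for a destination label."""
    if destination in CO_TRANSLATIONAL:
        return "co-translational"
    if destination in POST_TRANSLATIONAL:
        return "post-translational"
    if destination == "cytosol":
        # Proteins without a targeting sequence remain in the cytosol.
        return "none"
    raise ValueError(f"unknown destination: {destination!r}")


if __name__ == "__main__":
    for place in ("ER", "nucleus", "cytosol"):
        print(place, "->", translocation_mode(place))
```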

Sorting of proteins


Mitochondria

Overview of the major protein import pathways of mitochondria.
The carrier pathway for proteins targeted to the mitochondrial inner membrane.

While some proteins in the mitochondria originate from mitochondrial DNA within the organelle, most mitochondrial proteins are synthesized as cytosolic precursors containing uptake peptide signals.[22][23][24][25] Unfolded proteins bound by the cytosolic chaperone Hsp70 that are targeted to the mitochondria may be localized to four different areas depending on their sequences.[2][25][26] They may be targeted to the mitochondrial matrix, the outer membrane, the intermembrane space, or the inner membrane. Defects in one or more of these processes have been linked to disease.[27]

Mitochondrial matrix


Proteins destined for the mitochondrial matrix carry a specific signal sequence of 20 to 50 amino acids at their N-terminus. These sequences interact with receptors that guide the proteins to their correct location inside the mitochondria. They contain clusters of hydrophilic and hydrophobic amino acids, which makes them amphipathic. These amphipathic sequences typically form an alpha-helix with the charged amino acids on one side and the hydrophobic ones on the opposite side. This structural feature is essential for directing proteins to the matrix: mutations that disrupt the amphipathic character often prevent the protein from reaching its intended destination, although not all changes to the sequence have this effect.[28]
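One common way to quantify how amphipathic a helical segment is uses the hydrophobic moment, which sums per-residue hydrophobicity as vectors spaced around the helix axis. The sketch below is an illustration only: it assumes the Kyte-Doolittle hydropathy scale and an ideal alpha-helix with 100 degrees of rotation per residue, and the function name `hydrophobic_moment` and both test peptides are invented.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(peptide: str, degrees_per_residue: float = 100.0) -> float:
    """Mean hydrophobic moment of a peptide modeled as an ideal helix."""
    peptide = peptide.upper()
    delta = math.radians(degrees_per_residue)
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(i * delta) for i, aa in enumerate(peptide))
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(i * delta) for i, aa in enumerate(peptide))
    # Length of the summed vector, normalized by peptide length.
    return math.hypot(sin_sum, cos_sum) / len(peptide)


if __name__ == "__main__":
    uniform = "L" * 18                    # hydrophobic on every face
    two_faced = "LKKLLKKLLKLLKKLLKK"      # Leu on one face, Lys on the other
    print(hydrophobic_moment(uniform), hydrophobic_moment(two_faced))
```

A uniformly hydrophobic helix gives a moment near zero because the contributions cancel around the axis, whereas a sequence with charged and hydrophobic residues on opposite faces gives a large moment.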

The pre-sequence pathway into the mitochondrial inner membrane (IM) and mitochondrial matrix.

Targeting of proteins to the mitochondrial matrix first involves interactions between the matrix targeting sequence located at the N-terminus and the outer membrane import receptor complex TOM20/22,[2][23][29] in addition to the docking of internal sequences and cytosolic chaperones to TOM70,[2][23][29] where TOM is an abbreviation for translocase of the outer membrane. Binding of the matrix targeting sequence to the import receptor triggers a handoff of the polypeptide to the general import core (GIP), known as TOM40.[2][23][29] TOM40 then feeds the polypeptide chain through the intermembrane space and into another translocase complex, TIM17/23/44, located on the inner mitochondrial membrane.[2][24][25][30] This is accompanied by the necessary release of the cytosolic chaperones that maintain the unfolded state prior to entry into the mitochondria. As the polypeptide enters the matrix, the signal sequence is cleaved by a processing peptidase and the remaining sequence is bound by mitochondrial chaperones to await proper folding and activity.[25][26] The push and pull of the polypeptide from the cytosol to the intermembrane space and then the matrix is achieved by an electrochemical gradient that is established by the mitochondrion during oxidative phosphorylation.[1][24][25][26] A metabolically active mitochondrion generates a negative potential inside the matrix and a positive potential in the intermembrane space.[25][31] It is this negative potential inside the matrix that directs the positively charged regions of the targeting sequence to their desired location.

Mitochondrial inner membrane


Targeting of mitochondrial proteins to the inner membrane may follow three different pathways depending upon their overall sequences; however, entry through the outer membrane is the same in each case, using the import receptor complex TOM20/22 and the TOM40 general import core.[2][24] The first pathway follows the same steps as those for matrix proteins: the precursor contains a matrix targeting sequence that channels the polypeptide to the inner membrane translocase complex TIM17/23/44.[2][24][25] The difference is that peptides designated for the inner membrane rather than the matrix also contain a stop-transfer-anchor sequence.[2] This stop-transfer-anchor sequence is a hydrophobic region that embeds itself in the phospholipid bilayer of the inner membrane and prevents translocation further into the mitochondrion.[24][25] The second pathway follows the matrix localization pathway in its entirety. However, instead of a stop-transfer-anchor sequence, the precursor contains another sequence that, once inside the matrix, interacts with an inner membrane protein called Oxa-1, which embeds it in the inner membrane.[2][24][25] The third pathway follows the same entry through the outer membrane as the others, but it uses the translocase complex TIM22/54, assisted by the TIM9/10 complex in the intermembrane space, to anchor the incoming peptide in the membrane.[2][24][25] The peptides for this last pathway do not contain a matrix targeting sequence but instead contain several internal targeting sequences.
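The three inner membrane pathways can be summarized as a decision over the sequence features a precursor carries. The following sketch is purely schematic: the `Precursor` class, its field names, and the route labels are hypothetical conveniences, and real precursors are not classified by boolean flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Hypothetical feature flags for a mitochondrial precursor protein."""
    matrix_targeting_sequence: bool = False
    stop_transfer_anchor: bool = False
    oxa1_targeting_sequence: bool = False
    internal_targeting_sequences: bool = False


def inner_membrane_route(p: Precursor) -> list[str]:
    """Return the ordered machinery a precursor passes, per the three pathways above."""
    route = ["TOM20/22", "TOM40"]  # shared entry through the outer membrane
    if p.matrix_targeting_sequence and p.stop_transfer_anchor:
        # Pathway 1: translocation arrests at the stop-transfer-anchor sequence.
        return route + ["TIM17/23/44", "stop-transfer arrest in inner membrane"]
    if p.matrix_targeting_sequence and p.oxa1_targeting_sequence:
        # Pathway 2: full import into the matrix, then insertion via Oxa-1.
        return route + ["TIM17/23/44", "matrix", "Oxa-1"]
    if p.internal_targeting_sequences and not p.matrix_targeting_sequence:
        # Pathway 3: carrier-type route through the intermembrane space.
        return route + ["TIM9/10", "TIM22/54"]
    raise ValueError("features do not match any of the three described pathways")


if __name__ == "__main__":
    carrier = Precursor(internal_targeting_sequences=True)
    print(" -> ".join(inner_membrane_route(carrier)))
```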

Mitochondrial intermembrane space


If the precursor protein is instead designated for the intermembrane space of the mitochondrion, there are two pathways by which this may occur, depending on the sequences being recognized. The first pathway follows the same steps as for an inner membrane-targeted protein. However, once the protein is bound to the inner membrane, the C-terminus of the anchored protein is cleaved by a peptidase that liberates the preprotein into the intermembrane space so it can fold into its active state.[2][25] A well-known example of a protein that follows this pathway is cytochrome b2, which upon being cleaved interacts with a heme cofactor and becomes active.[2][32] The second intermembrane space pathway does not use any inner membrane complexes, and proteins following it therefore do not contain a matrix targeting signal. Instead, the protein enters through the general import core TOM40 and is further modified in the intermembrane space to achieve its active conformation. TIM9/10 is an example of a protein that follows this pathway to reach the location where it assists in inner membrane targeting.[2][25][33]

Mitochondrial outer membrane


Outer membrane targeting involves the interaction of precursor proteins with outer membrane translocase complexes that embed them in the membrane via internal targeting sequences, which form hydrophobic alpha helices or beta barrels that span the phospholipid bilayer.[2][24][25] This may occur by two different routes depending on the internal sequences of the preprotein. If the preprotein contains internal hydrophobic regions capable of forming alpha helices, it uses the mitochondrial import complex (MIM) and is transferred laterally into the membrane.[24][25] Preproteins containing hydrophobic internal sequences characteristic of beta-barrel proteins are imported through the aforementioned outer membrane complex TOM20/22 into the intermembrane space. There they interact with the TIM9/10 intermembrane-space protein complex, which transfers them to the sorting and assembly machinery (SAM) in the outer membrane; SAM laterally releases the targeted protein into the membrane as a beta-barrel.[24][25]

Chloroplasts


Chloroplasts are similar to mitochondria in that they contain their own DNA for production of some of their components. However, the majority of their proteins arise from nuclear genes and are obtained via post-translational translocation. Depending on their sequences, proteins may be targeted to several sites within the chloroplast, such as the outer envelope, inner envelope, stroma, thylakoid lumen, or thylakoid membrane.[2] Proteins are targeted to thylakoids by mechanisms related to bacterial protein translocation.[28] Proteins targeted to the envelope of chloroplasts usually lack a cleavable sorting sequence and are laterally displaced via membrane sorting complexes. General import for the majority of preproteins requires translocation from the cytosol through the Toc and Tic complexes located within the chloroplast envelope, where Toc is an abbreviation for the translocase of the outer chloroplast envelope and Tic for the translocase of the inner chloroplast envelope. At least three proteins make up the Toc complex. Two of them, Toc159 and Toc34, are responsible for the docking of stromal import sequences, and both have GTPase activity. The third, Toc75, is the actual translocation channel that feeds the preprotein recognized by Toc159/34 into the chloroplast.[34]

Stroma


Targeting to the stroma requires the preprotein to have a stromal import sequence that is recognized by the Tic complex of the inner envelope after translocation across the outer envelope by the Toc complex. The Tic complex is composed of at least five different Tic proteins that are required to form the translocation channel across the inner envelope.[35] Upon delivery to the stroma, the stromal import sequence is cleaved off by a signal peptidase. This delivery process is currently understood to be driven by ATP hydrolysis via stromal HSP chaperones, rather than by the transmembrane electrochemical gradient that drives protein import in mitochondria.[34] Further intra-chloroplast sorting depends on additional targeting sequences, such as those that direct proteins to the thylakoid membrane or the thylakoid lumen.

Thylakoid lumen


If a protein is to be targeted to the thylakoid lumen, this may occur via four known routes that closely resemble bacterial protein transport mechanisms. The route taken depends on whether the protein delivered to the stroma is in an unfolded state or a metal-bound folded state. In both cases the protein still contains a thylakoid targeting sequence, which is also cleaved upon entry into the lumen. While protein import into the stroma is ATP-driven, the pathway that carries folded, metal-bound proteins into the thylakoid lumen has been shown to be driven by a pH gradient.

Pathways for proteins targeted to the thylakoid membrane in chloroplasts.

Thylakoid membrane


Proteins bound for the thylakoid membrane follow up to four known routes, illustrated in the corresponding figure. They may follow a co-translational insertion route that uses stromal ribosomes and the SecY/E transmembrane complex, the SRP-dependent pathway, the spontaneous insertion pathway, or the GET pathway. The last three are post-translational pathways for proteins originating from nuclear genes and therefore account for the majority of proteins targeted to the thylakoid membrane. According to recent review articles in the biochemistry and molecular biology literature, the exact mechanisms are not yet fully understood.

Both chloroplasts and mitochondria


Many proteins are needed in both mitochondria and chloroplasts.[36] In general, the dual-targeting peptide is intermediate in character between the two specific ones. The targeting peptides of these proteins have a high content of basic and hydrophobic amino acids and a low content of negatively charged amino acids. They have a lower content of alanine and a higher content of leucine and phenylalanine. Dual-targeted proteins have a more hydrophobic targeting peptide than either mitochondrial or chloroplastic ones. However, it is difficult to predict whether a peptide is dual-targeted based on its physicochemical characteristics.
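The compositional features mentioned above can be tabulated for any candidate targeting peptide. The sketch below is illustrative: the residue groupings (for example, counting histidine as basic) and the function name `targeting_peptide_composition` are assumptions, and, as the text notes, such fractions alone do not reliably predict dual targeting.

```python
def targeting_peptide_composition(peptide: str) -> dict:
    """Fraction of residues in each class discussed for dual-targeting peptides."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("empty peptide")
    total = len(peptide)

    def fraction(group: str) -> float:
        # Share of residues that belong to the given one-letter group.
        return sum(aa in group for aa in peptide) / total

    return {
        "basic": fraction("KRH"),            # assumed grouping
        "acidic": fraction("DE"),
        "hydrophobic": fraction("AVLIMFW"),  # assumed grouping
        "alanine": fraction("A"),
        "leucine": fraction("L"),
        "phenylalanine": fraction("F"),
    }


if __name__ == "__main__":
    # Invented eight-residue peptide used only to show the output shape.
    print(targeting_peptide_composition("MLRKFLAD"))
```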

Nucleus


The nucleus of a cell is surrounded by a nuclear envelope consisting of two layers, with the inner layer providing structural support and anchorage for chromosomes and the nuclear lamina.[16] The outer layer is similar to the endoplasmic reticulum (ER) membrane. This envelope contains nuclear pores, which are complex structures made from around 30 different proteins.[16] These pores act as selective gates that control the flow of molecules into and out of the nucleus.

While small molecules can pass through these pores without issue, larger molecules, like RNA and proteins destined for the nucleus, must have specific signals to be allowed through.[20] These signals are known as nuclear localization signals, usually comprising short sequences rich in positively charged amino acids like lysine or arginine.[16]
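Because nuclear localization signals are short and rich in lysine and arginine, a simple sliding-window scan can flag candidate regions. The sketch below is a heuristic illustration, not a validated predictor: the window size, the threshold, and the function name `find_basic_clusters` are assumptions. The example embeds PKKKRKV, the well-known signal from the SV40 large T antigen, in an otherwise invented sequence.

```python
def find_basic_clusters(sequence: str, window: int = 5, min_basic: int = 4) -> list[int]:
    """Return start indices of windows containing at least `min_basic` Lys/Arg residues."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        basic = sum(aa in "KR" for aa in segment)
        if basic >= min_basic:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Invented flanking residues around the SV40-type signal PKKKRKV.
    print(find_basic_clusters("MAPKKKRKVED"))
```

Overlapping hits, as in this example, indicate a single basic cluster rather than several separate signals.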

Proteins called nuclear import receptors recognize these signals and guide the large molecules through the nuclear pores by interacting with the disordered, mesh-like proteins that fill the pore.[16] The process is dynamic, with the receptor moving the molecule through the meshwork until it reaches the nucleus.[20]

Once inside, a GTPase enzyme called Ran, which can exist in two different forms (one bound to GTP and the other to GDP), facilitates the release of the cargo inside the nucleus and recycles the receptor back to the cytosol.[16][20] The energy for this transport comes from the hydrolysis of GTP by Ran. Similarly, nuclear export receptors help move proteins and RNA out of the nucleus using a different signal and also harnessing Ran's energy conversion.[16]

Overall, the nuclear pore complex works efficiently to transport macromolecules at high speed, allowing proteins to move in their folded state and ribosomal components as complete particles, which is distinct from how proteins are transported into most other organelles.[16]

Endoplasmic reticulum


The endoplasmic reticulum (ER) plays a key role in protein synthesis and distribution in eukaryotic cells. It is a vast network of membranes where proteins are processed and sorted to various destinations, including the ER itself, the cell surface, and other organelles such as the Golgi apparatus, endosomes, and lysosomes.[16] Unlike other organelle-targeted proteins, those headed for the ER start to be transferred across its membrane while they are still being made.[20][16]

Protein synthesis and sorting


There are two types of proteins that move to the ER: water-soluble proteins, which completely cross into the ER lumen, and transmembrane proteins, which partly cross and embed themselves within the ER membrane.[20] These proteins find their way to the ER with the help of an ER signal sequence, a short stretch of hydrophobic amino acids.[16]

Proteins entering the ER are synthesized by ribosomes. There are two sets of ribosomes in the cell: those bound to the ER (making it look 'rough') and those floating freely in the cytosol. Both sets are identical but differ in the proteins they synthesize at a given moment.[16][20] Ribosomes that are making proteins with an ER signal sequence attach to the ER membrane and start the translocation process. This process is energy-efficient because the growing protein chain itself pushes through the ER membrane as it elongates.[16]

As the mRNA is translated into a protein, multiple ribosomes may attach to it, creating a structure called a polyribosome.[16] If the mRNA is coding for a protein with an ER signal sequence, the polyribosome attaches to the ER membrane, and the protein begins to enter the ER while it is still being synthesized.[20][16]

Guided entry of soluble proteins

In eukaryotic cells, soluble proteins that are destined for the endoplasmic reticulum (ER) or for secretion out of the cell are guided to the ER by a two-part system. First, a signal-recognition particle (SRP) in the cytosol attaches to the emerging protein's ER signal sequence and to the ribosome itself.[16] Second, an SRP receptor located in the ER membrane recognizes and binds to the SRP. Binding of the SRP temporarily slows protein synthesis until the SRP-ribosome complex binds to the SRP receptor on the ER.[16][20]

Once this binding occurs, the SRP is released, and the ribosome is transferred to a protein translocator in the ER membrane, allowing protein synthesis to continue.[16][20] The polypeptide chain of the protein is then threaded through a channel in the translocator into the ER lumen. The signal sequence of the protein, typically at the beginning (N-terminus) of the polypeptide chain, plays a dual role. It not only targets the ribosome to the ER but also triggers the opening of the translocator.[16] As the protein is fed through the translocator, the signal sequence stays attached, allowing the rest of the protein to move through as a loop. A signal peptidase inside the ER then cuts off the signal sequence, which is subsequently discarded into the lipid bilayer of the ER membrane and broken down.[16][20]

Finally, once the last part of the protein (the C-terminus) passes through the translocator, the entire soluble protein is released into the ER lumen, where it can then fold and undergo further modifications or be transported to its final destination.[16][20]

Mechanisms of transmembrane protein integration

Transmembrane proteins, which are partly integrated into the ER membrane rather than released into the ER lumen, have a complex assembly process.[16][20] The initial stages are similar to soluble proteins: a signal sequence starts the insertion into the ER membrane. However, this process is interrupted by a stop-transfer sequence—a string of hydrophobic amino acids—which causes the translocator to halt and release the protein laterally into the membrane.[16][20] This results in a single-pass transmembrane protein with one end inside the ER lumen and the other in the cytosol, and this orientation is permanent.[16]

Some transmembrane proteins use an internal signal (start-transfer sequence) instead of one at the N-terminus, and unlike the initial signal sequence, this start-transfer sequence is not removed.[16][20] It begins the transfer process, which continues until a stop-transfer sequence is encountered, at which point both sequences become anchored in the membrane as alpha-helical segments.[16]

In more complex proteins that span the membrane multiple times, additional pairs of start- and stop-transfer sequences are used to weave the protein into the membrane in a fashion akin to a sewing machine. Each pair allows a new segment to cross the membrane and adds to the protein's structure, ensuring it is properly embedded with the correct arrangement of segments inside and outside the ER membrane.[16]
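The alternating arrangement produced by successive membrane-spanning segments can be sketched in a few lines of Python. This is a deliberately crude model: the hydrophobic residue set, the minimum run length of 18, and the function names `find_hydrophobic_segments` and `loop_locations` are assumptions for illustration, and real topology prediction relies on far more than uninterrupted hydrophobic runs.

```python
import re


def find_hydrophobic_segments(sequence: str, min_len: int = 18) -> list[tuple[int, int]]:
    """Return (start, end) spans of uninterrupted hydrophobic runs of at least `min_len`."""
    pattern = re.compile(rf"[AVLIMFWC]{{{min_len},}}")
    return [match.span() for match in pattern.finditer(sequence.upper())]


def loop_locations(n_segments: int, n_terminus: str = "cytosol") -> list[str]:
    """Side of the ER membrane for each region between membrane-spanning segments.

    Each segment crosses the membrane once, so consecutive regions alternate sides.
    """
    sides = ["cytosol", "ER lumen"]
    first = sides.index(n_terminus)
    return [sides[(first + i) % 2] for i in range(n_segments + 1)]


if __name__ == "__main__":
    # Invented single-pass protein: one long hydrophobic run flanked by polar residues.
    protein = "MK" + "L" * 20 + "DDKK"
    segments = find_hydrophobic_segments(protein)
    print(segments, loop_locations(len(segments), n_terminus="ER lumen"))
```

With one segment and a lumenal N-terminus, the sketch reproduces the single-pass arrangement described above: one end in the ER lumen and the other in the cytosol.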

Peroxisomes

Generalized Protein Targeting to the Peroxisomal Matrix

Peroxisomes are bounded by a single phospholipid bilayer that surrounds the peroxisomal matrix, which contains a wide variety of proteins and enzymes that participate in anabolism and catabolism. Peroxisomes are specialized cell organelles that carry out specific oxidative reactions using molecular oxygen. Their primary function is to remove hydrogen atoms from organic molecules, a process that results in the production of hydrogen peroxide (H2O2).[37][20] Within peroxisomes, an enzyme called catalase plays a critical role. It uses the hydrogen peroxide generated in the earlier reaction to oxidize various other substances, including phenols, formic acid, formaldehyde, and alcohol.[37][20] This is known as the "peroxidative" reaction.[20]

Peroxisomes are particularly important in liver and kidney cells for detoxifying harmful substances that enter the bloodstream. For example, they are responsible for oxidizing about 25% of the ethanol humans consume into acetaldehyde.[37] Additionally, catalase within peroxisomes can break down excess hydrogen peroxide into water and oxygen, thus preventing potential damage from the build-up of H2O2.[37][20] Since peroxisomes, unlike mitochondria and chloroplasts, contain no internal DNA, all peroxisomal proteins are encoded by nuclear genes.[38] To date, two types of peroxisome targeting signal (PTS) are known:

  1. Peroxisome targeting signal 1 (PTS1): a C-terminal tripeptide with a consensus sequence (S/A/C)-(K/R/H)-(L/A). The most common PTS1 is serine-lysine-leucine (SKL).[37] The initial research that led to the discovery of this consensus observed that when firefly luciferase was expressed in cultured insect cells, it was targeted to the peroxisome. The consensus sequence was then determined by testing a variety of mutations in the gene encoding the expressed luciferase.[39] It has also been found that adding the C-terminal SKL sequence to a cytosolic protein causes it to be targeted for transport to the peroxisome. The majority of peroxisomal matrix proteins possess this PTS1-type signal.
  2. Peroxisome targeting signal 2 (PTS2): a nonapeptide located near the N-terminus with a consensus sequence (R/K)-(L/V/I)-XXXXX-(H/Q)-(L/A/F) (where X can be any amino acid).[37]
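The two consensus sequences above translate directly into regular expressions. The following Python sketch is a minimal illustration: the function name `peroxisomal_signals` is hypothetical, and the 40-residue window used to approximate "near the N-terminus" is an assumption rather than a value from the literature. The test sequences are invented.

```python
import re

# PTS1: (S/A/C)-(K/R/H)-(L/A) at the very C-terminus.
PTS1 = re.compile(r"[SAC][KRH][LA]$")
# PTS2: (R/K)-(L/V/I)-XXXXX-(H/Q)-(L/A/F), where X is any residue.
PTS2 = re.compile(r"[RK][LVI].{5}[HQ][LAF]")


def peroxisomal_signals(sequence: str, pts2_window: int = 40) -> list[str]:
    """Return which consensus signals match; a match is a candidate, not proof of import."""
    sequence = sequence.upper()
    found = []
    if PTS1.search(sequence):
        found.append("PTS1")
    # Assumption: only the first `pts2_window` residues count as "near the N-terminus".
    if PTS2.search(sequence[:pts2_window]):
        found.append("PTS2")
    return found


if __name__ == "__main__":
    print(peroxisomal_signals("MAAAGGGSKL"))        # ends in SKL
    print(peroxisomal_signals("MSRLQVVLGHLAAAEE"))  # RL-QVVLG-HL near the start
```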

There are also proteins that possess neither of these signals. Their transport may be based on a so-called "piggy-back" mechanism: such proteins associate with PTS1-possessing matrix proteins and are translocated into the peroxisomal matrix together with them.[40]

For cytosolic proteins produced with the PTS1 C-terminal sequence, the path to the peroxisomal matrix depends on binding to another cytosolic protein called pex5 (peroxin 5).[41] Once bound, pex5 interacts with the peroxisomal membrane protein pex14 to form a complex. When pex5 with its bound cargo interacts with pex14, the complex induces the release of the targeted protein into the matrix. After the cargo protein is released into the matrix, pex5 dissociates from pex14 through ubiquitinylation by a membrane complex comprising pex2, pex12, and pex10, followed by ATP-dependent removal involving the cytosolic protein complex of pex1 and pex6.[42] The cycle of pex5-mediated import into the peroxisomal matrix is restored after the ATP-dependent removal of ubiquitin, leaving pex5 free to bind another protein containing a PTS1 sequence.[41] Import of proteins containing a PTS2 targeting sequence is mediated by a different cytosolic protein but is believed to follow a mechanism similar to that of proteins containing the PTS1 sequence.[37]

Diseases


In bacteria and archaea


As discussed above (see protein translocation), most prokaryotic membrane-bound and secretory proteins are targeted to the plasma membrane by either a co-translational pathway that uses bacterial SRP or a post-translational pathway that requires SecA and SecB. At the plasma membrane, these two pathways deliver proteins to the SecYEG translocon for translocation. Bacteria may have a single plasma membrane (Gram-positive bacteria), or an inner membrane plus an outer membrane separated by the periplasm (Gram-negative bacteria). Apart from the plasma membrane, the majority of prokaryotes lack the membrane-bound organelles found in eukaryotes, but they may assemble proteins onto various types of inclusions such as gas vesicles and storage granules.

Gram-negative bacteria


In Gram-negative bacteria, proteins may be incorporated into the plasma membrane, the outer membrane, or the periplasm, or secreted into the environment. Systems for secreting proteins across the bacterial outer membrane may be quite complex and play key roles in pathogenesis. These systems are described as type I secretion, type II secretion, etc.

Gram-positive bacteria


In most Gram-positive bacteria, certain proteins are targeted for export across the plasma membrane and subsequent covalent attachment to the bacterial cell wall. A specialized enzyme, sortase, cleaves the target protein at a characteristic recognition site near the protein C-terminus, such as an LPXTG motif (where X can be any amino acid), then transfers the protein onto the cell wall. Several analogous systems are found that likewise feature a signature motif on the extra-cytoplasmic face, a C-terminal transmembrane domain, and a cluster of basic residues on the cytosolic face at the protein's extreme C-terminus. The PEP-CTERM/exosortase system, found in many Gram-negative bacteria, seems to be related to extracellular polymeric substance production. The PGF-CTERM/archaeosortase A system in archaea is related to S-layer production. The GlyGly-CTERM/rhombosortase system, found in Shewanella, Vibrio, and a few other genera, seems to be involved in the release of proteases, nucleases, and other enzymes.
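A motif such as LPXTG can be located with a short pattern search. The sketch below is illustrative only: the function name `find_sortase_motifs` and the 50-residue window used to approximate "near the protein C-terminus" are assumptions, and a motif match alone does not establish that a protein is a sortase substrate.

```python
import re

# LPXTG, where X is any amino acid.
LPXTG = re.compile(r"LP.TG")


def find_sortase_motifs(sequence: str, c_terminal_window: int = 50) -> list[int]:
    """Return 0-based start positions of LPXTG motifs within the C-terminal window."""
    sequence = sequence.upper()
    # Assumption: restrict the search to the last `c_terminal_window` residues.
    offset = max(0, len(sequence) - c_terminal_window)
    return [offset + match.start() for match in LPXTG.finditer(sequence[offset:])]


if __name__ == "__main__":
    # Invented sequence with an LPETG motif followed by a basic tail.
    print(find_sortase_motifs("AAALPETGEEKKK"))
```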

Bioinformatic tools

  • Minimotif Miner is a bioinformatics tool that searches protein sequence queries for known protein targeting sequence motifs.
  • Phobius predicts signal peptides based on a supplied primary sequence.
  • SignalP predicts signal peptide cleavage sites.
  • LOCtree predicts the subcellular localization of proteins.

from Grokipedia
Protein targeting, also known as protein sorting, is the cellular process by which newly synthesized proteins are directed from their site of synthesis on ribosomes to specific locations within the cell or exported outside it, ensuring proper function and compartmentalization in all organisms. The process relies on sorting signals embedded in the protein's sequence, which are recognized by dedicated receptor proteins and transport machinery that guide proteins to destinations such as the endoplasmic reticulum (ER), mitochondria, nucleus, peroxisomes, lysosomes, or the plasma membrane. Defects in protein targeting can lead to severe disorders, such as I-cell disease, a lysosomal storage disorder in which lysosomal enzymes are mislocalized because they fail to acquire their mannose-6-phosphate targeting tag.

In eukaryotic cells, protein targeting occurs primarily through two mechanisms: co-translational and post-translational translocation. In co-translational targeting, proteins are translocated across or into membranes during their synthesis on ribosomes, most notably secretory proteins and those destined for the ER, Golgi apparatus, lysosomes, or plasma membrane via the vesicular transport pathway. This process is mediated by the signal recognition particle (SRP), which binds to an N-terminal signal sequence (a short stretch of 15–30 amino acids with a hydrophobic core) on the emerging polypeptide and pauses translation until the ribosome docks at the translocon in the ER membrane. Post-translational targeting, in contrast, occurs after protein synthesis is complete in the cytosol and is used for import into organelles such as mitochondria, chloroplasts (in plants), peroxisomes, and the nucleus, where completed proteins are actively transported through specific pores or channels.

The specificity of targeting is determined by diverse signal sequences or signal patches, which vary by destination and are often cleaved after import to allow proper folding. For instance, mitochondrial proteins feature an amphipathic N-terminal presequence with positively charged residues that directs them to translocases in the outer and inner membranes, while nuclear proteins contain nuclear localization signals (NLS), short basic motifs such as lysine or arginine clusters, that interact with importins for passage through nuclear pore complexes. Peroxisomal targeting signals (PTS) are typically C-terminal tripeptides, such as serine-lysine-leucine (SKL), recognized by cytosolic receptors like PEX5 for import. Vesicular transport in the secretory pathway further refines sorting, with proteins acquiring additional signals in the Golgi for delivery to destinations such as lysosomes (e.g., mannose-6-phosphate tags). Overall, these mechanisms maintain cellular homeostasis by preventing mislocalization, which could otherwise disrupt function or trigger degradation pathways.

Fundamentals

Definition and Cellular Importance

Protein targeting is the process by which newly synthesized proteins are directed from their sites of synthesis, typically ribosomes in the cytosol, to specific destinations such as organelles, membranes, or the extracellular space, using dedicated signals and molecular machinery to ensure precise localization. This directed transport involves recognition of targeting sequences by chaperones and receptors, followed by translocation across or into membranes via specialized complexes.

The cellular importance of protein targeting lies in its role in maintaining compartmentalization, which is essential for functional assembly and the execution of specialized processes; for instance, it enables ATP production in mitochondria, lipid and protein synthesis in the endoplasmic reticulum (ER), and secretion of extracellular components. Without accurate targeting, proteins may aggregate in the cytosol, undergo premature degradation, or engage in off-target interactions, leading to cellular dysfunction; studies indicate that mistargeting can affect 5–30% of translocated proteins in experimental models, underscoring the need for quality-control mechanisms. In eukaryotes, roughly one-third of the proteome requires targeting to the ER for processing and export, while approximately 13% is directed to mitochondria, illustrating the broad impact of the process on the organization of the cellular proteome.

These targeting mechanisms are strongly conserved in evolution. They originated in prokaryotes as basic translocation systems for membrane insertion and secretion, and were later adapted in eukaryotes to support endosymbiotic organelles such as mitochondria and chloroplasts. This conservation reflects the fundamental role of spatial protein organization across the domains of life, with eukaryotic innovations building upon prokaryotic precursors such as the Sec and Tat pathways.

Historical Milestones

In the 1950s and 1960s, George Palade's pioneering use of electron microscopy revealed the secretory pathway in eukaryotic cells, identifying the rough endoplasmic reticulum as the primary site for synthesizing secretory and membrane proteins, which are subsequently transported through the Golgi apparatus to secretory vesicles. This work established the foundational framework for understanding intracellular protein trafficking and earned Palade the Nobel Prize in Physiology or Medicine in 1974, shared with Albert Claude and Christian de Duve.

A major breakthrough came in 1971, when Günter Blobel and David Sabatini proposed the signal hypothesis, suggesting that proteins destined for the secretory pathway or ER membrane contain an N-terminal signal peptide that directs their targeting during or after synthesis. Experimental validation followed in 1975, when Blobel and Bernhard Dobberstein reconstituted translocation in a cell-free system. The signal recognition particle (SRP), a ribonucleoprotein complex that binds the signal peptide on nascent polypeptides to facilitate co-translational targeting to the ER, was subsequently identified by Peter Walter and Blobel in the early 1980s. Blobel's contributions culminated in his receipt of the Nobel Prize in Physiology or Medicine in 1999.

Parallel developments in the 1970s and 1980s focused on organelle-specific targeting. Gottfried Schatz demonstrated that most mitochondrial proteins are encoded by nuclear genes, synthesized in the cytosol, and imported post-translationally via N-terminal presequences, with his 1979 experiments confirming this for cytochrome c1. Schatz's group further identified components of the import machinery, including the TOM complex in the outer mitochondrial membrane, through isolation of the 42 kDa ISP42 protein in 1989. For chloroplasts, Colin Robinson and colleagues in the 1980s characterized transit peptides as N-terminal targeting signals and purified the processing proteases that cleave them upon import, as detailed in their 1984 study on pea chloroplast proteases.
Post-2000 advances leveraged structural biology to visualize targeting mechanisms at near-atomic resolution. Cryo-electron microscopy (cryo-EM) structures of the Sec61 translocon, such as the 2014 mammalian ribosome-Sec61 complex at 3.4 Å resolution by Rebecca M. Voorhees et al., illuminated the channel's role in co- and post-translational translocation across the ER membrane. Concurrently, studies reinforced the essential function of chaperones in post-translational targeting, with 2006 experiments by M. Mokranjac et al. showing that individual chaperone molecules accelerate polypeptide unfolding and import into mitochondria without additional factors. More recent work includes the 2023 cryo-EM structure of the ribosome-Sec61 complex bound to the translocon-associated protein (TRAP) complex by Pavel Itskanov et al., providing further insight into accessory protein interactions during translocation.
Year | Scientist(s) | Breakthrough
1950s–1960s | George Palade | Elucidation of the secretory pathway via electron microscopy, identifying the rough ER's role in protein synthesis and trafficking.
1971 | Günter Blobel, David Sabatini | Proposal of the signal hypothesis for ER targeting via N-terminal signal peptides.
1975 | Günter Blobel, Bernhard Dobberstein | Cell-free reconstitution of translocation supporting the signal hypothesis, which paved the way for the later identification of the signal recognition particle (SRP).
1979 | Gottfried Schatz | Demonstration of post-translational import of mitochondrial proteins such as cytochrome c1.
1984 | Colin Robinson et al. | Identification and purification of chloroplast transit peptide processing proteases.
1989 | Gottfried Schatz et al. | Isolation of ISP42 as a key component of the mitochondrial outer membrane import complex (TOM).
2014 | Rebecca M. Voorhees et al. | Cryo-EM structure of the mammalian ribosome-Sec61 translocon complex at 3.4 Å resolution.
2023 | Pavel Itskanov et al. | Cryo-EM structure of the ribosome-Sec61 complex with the TRAP complex.

Targeting Signals

Signal Peptides and Sequences

Signal peptides, also known as leader peptides, are short N-terminal sequences, typically 15–30 amino acids long, that direct nascent proteins to specific cellular compartments, particularly the secretory pathway in eukaryotes and the export machinery in prokaryotes. These sequences have a tripartite structure consisting of an N-region, an H-region, and a C-region. The N-region is a short, positively charged segment (1–5 residues) that follows the start methionine and is rich in basic amino acids such as lysine (K), arginine (R), and histidine (H); its positive charge facilitates interaction with translocation components. The central H-region forms a hydrophobic core of 7–15 residues, predominantly non-polar amino acids such as leucine (L), isoleucine (I), and valine (V), enabling membrane insertion and targeting. The C-region, typically 3–7 residues long and polar or neutral, contains the cleavage site recognized by signal peptidases and is often shorter in eukaryotes than in prokaryotes.

Mitochondrial targeting presequences represent a specialized class of N-terminal signals, usually 20–80 amino acids in length, characterized by their ability to form amphipathic α-helices with one hydrophobic face and one positively charged face. These presequences are enriched in positively charged residues (K/R) and hydroxylated residues (serine, threonine), while largely lacking acidic residues, which contributes to their net positive charge and helical propensity. A common consensus motif within these presequences is φχχφφ, where φ denotes a hydrophobic residue and χ any amino acid, which supports their structural flexibility and targeting efficiency.

Beyond N-terminal signals, protein targeting employs diverse internal or C-terminal sequences. Peroxisomal targeting signal 1 (PTS1) is a C-terminal motif, most commonly the tripeptide serine-lysine-leucine (SKL) or a close variant, at the end of a short sequence of about 12 amino acids that directs matrix proteins to peroxisomes. Nuclear localization signals (NLS) are typically internal basic clusters; the classic monopartite NLS from the SV40 large T antigen is the heptapeptide proline-lysine-lysine-lysine-arginine-lysine-valine (PKKKRKV), which mediates nuclear import through binding to importin α. For endoplasmic reticulum (ER) retention, soluble resident proteins bear a C-terminal tetrapeptide signal such as lysine-aspartate-glutamate-leucine (KDEL), which prevents loss to the Golgi by facilitating retrieval.

The biophysical properties of these signals are critical for their function. Many are amphipathic, combining hydrophobic and hydrophilic elements to promote membrane association without disrupting bilayer integrity. The H-region of classical signal peptides and mitochondrial presequences often adopts an α-helical conformation in membrane-mimetic environments, which stabilizes interactions and enhances translocation efficiency. Cleavage of these signals occurs after targeting by specific peptidases; for ER-directed signal peptides, the multi-subunit signal peptidase complex (SPC) performs endoproteolytic cleavage at the C-region site following the (−3, −1) rule (small residues such as alanine at positions −3 and −1 relative to the cleavage point, e.g., Ala-X-Ala).

Signal sequences vary across organisms, reflecting adaptations to distinct translocation systems. Bacterial signal peptides share the tripartite N-H-C structure with their eukaryotic counterparts but are generally longer (especially in Gram-positive bacteria) and carry a higher net positive charge in the N-region, with fewer modifications. In archaea, variants often feature twin-arginine (RR) motifs in the N-region for the twin-arginine translocation (TAT) pathway, enabling export of folded proteins across the plasma membrane while maintaining the hydrophobic H-core.

Representative examples illustrate these signals' roles. The preproinsulin signal peptide, a 24-residue N-terminal sequence (MALWMRLLPL LALLALWGPD PAAA), directs the precursor to the ER for co-translational translocation and subsequent processing into mature insulin. For mitochondria, the presequence of cytochrome c oxidase subunit IV (COX4), a 25-residue N-terminal extension (MLSRLLRVSR LGSRRLLPVR ARLA), exemplifies the amphipathic helical motif that targets this nuclear-encoded subunit to the inner mitochondrial membrane.
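
The tripartite layout and the (−3, −1) rule can be checked on the preproinsulin signal peptide quoted above. This is a rough Python sketch rather than a predictor: the hydrophobic residue set, the treatment of glycine and serine as "small" alongside alanine, and the shortcut of calling everything before the longest hydrophobic run the N-region are all simplifying assumptions made for illustration. Tools such as SignalP use trained models instead.

```python
HYDROPHOBIC = set("AVLIMFWC")  # illustrative residue set (assumption)
BASIC = set("KRH")             # basic residues named in the text
SMALL = set("AGS")             # alanine per the text; G and S are assumptions


def longest_hydrophobic_run(seq):
    """Return (start_index, length) of the longest unbroken hydrophobic run."""
    best_start, best_len = 0, 0
    run_start = None
    for i, aa in enumerate(seq + "-"):  # sentinel closes a trailing run
        if aa in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return best_start, best_len


def describe_signal_peptide(sp):
    """Summarize crude N-region, H-region and cleavage-site features."""
    h_start, h_len = longest_hydrophobic_run(sp)
    n_region = sp[:h_start]  # crude: everything upstream of the H-region
    return {
        "length": len(sp),
        "n_region_basic": sum(aa in BASIC for aa in n_region),
        "h_region": sp[h_start:h_start + h_len],
        "obeys_minus3_minus1": sp[-3] in SMALL and sp[-1] in SMALL,
    }


# Preproinsulin signal peptide as quoted in the text (spaces removed).
print(describe_signal_peptide("MALWMRLLPLLALLALWGPDPAAA"))
```

For this sequence the sketch reports an 8-residue hydrophobic core (LLALLALW), one basic residue upstream of it, and alanine at both −3 and −1, consistent with the ranges given above for the H-region and with the Ala-X-Ala cleavage pattern.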

Recognition by Chaperones and Receptors

Molecular chaperones such as Hsp70 and Hsp90 play crucial roles in protein targeting by preventing aggregation of nascent or unfolded polypeptides and maintaining them in states competent for recognition by downstream transport machinery. Hsp70 binds to exposed hydrophobic regions of client proteins in an ATP-dependent manner, stabilizing the unfolded conformations required for subsequent interactions with targeting receptors. Hsp90, often acting later in the folding pathway, collaborates with Hsp70 to remodel proteins and facilitate their delivery to specific organelles, such as mitochondria. Co-chaperones, including J-proteins (also known as DnaJ homologs), enhance the efficiency of these processes by stimulating the ATPase activity of Hsp70, which drives cycles of substrate binding and release.

The signal recognition particle (SRP) serves as a key receptor complex for co-translational targeting to the endoplasmic reticulum (ER), recognizing signal peptides as they emerge from the ribosome. The SRP54 subunit of SRP contains a methionine-rich M domain with a deep hydrophobic groove that accommodates the hydrophobic core of signal peptides through non-specific interactions, enabling broad specificity for diverse signal sequences. Upon binding, SRP pauses translation and docks to the SRP receptor on the ER membrane via reciprocal interactions between their GTPase domains, forming a stable SRP-receptor complex. This association is energy-dependent: GTP hydrolysis by both partners triggers release of the signal peptide to the translocon and dissociation of SRP for recycling.

In mitochondrial protein import, specificity is achieved through distinct outer membrane receptors: TOM20 primarily recognizes N-terminal presequences via interactions with their amphipathic helical structure, while TOM70 preferentially binds internal targeting signals of carrier proteins through hydrophobic contacts. Cytosolic chaperones deliver preproteins to these receptors; for instance, Hsp70 associates with presequence-containing proteins for handover to TOM20, whereas Hsp70 and Hsp90 cooperate to present carrier preproteins to TOM70. This dual-receptor system ensures efficient sorting, with TOM20 handling matrix-destined proteins and TOM70 managing metabolite transporters.

Nuclear import relies on importins as receptors that bind nuclear localization signals (NLS) through electrostatic interactions between basic residues in the NLS and negatively charged grooves in the armadillo repeat domain of importin α. Classical monopartite or bipartite NLS motifs fit into this concave binding surface, promoting high-affinity recognition and formation of the importin-cargo complex for transport through nuclear pores. ATP-dependent chaperone cycles can additionally help keep some NLS-bearing proteins in an import-competent state prior to receptor engagement, ensuring timely delivery to the nucleus.

Translocation Processes

Co-translational Translocation

Co-translational translocation is a process in which protein synthesis on cytosolic ribosomes and the insertion or translocation of the nascent polypeptide across a membrane occur simultaneously, primarily targeting secretory and integral membrane proteins to the endoplasmic reticulum (ER) in eukaryotes. This mechanism ensures that hydrophobic regions of proteins are shielded from the aqueous cytosol and that proteins destined for the secretory pathway are efficiently directed to the ER lumen or membrane. The process is initiated when a hydrophobic signal peptide emerges from the ribosomal exit tunnel, triggering recognition and targeting events that couple translation to insertion.

The key components include the signal recognition particle (SRP), a ribonucleoprotein complex that binds the emerging signal peptide and halts elongation to allow targeting; the SRP receptor (SR) on the ER membrane, which docks the ribosome-nascent chain (RNC) complex; and the Sec61 translocon, a heterotrimeric protein complex (Sec61α, β, and γ) that forms an aqueous pore in the ER membrane through which the polypeptide is threaded. During translocation, oligosaccharyltransferase (OST), associated with the Sec61 complex, catalyzes the en bloc addition of N-linked glycans to asparagine residues in the Asn-X-Ser/Thr sequon on the translocating chain, marking it for further ER processing. Additionally, the ER luminal chaperone BiP, an Hsp70 homolog, binds to the incoming polypeptide via ATP-dependent cycles, acting as a molecular ratchet that prevents back-sliding and drives unidirectional translocation into the lumen.

The process unfolds in distinct steps. When the signal peptide (typically 15–30 residues long with a hydrophobic core) emerges, SRP binds it co-translationally and pauses translation by blocking the elongation factor binding site on the ribosome, creating a targeting window of about 30–60 seconds. The SRP-RNC complex then diffuses to the ER membrane, where it engages the SR in a GTP-dependent manner, leading to handover of the signal peptide to the Sec61 channel; the ribosome associates directly with Sec61, translation resumes, and the chain is threaded through the pore. As the chain elongates, the signal peptide is cleaved by the signal peptidase complex (SPC) associated with the translocon, allowing the mature protein to fold in the ER lumen with assistance from chaperones such as BiP, while transmembrane domains, if present, partition laterally into the lipid bilayer.

This pathway is highly conserved and is the dominant route for ER-targeted proteins in eukaryotes, ensuring fidelity in the secretory pathway. In prokaryotes, an analogous system employs a simpler SRP (lacking some eukaryotic subunits) and the SecYEG translocon to insert inner membrane proteins co-translationally into the plasma membrane, underscoring the evolutionary universality of SRP-dependent targeting across the domains of life. In eukaryotes, BiP provides an additional layer of control through its ATP-driven binding and release, maintaining translocation efficiency against back-sliding of the chain.

Post-translational Translocation

Post-translational translocation refers to the process by which proteins fully synthesized in the cytosol are imported into organelles such as mitochondria and peroxisomes, in contrast to co-translational mechanisms that couple synthesis to membrane crossing. In this pathway, proteins are maintained in a translocation-competent state by cytosolic chaperones, such as Hsp70, which prevent aggregation and facilitate targeting to organelle receptors before the protein is threaded through specialized translocons powered by ATP hydrolysis or the membrane potential. This mechanism allows cytosolic maturation steps, such as formation of folding intermediates or cofactor assembly, to occur prior to import.

In mitochondria, post-translational import into the matrix primarily involves the TIM23 complex, a translocon in the inner membrane composed of the core subunits Tim23 and Tim17, which forms a protein-conducting channel. Precursor proteins bearing N-terminal presequences are first recognized by outer membrane receptors of the TOM complex and transferred to TIM23 via accessory factors such as Tim50. The process is driven by the membrane potential (Δψ) across the inner membrane, which electrophoretically pulls the positively charged presequence through the channel, while matrix-localized mtHsp70, in association with Tim44 and the PAM motor, uses ATP to unfold the protein and generate a pulling force through iterative binding and release cycles. Following translocation, the presequence is cleaved by the mitochondrial processing peptidase (MPP) in the matrix to yield the mature protein. A representative example is the import of matrix proteins such as the β-subunit of F1-ATPase, which follows this presequence pathway.

For peroxisomes, post-translational import relies on peroxisomal targeting signals (PTS), particularly PTS1, a C-terminal tripeptide such as -SKL that is recognized by the soluble receptor Pex5 in the cytosol. The Pex5-cargo complex docks at the peroxisomal membrane via the Pex13-Pex14 complex, forming a transient translocon pore approximately 9 nm in diameter that accommodates folded proteins or oligomers. Translocation occurs without unfolding, facilitated by a nuclear pore-like hydrogel meshwork in the membrane formed by the YG domain of Pex13, which selectively permits diffusion of Pex5-bound cargo; after import, Pex5 is monoubiquitinated and recycled by the AAA ATPases Pex1 and Pex6. The advantage of this pathway lies in its ability to import pre-assembled protein complexes, preserving enzymatic activity and enabling rapid peroxisome function in oxidative metabolism. An example is the PTS1-mediated import of matrix enzymes that fold in the cytosol before entering the matrix.

In prokaryotes and chloroplasts, the twin-arginine translocation (TAT) pathway exemplifies post-translational transport of folded proteins across energy-transducing membranes, using twin-arginine motifs in signal peptides that are recognized by Tat receptors. The TAT translocon, comprising TatA, TatB, and TatC, harnesses the proton motive force to drive translocation without ATP, allowing transport of cofactor-containing proteins such as photosynthetic enzymes in chloroplasts. This mechanism underscores the versatility of post-translational translocation in preserving the folded states that many proteins need to function.

Eukaryotic Organelle Sorting

Mitochondrial Targeting Pathways

Mitochondrial targeting pathways in eukaryotes ensure the precise delivery of nuclear-encoded proteins to specific subcompartments of the organelle, including the outer membrane (OM), intermembrane space (IMS), inner membrane (IM), and matrix. These pathways primarily operate post-translationally, with most precursor proteins synthesized in the cytosol and recognized by receptors on the mitochondrial surface before translocation. The translocase of the outer membrane (TOM) complex serves as the universal entry gate for nearly all mitochondrial proteins; Tom20 and Tom70 act as primary receptors for presequence-containing and carrier proteins, respectively, facilitating passage through the Tom40 channel. For beta-barrel proteins destined for the OM, such as porins, import via TOM is followed by handover to the sorting and assembly machinery (SAM) complex, which inserts these proteins into the OM; Sam50 forms the core channel, with Mdm10 stabilizing the process.

Sorting signals guide this compartmental specificity. Matrix-targeted proteins typically feature cleavable N-terminal presequences, which form amphipathic alpha-helices rich in positive charges and are recognized by cytosolic chaperones such as Hsp70 and by mitochondrial receptors. In contrast, IM carrier proteins, such as adenine nucleotide translocators, possess internal targeting signals consisting of multiple transmembrane domains with moderate hydrophobicity and lack cleavable presequences. After crossing the OM via TOM, IMS proteins can follow a stop-transfer mechanism, in which hydrophobic segments halt translocation and promote lateral release into the IMS, or the oxidative folding pathway mediated by the MIA machinery. The latter involves Mia40, an oxidoreductase that captures incoming cysteine-rich precursors via mixed disulfide bonds, enabling their oxidative folding and trapping them in the IMS; this is particularly important for proteins with twin CX3C or CX9C motifs, such as Cox17.

For IM insertion, proteins diverge at the IM translocases. Presequence-containing proteins engage the TIM23 complex, where the presequence is driven across the IM by the membrane potential (Δψ) and pulled into the matrix by the ATP-dependent action of mtHsp70 within the presequence translocase-associated motor (PAM) complex, often with Tim44 as a scaffold. Some TIM23 substrates with additional hydrophobic stop-transfer signals are sorted laterally into the IM via the TIM23-SORT subcomplex, involving Tim21 and Mgr2. Carrier proteins, meanwhile, are escorted by small TIM chaperones (Tim9-Tim10 or Tim8-Tim13) across the IMS to the TIM22 complex, where Δψ facilitates insertion of their multiple transmembrane helices into the IM. In the matrix, presequences are cleaved by mitochondrial processing peptidase (MPP), and imported proteins reach their native fold with assistance from chaperones such as mtHsp70, Hsp60, and Hsp10, which prevent aggregation.

Quality control mechanisms monitor these pathways to handle mislocalized or unfolded proteins. The mitochondrial unfolded protein response (UPRmt) detects import defects or the accumulation of mislocalized precursors in the cytosol, triggering transcriptional upregulation of chaperones and proteases via ATFS-1/ATF5 and the integrated stress response to restore homeostasis. Additionally, cytosolic surveillance by the ubiquitin-proteasome system degrades unimported precursors, while intramitochondrial proteases such as LON and i-AAA degrade aberrant imports.
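
The sorting rules described in this section can be summarized as a small lookup. The Python sketch below is a teaching aid under stated assumptions: the `Precursor` fields, the function name, and the order in which features are checked are all hypothetical choices made for illustration, and real precursors can carry combinations of signals that this simplification does not capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Hypothetical feature flags for a nuclear-encoded precursor."""
    has_presequence: bool = False     # cleavable N-terminal presequence
    has_stop_transfer: bool = False   # extra hydrophobic sorting signal
    is_beta_barrel: bool = False      # e.g. porins
    is_carrier: bool = False          # internal signals, no presequence
    has_cysteine_motif: bool = False  # twin CX3C / CX9C, e.g. Cox17


def mitochondrial_route(p):
    """Return (machinery, destination) following the rules in the text.

    The check order is an assumption made for this sketch.
    """
    if p.is_beta_barrel:
        return ["TOM", "SAM"], "outer membrane"
    if p.has_cysteine_motif:
        return ["TOM", "MIA"], "intermembrane space"
    if p.is_carrier:
        return ["TOM", "small TIMs", "TIM22"], "inner membrane"
    if p.has_presequence and p.has_stop_transfer:
        return ["TOM", "TIM23"], "inner membrane"
    if p.has_presequence:
        return ["TOM", "TIM23", "PAM", "MPP"], "matrix"
    raise ValueError("no recognised mitochondrial targeting feature")


print(mitochondrial_route(Precursor(has_presequence=True)))
print(mitochondrial_route(Precursor(is_carrier=True)))
```

Every route starts at TOM, mirroring its role as the common entry gate; the branches differ only in which downstream machinery takes over.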

Chloroplast Targeting Pathways

In photosynthetic eukaryotes, the chloroplast proteome consists of approximately 3,000 proteins, the vast majority of which (over 90%) are encoded by the nuclear genome and synthesized in the cytosol as precursors bearing N-terminal transit peptides (TPs). These precursors are imported across the double-membrane envelope via the coordinated action of the translocon at the outer envelope membrane of chloroplasts (TOC) and the translocon at the inner envelope membrane of chloroplasts (TIC). The TOC complex, comprising the receptors Toc159 and Toc34 for initial TP recognition and the β-barrel channel Toc75 as the protein-conducting pore, facilitates envelope crossing in an energy-dependent manner driven by GTP hydrolysis and ATP-powered chaperones. After translocation through the inner envelope via the TIC complex, which primarily involves the channel-forming Tic110 and the scaffold protein Tic40, the precursors reach the stroma, where stromal chaperones prevent aggregation and help drive import.

In the stroma, the transit peptides are cleaved by the stromal processing peptidase (SPP), releasing the mature protein for subsequent sorting or folding. This post-import processing ensures proper localization within the chloroplast, which is unique among eukaryotic organelles in having an additional internal membrane system that houses the photosynthetic apparatus.

Intra-chloroplast sorting to the thylakoid lumen employs bipartite targeting signals, consisting of the stroma-targeting TP followed by a thylakoid transfer domain resembling bacterial signal peptides. Lumenal proteins are routed via two distinct post-translational pathways: the Sec-dependent pathway (cpSec), which translocates unfolded precursors in an ATP-dependent manner, as seen with the oxygen-evolving complex protein OE33 (PsbO); and the twin-arginine translocation (TAT) pathway (cpTat), which transports fully folded proteins using the proton motive force across the thylakoid membrane, exemplified by OE17 (PsbQ) and OE23 (PsbP). These pathways reflect evolutionary conservation from bacterial ancestors, with cpTat uniquely suited to cofactor-bound proteins.

Proteins destined for the thylakoid membrane, such as light-harvesting chlorophyll a/b-binding proteins (LHCPs), use targeting information that becomes available after stromal cleavage, often in conjunction with the chloroplast signal recognition particle (cpSRP) pathway for post-translational integration. This SRP-dependent mechanism involves GTP hydrolysis to target hydrophobic precursors to the thylakoid membrane, preventing aggregation in the aqueous stroma.

A subset of nuclear-encoded proteins is dually targeted to both chloroplasts and mitochondria, mediated by ambiguous N-terminal signals that share features of mitochondrial targeting sequences and chloroplast TPs, so that distribution depends on chaperone interactions and import kinetics. Examples include the protease Lon1 and components of the glycine decarboxylase complex, which support metabolic coordination between these organelles without requiring alternative splicing or processing. This dual localization enhances efficiency in plant cells, where such proteins constitute a small but functionally significant fraction of the organellar proteomes.

Nuclear Import and Export

Nuclear import and export enable the selective trafficking of proteins and RNAs between the cytoplasm and nucleus in eukaryotic cells, mediated by the nuclear pore complexes (NPCs) embedded in the nuclear envelope. These processes are essential for nuclear function and the maintenance of nuclear integrity. Small molecules and proteins below approximately 40 kDa can passively diffuse through the NPC, while larger macromolecules require active, energy-dependent transport facilitated by soluble receptors known as karyopherins. The directionality of transport is driven by a Ran-GTP gradient, with high Ran-GTP concentrations in the nucleus maintained by the chromatin-bound guanine nucleotide exchange factor RCC1 and the cytoplasmic GTPase-activating protein RanGAP1.

The import mechanism begins when nuclear localization signals (NLSs) on cargo proteins are recognized by importin α, which binds to importin β to form a heterodimeric receptor complex that carries the cargo through the NPC. This complex docks at the NPC via interactions with nucleoporins and moves back and forth within the pore by facilitated diffusion until nuclear Ran-GTP binds to importin β, dissociating the complex and releasing the cargo into the nucleoplasm. Importin β is then recycled to the cytoplasm bound to Ran-GTP, where GTP hydrolysis dissociates the pair, allowing reuse. The Ran-GTP gradient thus ensures unidirectional import by promoting cargo release only in the nucleus.

In contrast, nuclear export relies on nuclear export signals (NESs), typically leucine-rich motifs, recognized by exportins such as CRM1 (also known as XPO1), which forms a ternary complex with the NES-cargo and Ran-GTP in the nucleus. This complex translocates through the NPC to the cytoplasm, where Ran-GTP hydrolysis, facilitated by RanGAP1 and RanBP1, releases the cargo and recycles the exportin. For RNA export, the receptor NXF1 (also called TAP in humans), often with the NXT1 adaptor, mediates bulk mRNA export independently of Ran-GTP, binding mRNA via RNA-binding proteins and interacting with FG-nucleoporins. CRM1 handles export of proteins, snRNAs, and some mRNAs, ensuring precise spatiotemporal control.

The NPC is a ~120 MDa scaffold with eightfold symmetry, comprising ~30 distinct nucleoporins (Nups). FG-nucleoporins (FG-Nups) line the central channel and form a selective permeability barrier through hydrophobic interactions and assembly into a hydrogel-like mesh. Karyopherin-cargo complexes interact transiently with FG repeats to partition through this barrier, enabling high throughput, with each NPC supporting up to ~1,000 translocation events per second without compromising selectivity.

Nuclear transport is regulated by post-translational modifications, notably phosphorylation, which modulates NLS and NES accessibility or receptor affinity in a cell cycle-dependent manner. For instance, phosphorylation near the NLS of v-Jun inhibits importin binding during part of the cell cycle and restricts nuclear entry until the modification is removed. Such controls link transport to cell cycle progression, ensuring that proteins such as cyclins enter the nucleus at the appropriate phases.

Endoplasmic Reticulum Targeting

Proteins destined for the secretory pathway or for insertion into the endoplasmic reticulum (ER) membrane are primarily targeted co-translationally. Nascent polypeptides bearing an N-terminal signal sequence are recognized by the signal recognition particle (SRP), a ribonucleoprotein complex that binds to the signal sequence as it emerges from the ribosome. This interaction pauses translation and directs the ribosome-nascent chain complex to the ER membrane via docking to the SRP receptor (SR). The SRP then facilitates transfer of the ribosome-nascent chain complex to the Sec61 translocon, a heterotrimeric protein channel composed of Sec61α, Sec61β, and Sec61γ subunits, which serves as the primary conduit for protein translocation across or into the ER membrane. Co-translational insertion ensures that hydrophobic transmembrane domains are shielded from the cytosol during membrane integration, minimizing aggregation risks.

Once in the ER lumen or membrane, proteins are subject to retention mechanisms that prevent unintended export. Soluble luminal proteins, such as chaperones, contain a C-terminal KDEL sequence (Lys-Asp-Glu-Leu) that binds to KDEL receptors in the cis-Golgi, triggering retrograde transport back to the ER via COPI vesicles. For type I membrane proteins, a dilysine motif (KKXX, where X is any amino acid) in the C-terminal cytoplasmic tail interacts with coat protein I (COPI) components, similarly mediating retrieval from the Golgi. These signals ensure that ER-resident proteins maintain their localization, with the KDEL receptor's affinity modulated by pH and calcium gradients between compartments.

Quality control in the ER involves chaperone-assisted folding and degradation pathways. The calnexin/calreticulin cycle regulates glycoprotein folding: upon translocation, N-linked glycans are added to asparagine residues in the Asn-X-Ser/Thr sequon by oligosaccharyltransferase (OST), which is associated with the Sec61 translocon. Glucosidases I and II trim the two outermost glucose residues, leaving a monoglucosylated glycan that binds the chaperones calnexin (membrane-bound) or calreticulin (luminal). These chaperones retain the glycoprotein while it folds, and UDP-glucose:glycoprotein glucosyltransferase (UGGT) reglucosylates misfolded proteins so that they re-enter the cycle.

Misfolded or unassembled proteins are directed to ER-associated degradation (ERAD), where they are retrotranslocated through Sec61 or other channels into the cytosol, polyubiquitinated by E3 ligases such as Hrd1 or Doa10, and degraded by the 26S proteasome. This ubiquitin-proteasome pathway eliminates terminally misfolded proteins, maintaining ER homeostasis.

Properly folded proteins exit the ER via COPII-coated vesicles budding from ER exit sites (ERES). The GTPase Sar1 recruits the Sec23/24-Sec13/31 coat complex to the ER membrane, where Sec24 acts as a cargo adaptor selecting soluble and membrane proteins with specific motifs, such as dileucine or diacidic sequences. Cargo receptors like Erv29 further enhance packaging of secretory proteins into these 60-80 nm vesicles, which fuse with the cis-Golgi or intermediate compartments to initiate anterograde transport. This selective packaging ensures efficient sorting while excluding ER residents.
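The sequence signals named in this section (the Asn-X-Ser/Thr sequon, the C-terminal KDEL signal, and the KKXX dilysine motif) are simple enough to scan for with pattern matching. The following is a minimal sketch, not a validated predictor: the function names are hypothetical, and the exclusion of proline at the X position is an assumption taken from the commonly cited refinement of the sequon rather than from the text above.

```python
import re

# Asn-X-Ser/Thr sequon. Excluding Pro at position X is an assumption
# (a commonly cited refinement), not something stated in the article.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of the Asn in each candidate sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def er_retrieval_signal(seq):
    """Return 'KDEL', 'KKXX', or None based on the last four residues."""
    tail = seq[-4:]
    if tail == "KDEL":
        return "KDEL"
    if len(tail) == 4 and tail[:2] == "KK":
        return "KKXX"
    return None


# Toy sequences, invented for illustration only.
print(find_sequons("MANGSAANPTNAT"))
print(er_retrieval_signal("MAAAAKDEL"))
print(er_retrieval_signal("MAAAAKKLS"))
```

A match only marks a candidate site. Whether a sequon is actually glycosylated or a tail actually mediates retrieval depends on topology and context that a pattern scan cannot see.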

Peroxisomal Targeting

Peroxisomal targeting enables the selective import of proteins into peroxisomes, single-membrane-bound organelles specialized for oxidative reactions such as beta-oxidation and detoxification. Unlike many other organelles, peroxisomes import fully folded proteins, oligomeric complexes, and even proteins bound to cofactors, without requiring unfolding prior to translocation across the membrane. This post-translational process relies on specific targeting signals and receptor-mediated shuttling, allowing peroxisomes to assemble functional complexes rapidly in response to cellular needs.

The majority of peroxisomal matrix proteins contain a peroxisomal targeting signal type 1 (PTS1), a short C-terminal sequence with the consensus motif serine-lysine-leucine (SKL) or close variants such as serine-lysine-methionine (SKM). This signal is recognized in the cytosol by the soluble receptor protein PEX5, which binds the PTS1 motif via tetratricopeptide repeat domains in its C-terminal region. A smaller group of proteins, primarily involved in early biosynthetic pathways, use the peroxisomal targeting signal type 2 (PTS2), an N-terminal nonapeptide with the consensus RLx5HL (arginine-leucine, five residues of any kind, then histidine-leucine), where x represents any amino acid. The PTS2 is specifically bound by the receptor PEX7, which forms a complex with PEX5 to facilitate import. These signals ensure precise sorting, with PTS1 directing over 90% of matrix proteins and PTS2 handling the rest.

Once bound to their receptors, proteins are shuttled to the peroxisomal membrane, where PEX5 (for PTS1) or the PEX5-PEX7 complex (for PTS2) docks via interactions with the membrane-anchored peroxins PEX13 and PEX14, forming a transient import pore. Cargo release occurs inside the peroxisomal matrix, after which the receptors are monoubiquitinated at a conserved cysteine residue by an E3 ligase complex composed of PEX2, PEX10, and PEX12. This ubiquitination marks the receptors for extraction back to the cytosol by the AAA ATPase peroxins PEX1 and PEX6, enabling receptor recycling and preventing their accumulation on the membrane. The tolerance for folded structures is exemplified by the import of catalase, a tetrameric enzyme with bound cofactors, which assembles in the cytosol before translocation.

Peroxisomes arise de novo from pre-peroxisomal vesicles derived from the ER membrane, which bud off and fuse with additional precursors to form mature organelles. Each peroxisome typically incorporates around 50 distinct matrix proteins, reflecting a compact proteome tailored to metabolic functions. This biogenesis pathway supports dynamic peroxisome proliferation in response to cellular demands.
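The two matrix-targeting signals described in this section can be told apart by where they sit and what they look like: PTS1 is a C-terminal tripeptide, PTS2 an N-terminal nonapeptide. A minimal sketch of that distinction follows. The function name and the 40-residue N-terminal search window are illustrative assumptions, and only the motif variants named in the text (SKL, SKM, RLx5HL) are encoded.

```python
import re

# PTS1 variants named in the text; real proteins use a wider set.
PTS1_TRIPEPTIDES = {"SKL", "SKM"}

# PTS2 consensus from the text: R-L, any five residues, H-L.
PTS2 = re.compile(r"RL.{5}HL")


def classify_pts(seq, n_window=40):
    """Return 'PTS1', 'PTS2', or None for a protein sequence.

    n_window (assumed value) limits the PTS2 search to the N-terminal region.
    """
    if seq[-3:] in PTS1_TRIPEPTIDES:
        return "PTS1"
    if PTS2.search(seq[:n_window]):
        return "PTS2"
    return None


# Toy sequences, invented for illustration only.
print(classify_pts("MAAAAAASKL"))
print(classify_pts("MHRLQVVLGHLAAAA"))
print(classify_pts("MAAAAAA"))
```

In the text's terms, a "PTS1" result corresponds to recognition by PEX5 and a "PTS2" result to recognition by PEX7. The sketch says nothing about binding affinity or import efficiency.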

Prokaryotic and Archaeal Targeting

Gram-negative Bacterial Systems

Gram-negative bacteria possess a complex cell envelope consisting of an inner cytoplasmic membrane, a thin peptidoglycan layer, and an outer membrane, necessitating specialized mechanisms for protein targeting to the periplasm and outer membrane. Protein export primarily occurs via the Sec and TAT pathways across the inner membrane, followed by further sorting to the outer membrane or retention in the periplasm. These systems ensure the proper localization of enzymes, structural components, and virulence factors essential for bacterial survival and pathogenicity.

The Sec pathway is the predominant route for exporting unfolded or partially folded proteins across the inner membrane to the periplasm in Gram-negative bacteria such as Escherichia coli. It uses the heterotrimeric SecYEG translocon embedded in the inner membrane, where SecY forms the protein-conducting channel, SecE stabilizes the complex, and SecG facilitates translocation cycles. This pathway supports both co-translational translocation, guided by the signal recognition particle (SRP) and its receptor FtsY, and post-translational translocation, involving the chaperone SecB to prevent premature folding. Sec signal peptides, typically 18–30 amino acids long, feature a positively charged N-region, a hydrophobic core H-region, and a polar C-region with an AXA cleavage motif recognized by signal peptidase I for removal upon export. The energy for Sec-mediated translocation is provided by the ATPase activity of SecA, a peripheral membrane protein that undergoes cyclic conformational changes to drive stepwise substrate threading through the SecYEG channel, with each cycle advancing approximately 4–7 residues.

In contrast, the twin-arginine translocation (TAT) pathway exports fully folded proteins across the inner membrane, particularly those that bind cofactors in the cytoplasm before export, such as redox enzymes (e.g., copper amine oxidase). The TAT system in E. coli comprises TatA, TatB, and TatC, forming a receptor complex in which TatB and TatC recognize the substrate, and multiple TatA protomers assemble into a dynamic pore for translocation. TAT signal peptides resemble Sec signals but include a conserved twin-arginine motif (S/T-R-R-x-F-L-K) at the boundary between the N-region and the H-region, which confers pathway specificity. Unlike the Sec pathway, TAT translocation is powered exclusively by the proton motive force (PMF) across the inner membrane, with the transmembrane electrochemical gradient (Δψ component) driving the process without ATP hydrolysis. This PMF-dependent mechanism allows TAT to function under varying metabolic conditions, though it is sensitive to dissipation of the gradient.

Once in the periplasm, proteins destined for the outer membrane are targeted via chaperone-mediated pathways. Beta-barrel outer membrane proteins (OMPs), such as porins, are escorted by periplasmic chaperones like SurA and Skp to the beta-barrel assembly machinery (BAM) complex, a five-subunit system (BamA–E) anchored in the outer membrane. BamA, an OMP with a β-barrel domain and periplasmic POTRA repeats, catalyzes the insertion and folding of substrate barrels in an ATP-independent manner, using lateral opening of its barrel for substrate handover. The Lpt (lipopolysaccharide transport) system, involving seven proteins (LptA–G), bridges the periplasm to transport lipopolysaccharide (LPS) from the inner membrane to the outer leaflet of the outer membrane: LptB2FGC forms an ABC transporter complex at the inner membrane, LptA forms a periplasmic bridge, and the LptDE translocon inserts LPS into the outer membrane. While the Lpt machinery primarily handles LPS, it is coordinated with OMP biogenesis to maintain outer membrane integrity. These outer membrane targeting steps highlight the coordinated protein-protein interactions essential for Gram-negative envelope assembly.
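Two quantitative points from this section lend themselves to a short sketch: the twin-arginine consensus that separates TAT signal peptides from Sec ones, and the stepwise nature of SecA-driven threading. The code below is illustrative only. The function names are hypothetical, the strict regular expression encodes the full S/T-R-R-x-F-L-K consensus even though natural substrates often deviate at some positions, and the cycle estimate simply divides chain length by the 4–7 residues per cycle quoted in the text.

```python
import math
import re

# Strict form of the consensus given in the text: S/T-R-R-x-F-L-K.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def has_tat_motif(signal_peptide):
    """True if the peptide contains the full twin-arginine consensus."""
    return bool(TAT_MOTIF.search(signal_peptide))


def seca_cycles(chain_length, residues_per_cycle):
    """Estimated number of SecA cycles needed to thread a chain.

    Assumes a constant step size, which is a simplification.
    """
    return math.ceil(chain_length / residues_per_cycle)


# Toy signal peptides, invented for illustration only.
print(has_tat_motif("MKTSRRDFLKAAALA"))
print(has_tat_motif("MKKTAIAIAVALAGFATVAQA"))

# Range of cycles for a 300-residue substrate at 7 and 4 residues per cycle.
print(seca_cycles(300, 7), seca_cycles(300, 4))
```

Under these assumptions a 300-residue protein would need somewhere between 43 and 75 SecA cycles. A TAT substrate needs none, since that pathway is driven by the proton motive force.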

Gram-positive Bacterial Systems

In Gram-positive bacteria, protein targeting primarily involves translocation across a single cytoplasmic membrane, lacking the outer membrane characteristic of Gram-negatives, which allows direct release of secreted proteins into the extracellular environment. The general secretion (Sec) pathway dominates this process, facilitating the export of unfolded proteins via the SecYEG translocon in a manner analogous to other bacteria, driven by the ATPase SecA and guided by N-terminal signal peptides. This pathway handles the majority of secretory proteins, including those destined for the cell wall or extracellular space, and is essential for processes such as nutrient acquisition, cell wall maintenance, and virulence factor deployment. For membrane protein insertion, Gram-positive bacteria rely on the YidC insertase, a conserved chaperone that operates independently or in cooperation with the Sec machinery to integrate transmembrane helices into the lipid bilayer. In model organisms like Bacillus subtilis, YidC homologs such as SpoIIIJ (YidC1) and YqjG (YidC2) support the biogenesis of respiratory chain complexes and other essential membrane proteins, with either paralog sufficient for viability, though double mutants are lethal. Alternative pathways expand targeting capabilities; for instance, the ESX (Type VII) secretion system in actinobacteria like Mycobacterium tuberculosis exports folded proteins, including virulence factors such as ESAT-6 and CFP-10, across the membrane using a specialized ATPase-driven apparatus without classical signal peptides. Additionally, the phage shock protein (Psp) response, present in some Gram-positives like Streptococcus pneumoniae, mitigates envelope stress that could impair targeting by stabilizing the membrane during protein translocation overload. Specific targeting signals direct proteins to distinct fates. 
Lipoproteins, which anchor in the outer leaflet of the cytoplasmic membrane, are recognized by a conserved lipobox motif (L-[S/T/A/V]-[A/G]-C) in their signal peptide, enabling lipidation at the cysteine residue by prolipoprotein diacylglyceryl transferase (Lgt), followed by signal peptide cleavage and sorting. This pathway is crucial for envelope integrity and immune evasion in pathogens. For extracellular localization, sortase enzymes covalently anchor surface proteins to peptidoglycan via an LPXTG sorting motif, with sortase A in species including B. subtilis linking pilins, adhesins, and enzymes to the cell wall, enhancing host colonization. In B. subtilis, the Sec secretome comprises around 200-300 proteins, including hydrolases such as proteases, which constitute up to 25% of total cellular protein under optimal growth conditions, underscoring the pathway's role in industrial enzyme production and environmental adaptation.
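The lipobox and LPXTG motifs described above translate directly into regular expressions. The sketch below shows one way to locate them, with hypothetical function names and an assumed 40-residue window for the signal peptide. It reports positions only and makes no claim about whether Lgt or a sortase would actually act on the site.

```python
import re

# Lipobox as written in the text: L-[S/T/A/V]-[A/G]-C.
LIPOBOX = re.compile(r"L[STAV][AG]C")

# Sortase sorting motif: L-P-X-T-G, where X is any residue.
LPXTG = re.compile(r"LP.TG")


def lipobox_cysteine(seq, n_window=40):
    """Return the 1-based position of the lipidated Cys, or None.

    n_window (assumed value) restricts the search to the signal peptide region.
    """
    m = LIPOBOX.search(seq[:n_window])
    # m.end() is the 0-based index just past the Cys, which equals
    # the 1-based position of the Cys itself.
    return m.end() if m else None


def lpxtg_position(seq):
    """Return the 1-based start of the first LPXTG motif, or None."""
    m = LPXTG.search(seq)
    return m.start() + 1 if m else None


# Toy sequences, invented for illustration only.
print(lipobox_cysteine("MKKIALLLAGCSSNQ"))
print(lpxtg_position("AAALPETGEEAAA"))
```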

Archaeal Mechanisms

Archaea employ a combination of protein targeting mechanisms that exhibit both prokaryotic and eukaryotic characteristics, facilitating the export of proteins across the cytoplasmic membrane and their integration into the membrane. These systems are adapted to diverse environments, including extreme conditions, and primarily involve the Sec pathway for unfolded proteins and a signal recognition particle (SRP) system reminiscent of eukaryotes for co-translational targeting. Additional pathways, such as twin-arginine translocation (TAT)-like systems, enable the export of folded proteins in certain lineages.

The Sec pathway in archaea uses homologs of the bacterial SecYE translocon, consisting of SecY and SecE, to mediate the translocation of secretory and membrane proteins across or into the cytoplasmic membrane. Unlike bacteria, archaea lack a SecA homolog, relying instead on ribosomal stalling during co-translational translocation or possibly on ion gradients for post-translational export, with SecDF aiding in later stages. This pathway processes proteins bearing N-terminal signal peptides, which are cleaved upon translocation, and is essential for viability, as demonstrated by the ability of archaeal SecY to complement bacterial mutants.

The SRP system in archaea closely mirrors the eukaryotic version, featuring a heterodimeric SRP composed of SRP54 (also called Ffh) and SRP19 bound to an SRP RNA, which recognizes hydrophobic signal sequences or transmembrane domains on nascent polypeptides emerging from the ribosome. This leads to translational pausing and targeting of the ribosome-nascent chain complex to the Sec translocon via interaction with the SRP receptor FtsY, a GTPase homologous to eukaryotic SRα. SRP54 is essential for cell viability and for membrane protein biogenesis, with reconstitution studies in Haloferax volcanii confirming its co-translational role.

Some archaeal species possess a TAT-like system for exporting fully folded proteins, using twin-arginine motifs in N-terminal signal peptides and the core components TatA and TatC, but lacking the TatB component of the Gram-negative system. This pathway predominates in haloarchaea, where over 90% of secreted proteins in Halobacterium sp. NRC-1 are TAT-dependent, powered by the sodium motive force rather than the proton motive force, which accommodates rapid folding in high-salt environments. Examples include the halocin HalH4 in Haloferax mediterranei.

Archaeal membrane proteins are predominantly α-helical and follow the "positive inside rule" for topology, with insertion often coordinated by the SRP-Sec system. Signal peptidases in archaea, such as type I homologs (e.g., Sec11), have catalytic mechanisms more akin to eukaryotic signal peptidase complexes than to bacterial ones: they lack the conserved lysine of the bacterial Ser-Lys catalytic dyad and rely on a serine-histidine-based mechanism to cleave signal peptides after translocation. These enzymes, present in duplicates like Sec11a and Sec11b in Haloferax volcanii, play distinct roles in processing secretory and membrane proteins.

In thermophilic archaea, such as Pyrococcus furiosus, protein targeting is supported by heat-stable chaperones that prevent aggregation at high temperatures, including group II chaperonins and small heat shock proteins (sHSPs). These chaperones, like the exceptionally stable Cpn from P. furiosus, assist in folding and in maintaining translocation-competent states for hyperthermophilic enzymes, such as α-amylase, without ATP-dependent motors like SecA. The heat shock response in P. furiosus upregulates these chaperones, enhancing export of surface proteins like flagellar components for motility in extreme heat.

Pathological Implications

Diseases Linked to Targeting Defects

Defects in mitochondrial protein targeting contribute to several mitochondrial disorders, primarily through disruptions in the import and assembly of nuclear-encoded proteins into the respiratory chain complexes. For instance, recessive forms of Leber hereditary optic neuropathy (LHON) have been linked to biallelic mutations in DNAJC30, a chaperone involved in complex I biogenesis, which impair the import and assembly of nuclear-encoded subunits, leading to optic nerve degeneration and vision loss. Mutations in translocase components, such as TIMM50 in the TIM23 complex, disrupt preprotein import across the inner mitochondrial membrane, causing severe mitochondrial disorders.

Endoplasmic reticulum (ER) targeting failures often trigger ER stress and the unfolded protein response (UPR), contributing to diseases like cystic fibrosis and lysosomal storage disorders. In cystic fibrosis, the most common mutation, ΔF508 in the CFTR gene, causes the CFTR protein to misfold in the ER, leading to its recognition by ER-associated degradation (ERAD) machinery and subsequent ubiquitination and proteasomal destruction. This prevents proper trafficking to the plasma membrane and results in impaired chloride transport. This ER retention and degradation pathway is a central mechanism in the disease's pathogenesis. A classic example of a lysosomal targeting defect is I-cell disease (mucolipidosis II), caused by mutations in the GNPTAB gene encoding GlcNAc-1-phosphotransferase. The defective enzyme fails to add mannose-6-phosphate tags to lysosomal enzymes in the Golgi, resulting in their secretion instead of lysosomal delivery, accumulation of undegraded substrates, and severe multisystem involvement including skeletal dysplasia and developmental delay.

Peroxisomal targeting defects underlie severe disorders such as Zellweger spectrum disorder (ZSD), characterized by mutations in PEX genes that encode peroxins essential for peroxisome biogenesis and protein import. Mutations in PEX1, the most common cause, accounting for nearly two-thirds of cases, disrupt receptor recycling and the translocation of PTS1- and PTS2-bearing proteins across the peroxisomal membrane, leading to absent or dysfunctional peroxisomes, accumulation of very long-chain fatty acids, and multisystem failure including hypotonia, seizures, and liver dysfunction. These import blocks prevent the assembly of peroxisomal enzymes critical for lipid metabolism.

Nuclear protein targeting impairments are implicated in neurodegenerative diseases like amyotrophic lateral sclerosis (ALS). In ALS, TDP-43 mislocalization from the nucleus to the cytoplasm is a hallmark pathology, often driven by defects in nuclear import receptors such as importin β, which fail to efficiently transport TDP-43 via its nuclear localization signal, resulting in cytoplasmic aggregation, RNA processing disruptions, and neuronal death. This mislocalization exacerbates neuronal toxicity and is observed in over 95% of sporadic cases.

In prokaryotes, disruptions in bacterial protein targeting pathways can be exploited for therapeutic purposes in infections. Inhibitors targeting SecA, the ATPase motor of the Sec system in bacteria, block the post-translational translocation of secretory proteins across the cytoplasmic membrane, leading to protein export failure and bacterial lethality. Compounds including pyrazolopyrimidinone derivatives have been identified as SecA inhibitors with antibacterial potential, highlighting the Sec pathway as a target for novel antibiotics against multidrug-resistant pathogens.

Therapeutic Targeting Strategies

Therapeutic modulation of protein targeting pathways has emerged as a promising strategy for treating disorders arising from defective protein localization, with modulators designed to enhance folding, import, or translocation efficiency. In the endoplasmic reticulum (ER), chemical chaperones such as 4-phenylbutyrate (4-PBA) promote proper folding of mislocalized proteins by reducing ER stress and facilitating secretion in conditions involving protein misfolding. For instance, 4-PBA has been shown to restore trafficking and increase secretion of mutant proteins in cellular models of folding diseases. Additionally, small-molecule regulators reprogram ER stress responses to improve the folding and trafficking of secretory proteins, targeting the nearly one-third of the proteome that enters the ER. These regulators act by modulating the unfolded protein response, thereby alleviating proteotoxic stress and enhancing overall ER homeostasis.

For mitochondrial targeting defects, antioxidants mitigate oxidative damage that impairs mitochondrial function. Mitochondria-targeted antioxidants, such as ubiquinol derivatives like MitoQ, counteract this damage and offer therapeutic potential in degenerative diseases linked to mitochondrial dysfunction. Gene therapy approaches address mutations in nuclear-encoded mitochondrial proteins by delivering corrected genes via adeno-associated viral vectors, restoring function in preclinical models of mitochondrial disorders. Pathogenic variants in TIM23 complex genes, such as those affecting Tim50, disrupt preprotein translocation, and targeted gene replacement has shown promise in compensating for these defects.

In peroxisomal disorders like Zellweger spectrum disorder, therapies focus on stabilizing import receptors to rescue biogenesis. Long-term cholic acid administration sustains peroxisomal function by enhancing bile acid metabolism and preventing liver disease progression in patients with assembly defects. Nitric oxide donors, such as S-nitrosoglutathione, promote peroxisome number and activity in PEX1 mutant fibroblasts, extending lifespan in model organisms by stabilizing import pathways.

Antimicrobial strategies exploit bacterial protein targeting for pathogen control. Inhibitors of the Sec translocon, including small molecules targeting SecA ATPase activity, block post-translational export in bacterial pathogens, impairing virulence and enhancing the efficacy of existing antibiotics. For the twin-arginine translocation (TAT) pathway, small-molecule inhibitors disrupt folded protein export in pathogens, reducing virulence and biofilm formation without mammalian toxicity. These TAT blockers, identified through high-throughput screens, synergize with existing antimicrobials to combat multidrug-resistant strains.

Emerging therapies use gene editing and biomimetic delivery to modulate targeting signals. CRISPR-Cas9 systems enable precise correction of mutations in mitochondrial targeting sequences, restoring import in cellular disease models. Base editors achieve efficient editing of import-related genes, allowing pathogenic variants to be modeled and alleviated. Nanoparticle systems mimicking signal recognition particle (SRP)-mediated cotranslational targeting facilitate site-specific delivery of therapeutic proteins, enhancing cellular uptake and secretion efficiency. These engineered nanoparticles promote chain elongation arrest and microsomal docking, akin to SRP function, for precise delivery at target sites.

Computational Approaches

Prediction Algorithms

Prediction algorithms for protein targeting employ bioinformatics approaches to computationally identify targeting signals and predict subcellular destinations based on amino acid sequences. These tools analyze features such as N-terminal signal peptides, transit peptides, and other motifs to classify proteins into categories like secretory, mitochondrial, or chloroplastic pathways. Seminal methods rely on machine learning techniques, including neural networks and hidden Markov models, to achieve high predictive accuracy, often exceeding 90% for well-defined signals.

One of the foundational tools is SignalP, which uses neural networks to predict the presence and cleavage sites of signal peptides in eukaryotic and prokaryotic proteins. Developed initially in the 1990s and iteratively improved, the latest version, SignalP 6.0, incorporates protein language models to distinguish all five types of signal peptides, including those for the twin-arginine translocation pathway, with accuracies surpassing 95% on benchmark datasets for cleavage site prediction. SignalP excels at identifying classical secretory signals but focuses primarily on N-terminal regions, making it most useful for distinguishing exported proteins from cytosolic ones.

TargetP complements SignalP by discriminating between mitochondrial, chloroplastic, secretory, and other targeting signals using deep neural networks trained on sequence motifs of variable lengths. The tool, in its second iteration (TargetP 2.0), predicts N-terminal presequences and their cleavage sites, achieving balanced accuracies around 85-90% across eukaryotic localizations by using recurrent neural architectures suited for motif recognition. It is particularly useful for plant and animal proteins, integrating predictions to resolve overlaps between pathways like mitochondrial and secretory targeting.

PSORT, including variants like PSORTb for bacteria and WoLF PSORT for eukaryotes, adopts a rule-based approach combined with machine learning to predict subcellular localization by integrating multiple sequence features such as amino acid composition, functional motifs, and known sorting signals. PSORTb 3.0, for instance, employs support vector machines on bacterial datasets to achieve precision rates of about 90% for outer membrane and periplasmic predictions in Gram-negative bacteria. This method's strength lies in its interpretability, allowing users to trace decisions back to specific rules, though it may underperform on novel or atypical sequences compared to purely data-driven models.

Advancements in the 2020s have introduced deep learning models like DeepLoc, which use transformer-based protein language models and attention mechanisms to predict multi-label subcellular localizations directly from full protein sequences, bypassing explicit signal extraction. DeepLoc 2.0, for example, employs protein language models for eukaryotic predictions, attaining top-1 accuracies around 74% and F1 scores over 0.6 for 10 compartments while providing attention-based interpretability for key sequence regions. These models represent a shift toward end-to-end learning, improving handling of non-canonical signals and integrating evolutionary information for broader applicability across organisms.

Recent developments have integrated structural predictions from tools like AlphaFold to refine targeting signal identification. For instance, analyses of AlphaFold2-predicted structures have improved the precision of TargetP 2.0 and SignalP 6.0 by confirming signal peptide conformations, reducing false positives in ambiguous cases by up to 20% on benchmark datasets as of 2023.

Despite their efficacy, prediction algorithms face limitations in resolving ambiguous or weak targeting signals, such as those in dual-localized proteins or non-classical pathways, where false positive rates can reach 10-20% or more in challenging cases. Additionally, training data are biased toward well-studied organisms, which can reduce performance on under-represented taxa and underscores the need for experimental validation to confirm predictions.
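The accuracy, precision, and F1 figures quoted in this section are easier to compare when it is clear how such scores arise for multi-label localization, where one protein may belong to several compartments. The sketch below computes micro-averaged precision, recall, and F1 from sets of true and predicted compartments. Micro-averaging is an assumption chosen for simplicity and is not necessarily the scheme any of the cited tools report. The function name and the toy labels are hypothetical.

```python
def multilabel_micro_f1(true_labels, predicted_labels):
    """Micro-averaged precision, recall and F1 over per-protein label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # compartments predicted and correct
        fp += len(pred - truth)   # compartments predicted but wrong
        fn += len(truth - pred)   # compartments missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: three proteins, one of them dual-localized.
truth = [{"nucleus", "cytoplasm"}, {"mitochondrion"}, {"ER"}]
pred = [{"nucleus"}, {"mitochondrion"}, {"Golgi"}]
precision, recall, f1 = multilabel_micro_f1(truth, pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In the toy case the dual-localized protein costs recall (one compartment missed) without costing precision, which illustrates why dual localization is singled out above as a hard case.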

Experimental Validation Tools

Fluorescence microscopy serves as a primary tool for visualizing protein localization in living cells, often employing fusions with green fluorescent protein (GFP) or its variants to track targeting signals. For instance, mito-GFP fusions have been widely used to confirm mitochondrial import by observing punctate fluorescence patterns colocalizing with mitochondrial markers like MitoTracker. This approach allows real-time monitoring of dynamic targeting processes, such as translocation to the nucleus, with validation through colocalization coefficients exceeding 0.8 in fixed and live-cell imaging. However, potential artifacts from fusion-induced mislocalization necessitate controls like antibody staining for endogenous proteins.

Subcellular fractionation, combined with differential centrifugation, enables the isolation of organelles to biochemically assess protein distribution. Cells are lysed and subjected to sequential centrifugation steps, typically low-speed (e.g., 1,000 × g) for nuclei, medium-speed (10,000 × g) for mitochondria, and high-speed (100,000 × g) for microsomes, followed by Western blotting with compartment-specific markers like COX IV for mitochondria or an ER-resident marker for the ER. This method has quantified targeting efficiency, revealing, for example, over 70% enrichment of matrix proteins in mitochondrial fractions from cell extracts. Protease protection assays on fractions further confirm import by assessing resistance to added proteases, distinguishing surface-bound from translocated proteins.

Chemical cross-linking techniques identify transient interactions between targeting factors and cargo proteins, such as the signal recognition particle (SRP) with its receptor. UV- or chemical-induced cross-linking (e.g., using DSS or BS3) captures SRP54-signal sequence complexes in translationally active lysates, followed by immunoprecipitation and analysis of the cross-linked adducts. Seminal studies have mapped SRP-receptor interfaces, showing cross-linked adducts at residues critical for GTPase activation, with efficiencies up to 50% in reconstituted systems. This approach is particularly useful for prokaryotic and eukaryotic co-translational targeting pathways.

In vitro import assays reconstitute targeting using radiolabeled precursor proteins synthesized via in vitro transcription-translation and isolated organelles. For mitochondrial import, [³⁵S]-methionine-labeled preproteins are incubated with energized mitochondria under ATP-dependent conditions, with import scored by protease-protected, alkali-resistant integration (typically 20-60% efficiency for matrix proteins). This system has elucidated receptor dependencies, such as the roles of TOM20 and TOM70, by blocking with antibodies and quantifying via autoradiography. Adaptations to human cell mitochondria maintain physiological relevance for studying import defects.

Proteomics approaches, based on mass spectrometry (MS), detect targeting defects in mutants by profiling proteome-wide changes in localization. Label-free or TMT-based quantitative MS on fractionated samples identifies mislocalized proteins, such as accumulation of precursors in cytosolic fractions of mutants (e.g., tom40Δ strains showing 2-5-fold upregulation of non-imported preproteins). Interactome analysis via cross-linking MS maps translocation complexes, revealing novel substrates for pathways like MIA40-dependent import. These methods have scaled to genome-wide screens, prioritizing high-confidence hits with spectral counts >10.
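The colocalization check described for fluorescence microscopy in this section comes down to correlating pixel intensities between two channels. The sketch below uses Pearson's correlation coefficient, which is one common choice and an assumption here, since the text does not say which coefficient is meant. It takes flattened intensity lists for a GFP channel and a marker channel. The function names and the toy values are hypothetical, and the 0.8 threshold is the figure quoted in the text.

```python
import math


def pearson(channel_a, channel_b):
    """Pearson correlation between two equal-length intensity lists."""
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and equal in length")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return cov / math.sqrt(var_a * var_b)


def is_colocalized(channel_a, channel_b, threshold=0.8):
    """True if the correlation exceeds the threshold quoted in the text."""
    return pearson(channel_a, channel_b) > threshold


# Toy pixel intensities for a GFP fusion and a mitochondrial marker.
gfp = [10, 20, 30, 40, 50]
marker = [12, 18, 33, 41, 48]
print(is_colocalized(gfp, marker))
```

Real analyses usually mask background pixels before correlating, since large dark regions shared by both channels can inflate the coefficient. That step is omitted here for brevity.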
