Messenger RNA

from Wikipedia

The "life cycle" of an mRNA in a eukaryotic cell. RNA is transcribed in the nucleus; after processing, it is transported to the cytoplasm and translated by the ribosome. Finally, the mRNA is degraded.

In molecular biology, messenger ribonucleic acid (mRNA) is a single-stranded molecule of RNA that corresponds to the genetic sequence of a gene, and is read by a ribosome in the process of synthesizing a protein.

mRNA is created during the process of transcription, where an enzyme (RNA polymerase) converts the gene into primary transcript mRNA (also known as pre-mRNA). This pre-mRNA usually still contains introns, regions that will not go on to code for the final amino acid sequence. These are removed in the process of RNA splicing, leaving only exons, regions that will encode the protein. This exon sequence constitutes mature mRNA. Mature mRNA is then read by the ribosome, and the ribosome creates the protein utilizing amino acids carried by transfer RNA (tRNA). This process is known as translation. All of these processes form part of the central dogma of molecular biology, which describes the flow of genetic information in a biological system.

As in DNA, genetic information in mRNA is contained in the sequence of nucleotides, which are arranged into codons consisting of three ribonucleotides each. Each codon codes for a specific amino acid, except the stop codons, which terminate protein synthesis. The translation of codons into amino acids requires two other types of RNA: transfer RNA, which recognizes the codon and provides the corresponding amino acid, and ribosomal RNA (rRNA), the central component of the ribosome's protein-manufacturing machinery.
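The codon-by-codon reading described above can be sketched in a few lines of Python. This is an illustrative sketch only: the table is deliberately partial (the full genetic code has 64 entries), and the function name `translate` and the example sequence are assumptions made for this example.

```python
# Partial codon table: only the codons used in this example.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCA": "Ala",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(mrna: str) -> list[str]:
    """Read an mRNA string three nucleotides at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE[codon]  # KeyError for codons missing from the partial table
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

print(translate("AUGUUUGGCGCAUAA"))  # ['Met', 'Phe', 'Gly', 'Ala']
```

In the cell this lookup is performed by tRNAs pairing with each codon inside the ribosome; the dictionary here only stands in for that recognition step.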

The concept of mRNA was developed by Sydney Brenner and Francis Crick in 1960 during a conversation with François Jacob. In 1961, mRNA was identified and described independently by one team consisting of Brenner, Jacob, and Matthew Meselson, and another team led by James Watson. While analyzing the data in preparation for publication, Jacob and Jacques Monod coined the name "messenger RNA".

Synthesis

RNA polymerase transcribes a DNA strand to form mRNA

The brief existence of an mRNA molecule begins with transcription, and ultimately ends in degradation. During its life, an mRNA molecule may also be processed, edited, and transported prior to translation. Eukaryotic mRNA molecules often require extensive processing and transport, while prokaryotic mRNA molecules do not. A molecule of eukaryotic mRNA and the proteins surrounding it are together called a messenger RNP.[citation needed]

Transcription

Transcription is the process in which RNA is copied from DNA; in eukaryotes it takes place in the nucleus. During transcription, RNA polymerase makes an mRNA copy of a gene from the DNA as needed. This process differs slightly in eukaryotes and prokaryotes. One notable difference is that eukaryotic RNA polymerase associates with mRNA-processing enzymes during transcription, so that processing can begin while transcription is still in progress. The mRNA is synthesized as a single strand complementary to the DNA template strand, so its sequence matches that of the DNA coding strand (with uracil in place of thymine). The short-lived, unprocessed or partially processed product is termed precursor mRNA, or pre-mRNA; once completely processed, it is termed mature mRNA.[citation needed]

Uracil substitution for thymine

mRNA uses uracil (U) where DNA uses thymine (T). During transcription, uracil rather than thymine is the base paired with adenine (A). Thus, when a template strand of DNA is used to build RNA, uracil appears wherever thymine would appear in the corresponding DNA strand. This substitution still allows the mRNA to carry the appropriate genetic information from DNA to the ribosome for translation. In terms of natural history, uracil came first and thymine later: evidence suggests that RNA came before DNA in evolution.[1] The RNA World hypothesis proposes that life began with RNA molecules, before the emergence of DNA genomes and coded proteins. In DNA, the evolutionary substitution of thymine for uracil may have increased DNA stability and improved the efficiency of DNA replication.[2][3]
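As a rough illustration of the base-pairing rule, the following Python sketch builds an mRNA from a DNA template strand. The function names and the nine-base example sequence are made up for this example, and the template is assumed to be written 3' to 5' so that the output reads 5' to 3'.

```python
# DNA template base -> RNA base paired with it; adenine pairs with uracil.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Build the mRNA (5'->3') from a DNA template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

def coding_to_mrna(coding_5to3: str) -> str:
    """Shortcut: the mRNA matches the coding strand with U in place of T."""
    return coding_5to3.replace("T", "U")

template = "TACAAACCG"  # 3'-TACAAACCG-5'
coding = "ATGTTTGGC"    # 5'-ATGTTTGGC-3', the complementary coding strand
print(transcribe(template))    # AUGUUUGGC
print(coding_to_mrna(coding))  # AUGUUUGGC
```

Both routes give the same string, which is why an mRNA sequence can be read straight off the coding strand of a gene.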

Eukaryotic pre-mRNA processing

DNA gene is transcribed to pre-mRNA, which is then processed to form a mature mRNA, and then lastly translated by a ribosome to a protein

Processing of mRNA differs greatly among eukaryotes, bacteria, and archaea. Non-eukaryotic mRNA is, in essence, mature upon transcription and requires no processing, except in rare cases.[4] Eukaryotic pre-mRNA, however, requires several processing steps before its transport to the cytoplasm and its translation by the ribosome.

Splicing

A central step in the processing of eukaryotic pre-mRNA into mature mRNA is RNA splicing, a mechanism by which introns or outrons (non-coding regions) are removed and exons (coding regions) are joined.[5][6]
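Splicing can be pictured as cutting a string at known exon boundaries and joining the pieces. The sketch below assumes the exon coordinates are already known (in the cell they are recognized by the splicing machinery); the function name `splice`, the coordinates, and the toy sequence are all illustrative.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments given as half-open, 0-based (start, end) intervals."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))

# Toy transcript: uppercase marks exons, lowercase marks the intron.
pre = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
mature = splice(pre, [(0, 6), (18, 24)])
print(mature)  # AUGGCCUUUUAA
```

Passing a different list of exon intervals to the same pre-mRNA yields a different mature sequence, which is a simple way to think about alternative splicing.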

5' cap addition

5' cap structure

A 5' cap (also termed an RNA cap, an RNA 7-methylguanosine cap, or an RNA m7G cap) is a modified guanine nucleotide that has been added to the "front" or 5' end of a eukaryotic messenger RNA shortly after the start of transcription. The 5' cap consists of a terminal 7-methylguanosine residue that is linked through a 5'-5'-triphosphate bond to the first transcribed nucleotide. Its presence is critical for recognition by the ribosome and protection from RNases.[citation needed]

Cap addition is coupled to transcription, and occurs co-transcriptionally, such that each influences the other. Shortly after the start of transcription, the 5' end of the mRNA being synthesized is bound by a cap-synthesizing complex associated with RNA polymerase. This enzymatic complex catalyzes the chemical reactions that are required for mRNA capping. Synthesis proceeds as a multi-step biochemical reaction.[citation needed]

Editing

In some instances, an mRNA will be edited, changing the nucleotide composition of that mRNA. An example in humans is the apolipoprotein B mRNA, which is edited in some tissues, but not others. The editing creates an early stop codon, which, upon translation, produces a shorter protein. Another well-defined example is A-to-I (adenosine to inosine) editing, which is carried out by double-strand-specific adenosine deaminases acting on RNA (ADAR enzymes). This can occur in both the open reading frame and untranslated regions, altering the structural properties of the mRNA. Although this editing is essential for development, its exact role is not fully understood.[7]

Polyadenylation


Polyadenylation is the covalent linkage of a polyadenylyl moiety to a messenger RNA molecule. In eukaryotic organisms most messenger RNA (mRNA) molecules are polyadenylated at the 3' end, but recent studies have shown that short stretches of uridine (oligouridylation) are also common.[8] The poly(A) tail and the protein bound to it aid in protecting mRNA from degradation by exonucleases. Polyadenylation is also important for transcription termination, export of the mRNA from the nucleus, and translation. mRNA can also be polyadenylated in prokaryotic organisms, where poly(A) tails act to facilitate, rather than impede, exonucleolytic degradation.[9] Polyadenylation occurs during and/or immediately after transcription of DNA into RNA. After transcription has been terminated, the mRNA chain is cleaved through the action of an endonuclease complex associated with RNA polymerase. After the mRNA has been cleaved, around 250 adenosine residues are added to the free 3' end at the cleavage site. This reaction is catalyzed by polyadenylate polymerase. Just as in alternative splicing, there can be more than one polyadenylation variant of an mRNA.

Polyadenylation site mutations also occur. The primary RNA transcript of a gene is cleaved at the poly-A addition site, and 100–200 adenosine residues are added to the 3' end of the RNA. If this site is altered, an abnormally long and unstable mRNA construct will be formed.
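The cleave-then-extend sequence of events can be sketched as a string operation. This is a simplified model: the cleavage position is passed in as a hypothetical `cleavage_index` rather than being recognized by an enzyme complex, and the tail length is a parameter because the text quotes both roughly 250 and 100–200 residues.

```python
def polyadenylate(transcript: str, cleavage_index: int, tail_length: int = 250) -> str:
    """Cleave the transcript at cleavage_index and add a poly(A) tail to the new 3' end."""
    if not 0 < cleavage_index <= len(transcript):
        raise ValueError("cleavage_index must fall within the transcript")
    return transcript[:cleavage_index] + "A" * tail_length

primary = "AUGGCCUUUUAACCGGAAUAAAGGCUCUGCCGUU"  # toy primary transcript
short_variant = polyadenylate(primary, 20, tail_length=150)
long_variant = polyadenylate(primary, 30, tail_length=150)
print(len(short_variant), len(long_variant))  # 170 180
```

Using two different cleavage positions on the same transcript gives two mRNAs that differ only at their 3' ends, which mirrors the polyadenylation variants mentioned above.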

Transport

Another difference between eukaryotes and prokaryotes is mRNA transport. Because eukaryotic transcription and translation are compartmentally separated, eukaryotic mRNAs must be exported from the nucleus to the cytoplasm, a process that may be regulated by different signaling pathways.[10] Mature mRNAs are recognized by their processed modifications and then exported through the nuclear pore by binding to the cap-binding proteins CBP20 and CBP80,[11] as well as the transcription/export complex (TREX).[12][13] Multiple mRNA export pathways have been identified in eukaryotes.[14]

In spatially complex cells, some mRNAs are transported to particular subcellular destinations. In mature neurons, certain mRNAs are transported from the soma to dendrites. One site of mRNA translation is at polyribosomes selectively localized beneath synapses.[15] The mRNA for Arc/Arg3.1 is induced by synaptic activity and localizes selectively near active synapses based on signals generated by NMDA receptors.[16] Other mRNAs, such as β-actin mRNA, also move into dendrites in response to external stimuli.[17] For export from the nucleus, actin mRNA associates with ZBP1[18] and later with the 40S subunit. The complex is bound by a motor protein and is transported to the target location (neurite extension) along the cytoskeleton. Eventually ZBP1 is phosphorylated by Src in order for translation to be initiated.[19] In developing neurons, mRNAs are also transported into growing axons and especially growth cones. Many mRNAs are marked with so-called "zip codes", which target their transport to a specific location.[20][21] mRNAs can also transfer between mammalian cells through structures called tunneling nanotubes.[22][23]

Translation

Translation of mRNA to protein

Because prokaryotic mRNA does not need to be processed or transported, translation by the ribosome can begin while the mRNA is still being transcribed. Prokaryotic translation is therefore said to be coupled to transcription and to occur co-transcriptionally.[24]

Eukaryotic mRNA that has been processed and transported to the cytoplasm (i.e., mature mRNA) can then be translated by the ribosome. Translation may occur at ribosomes free-floating in the cytoplasm, or directed to the endoplasmic reticulum by the signal recognition particle. Therefore, unlike in prokaryotes, eukaryotic translation is not directly coupled to transcription. It is even possible in some contexts that reduced mRNA levels are accompanied by increased protein levels, as has been observed for mRNA/protein levels of EEF1A1 in breast cancer.[25][non-primary source needed]

Structure

The structure of a mature eukaryotic mRNA. A fully processed mRNA includes a 5' cap, 5' UTR, coding region, 3' UTR, and poly(A) tail.

Coding regions

Coding regions are composed of codons, which are decoded and translated into proteins by the ribosome; in eukaryotes usually into one and in prokaryotes usually into several. Coding regions begin with the start codon and end with a stop codon. In general, the start codon is an AUG triplet and the stop codon is UAG ("amber"), UAA ("ochre"), or UGA ("opal"). The coding regions tend to be stabilised by internal base pairs; this impedes degradation.[26][27] In addition to being protein-coding, portions of coding regions may serve as regulatory sequences in the pre-mRNA as exonic splicing enhancers or exonic splicing silencers.
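A minimal sketch of how a coding region could be located in a sequence string follows. It assumes the simplest possible rule (take the first AUG that is followed by an in-frame stop codon), which ignores the sequence context real ribosomes use; the function name and the toy sequence are illustrative.

```python
from typing import Optional

# Traditional names of the three stop codons.
STOP_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}

def find_coding_region(mrna: str) -> Optional[tuple[int, int, str]]:
    """Return (start, end, stop_name) for the first AUG with an in-frame stop, else None."""
    start = mrna.find("AUG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this AUG.
        for i in range(start + 3, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if codon in STOP_NAMES:
                return start, i + 3, STOP_NAMES[codon]
        start = mrna.find("AUG", start + 1)
    return None

mrna = "GGACCAUGGCCUUUUGACCAAA"
start, end, stop_name = find_coding_region(mrna)
print(mrna[:start], mrna[start:end], mrna[end:], stop_name)
# GGACC AUGGCCUUUUGA CCAAA opal
```

The slices before `start` and after `end` correspond to the 5' and 3' untranslated regions of this toy transcript.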

Untranslated regions

Universal structure of eukaryotic mRNA, showing the structure of the 5' and 3' UTRs.

Untranslated regions (UTRs) are sections of the mRNA before the start codon and after the stop codon that are not translated, termed the five prime untranslated region (5' UTR) and three prime untranslated region (3' UTR), respectively. These regions are transcribed with the coding region and thus are exonic as they are present in the mature mRNA. Several roles in gene expression have been attributed to the untranslated regions, including mRNA stability, mRNA localization, and translational efficiency. The ability of a UTR to perform these functions depends on the sequence of the UTR and can differ between mRNAs. Genetic variants in 3' UTR have also been implicated in disease susceptibility because of the change in RNA structure and protein translation.[28]

The stability of mRNAs may be controlled by the 5' UTR and/or 3' UTR due to varying affinity for RNA degrading enzymes called ribonucleases and for ancillary proteins that can promote or inhibit RNA degradation. (See also, C-rich stability element.)

Translational efficiency, including sometimes the complete inhibition of translation, can be controlled by UTRs. Proteins that bind to either the 3' or 5' UTR may affect translation by influencing the ribosome's ability to bind to the mRNA. MicroRNAs bound to the 3' UTR also may affect translational efficiency or mRNA stability.

Cytoplasmic localization of mRNA is thought to be a function of the 3' UTR. Proteins that are needed in a particular region of the cell can also be translated there; in such a case, the 3' UTR may contain sequences that allow the transcript to be localized to this region for translation.

Some of the elements contained in untranslated regions form a characteristic secondary structure when transcribed into RNA. These structural mRNA elements are involved in regulating the mRNA. Some, such as the SECIS element, are targets for proteins to bind. One class of mRNA element, the riboswitches, directly bind small molecules, changing their fold to modify levels of transcription or translation. In these cases, the mRNA regulates itself.

Poly(A) tail

The 3' poly(A) tail is a long sequence of adenine nucleotides (often several hundred) added to the 3' end of the pre-mRNA. This tail promotes export from the nucleus and translation, and protects the mRNA from degradation.

Monocistronic versus polycistronic mRNA

An mRNA molecule is said to be monocistronic when it contains the genetic information to translate only a single protein chain (polypeptide). This is the case for most of the eukaryotic mRNAs.[29][30] On the other hand, polycistronic mRNA carries several open reading frames (ORFs), each of which is translated into a polypeptide. These polypeptides usually have a related function (they often are the subunits composing a final complex protein) and their coding sequence is grouped and regulated together in a regulatory region, containing a promoter and an operator. Most of the mRNA found in bacteria and archaea is polycistronic,[29] as is the human mitochondrial genome.[31] Dicistronic or bicistronic mRNA encodes only two proteins.
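The distinction can be illustrated by counting open reading frames in a sequence string. The sketch below uses a deliberately naive scan (non-overlapping ORFs, each running from an AUG to the first in-frame stop); the function names and the toy sequences are assumptions made for this example, not a real gene-finding method.

```python
STOPS = {"UAA", "UAG", "UGA"}

def list_orfs(mrna: str) -> list[tuple[int, int]]:
    """Return non-overlapping (start, end) ORFs, scanning 5' to 3'."""
    orfs = []
    pos = 0
    while True:
        start = mrna.find("AUG", pos)
        if start == -1:
            return orfs
        end = None
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOPS:
                end = i + 3
                break
        if end is None:
            pos = start + 1  # no in-frame stop: try the next AUG
        else:
            orfs.append((start, end))
            pos = end        # resume the scan after this ORF

def classify(mrna: str) -> str:
    """Label an mRNA by the number of ORFs the naive scan finds."""
    n = len(list_orfs(mrna))
    return {0: "non-coding", 1: "monocistronic", 2: "dicistronic"}.get(n, "polycistronic")

one_orf = "AUGGCCUAA"
two_orfs = "AUGGCCUAA" + "CCGGAGG" + "AUGUUUUGA"
print(classify(one_orf), classify(two_orfs))  # monocistronic dicistronic
```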

mRNA circularization

mRNA circularisation and regulation

In eukaryotes, mRNA molecules form circular structures due to an interaction between eIF4E and poly(A)-binding protein, which both bind to eIF4G, forming an mRNA-protein-mRNA bridge.[32] Circularization is thought to promote cycling of ribosomes on the mRNA, leading to time-efficient translation, and may also function to ensure that only intact mRNAs are translated (partially degraded mRNAs characteristically have no m7G cap, or no poly-A tail).[33]

Other mechanisms for circularization exist, particularly in virus mRNA. Poliovirus mRNA uses a cloverleaf section towards its 5' end to bind PCBP2, which binds poly(A)-binding protein, forming the familiar mRNA-protein-mRNA circle. Barley yellow dwarf virus has binding between mRNA segments on its 5' end and 3' end (called kissing stem loops), circularizing the mRNA without any proteins involved.

RNA virus genomes (the + strands of which are translated as mRNA) are also commonly circularized.[34] During genome replication the circularization acts to enhance genome replication speeds, cycling viral RNA-dependent RNA polymerase much the same as the ribosome is hypothesized to cycle.

Degradation

Different mRNAs within the same cell have distinct lifetimes (stabilities). In bacterial cells, individual mRNAs can survive from seconds to more than an hour. However, the lifetime averages between 1 and 3 minutes, making bacterial mRNA much less stable than eukaryotic mRNA.[35] In mammalian cells, mRNA lifetimes range from several minutes to days.[36] The greater the stability of an mRNA the more protein may be produced from that mRNA. The limited lifetime of mRNA enables a cell to alter protein synthesis rapidly in response to its changing needs. There are many mechanisms that lead to the destruction of an mRNA, some of which are described below.
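To make the lifetime figures concrete, the sketch below assumes that mRNA decay is a first-order (exponential) process, which is a common simplification and not something stated in the text. Under that assumption a mean lifetime τ gives a remaining fraction of exp(-t/τ) and a half-life of τ·ln 2; the function names are illustrative.

```python
import math

def fraction_remaining(t_min: float, mean_lifetime_min: float) -> float:
    """Fraction of an mRNA population left after t_min minutes (first-order decay assumed)."""
    return math.exp(-t_min / mean_lifetime_min)

def half_life(mean_lifetime_min: float) -> float:
    """Half-life corresponding to a given mean lifetime (first-order decay assumed)."""
    return mean_lifetime_min * math.log(2)

# A bacterial mRNA with a 2-minute mean lifetime, checked after 2 minutes:
print(round(fraction_remaining(2.0, 2.0), 3))  # 0.368
print(round(half_life(2.0), 2))                # 1.39
```

With a mean lifetime in the 1 to 3 minute range, most copies of a bacterial transcript are gone within a few minutes, which is what lets the cell change its protein output quickly.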

Prokaryotic mRNA degradation

Overview of mRNA decay pathways in the different life domains.

In general, in prokaryotes the lifetime of mRNA is much shorter than in eukaryotes. Prokaryotes degrade messages by using a combination of ribonucleases, including endonucleases, 3' exonucleases, and 5' exonucleases. In some instances, small RNA molecules (sRNA) tens to hundreds of nucleotides long can stimulate the degradation of specific mRNAs by base-pairing with complementary sequences and facilitating ribonuclease cleavage by RNase III. It was recently shown that bacteria also have a sort of 5' cap consisting of a triphosphate on the 5' end.[37] Removal of two of the phosphates leaves a 5' monophosphate, causing the message to be destroyed by the exonuclease RNase J, which degrades 5' to 3'.

Eukaryotic mRNA turnover

Inside eukaryotic cells, there is a balance between the processes of translation and mRNA decay. Messages that are being actively translated are bound by ribosomes, the eukaryotic initiation factors eIF-4E and eIF-4G, and poly(A)-binding protein. eIF-4E and eIF-4G block the decapping enzyme (DCP2), and poly(A)-binding protein blocks the exosome complex, protecting the ends of the message. The balance between translation and decay is reflected in the size and abundance of cytoplasmic structures known as P-bodies.[38] The poly(A) tail of the mRNA is shortened by specialized exonucleases that are targeted to specific messenger RNAs by a combination of cis-regulatory sequences on the RNA and trans-acting RNA-binding proteins. Poly(A) tail removal is thought to disrupt the circular structure of the message and destabilize the cap binding complex. The message is then subject to degradation by either the exosome complex or the decapping complex. In this way, translationally inactive messages can be destroyed quickly, while active messages remain intact. The mechanism by which translation stops and the message is handed-off to decay complexes is not understood in detail. The majority of mRNA decay was believed to be cytoplasmic; however, recently, a novel mRNA decay pathway was described, which starts in the nucleus.[39]

AU-rich element decay

The presence of AU-rich elements in some mammalian mRNAs tends to destabilize those transcripts through the action of cellular proteins that bind these sequences and stimulate poly(A) tail removal. Loss of the poly(A) tail is thought to promote mRNA degradation by facilitating attack by both the exosome complex[40] and the decapping complex.[41] Rapid mRNA degradation via AU-rich elements is a critical mechanism for preventing the overproduction of potent cytokines such as tumor necrosis factor (TNF) and granulocyte-macrophage colony stimulating factor (GM-CSF).[42] AU-rich elements also regulate the biosynthesis of proto-oncogenic transcription factors like c-Jun and c-Fos.[43]

Nonsense-mediated decay

Eukaryotic messages are subject to surveillance by nonsense-mediated decay (NMD), which checks for the presence of premature stop codons (nonsense codons) in the message. These can arise via incomplete splicing, V(D)J recombination in the adaptive immune system, mutations in DNA, transcription errors, leaky scanning by the ribosome causing a frame shift, and other causes. Detection of a premature stop codon triggers mRNA degradation by 5' decapping, 3' poly(A) tail removal, or endonucleolytic cleavage.[44]

Small interfering RNA (siRNA)

In metazoans, small interfering RNAs (siRNAs) processed by Dicer are incorporated into a complex known as the RNA-induced silencing complex or RISC. This complex contains an endonuclease that cleaves perfectly complementary messages to which the siRNA binds. The resulting mRNA fragments are then destroyed by exonucleases. siRNA is commonly used in laboratories to block the function of genes in cell culture. It is thought to be part of the innate immune system as a defense against double-stranded RNA viruses.[45]
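The requirement for perfect complementarity can be illustrated with a string search. In this sketch the guide strand is a toy 8-nucleotide sequence (real siRNAs are roughly 21 nucleotides long), and the function names are made up for the example; it models only the base-pairing check, not RISC itself.

```python
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")

def reverse_complement(rna: str) -> str:
    """Return the strand that pairs antiparallel with the given RNA."""
    return rna.translate(RNA_COMPLEMENT)[::-1]

def cleavage_sites(mrna: str, guide: str) -> list[int]:
    """Start indices in the mRNA that are perfectly complementary to the guide strand."""
    target = reverse_complement(guide)
    return [i for i in range(len(mrna) - len(target) + 1) if mrna.startswith(target, i)]

mrna = "GGAUGGCCUUUUGACC"
print(cleavage_sites(mrna, "CAAAAGGC"))  # [5]  perfect match
print(cleavage_sites(mrna, "CAAUAGGC"))  # []   a single mismatch gives no site
```

The second call shows why this check is strict: a partially complementary guide finds no site here, which is closer to how microRNAs behave than siRNAs.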

MicroRNA (miRNA)

MicroRNAs (miRNAs) are small RNAs that typically are partially complementary to sequences in metazoan messenger RNAs.[46][47] Binding of a miRNA to a message can repress translation of that message and accelerate poly(A) tail removal, thereby hastening mRNA degradation. The mechanism of action of miRNAs is the subject of active research.[48][49]

Other decay mechanisms

There are other ways by which messages can be degraded, including non-stop decay and silencing by Piwi-interacting RNA (piRNA), among others.

Applications

The administration of a nucleoside-modified messenger RNA sequence can cause a cell to make a protein, which in turn could directly treat a disease or could function as a vaccine; more indirectly the protein could drive an endogenous stem cell to differentiate in a desired way.[50][51]

The primary challenges of RNA therapy center on delivering the RNA to the appropriate cells.[52] Challenges include the fact that naked RNA sequences naturally degrade after preparation; they may trigger the body's immune system to attack them as an invader; and they cannot readily cross the cell membrane.[51] Once within the cell, they must then leave the cell's transport mechanism to take action within the cytoplasm, which houses the necessary ribosomes.[50]

mRNA as a therapeutic was first put forward in 1989 "after the development of a broadly applicable in vitro transfection technique."[53] In the 1990s, mRNA vaccines for personalized cancer treatment were developed, relying on non-nucleoside-modified mRNA. mRNA-based therapies continue to be investigated as treatments for cancer as well as autoimmune, metabolic, and respiratory inflammatory diseases. Gene editing therapies such as CRISPR may also benefit from using mRNA to induce cells to make the desired Cas protein.[54]

Since the 2010s, RNA vaccines and other RNA therapeutics have been considered to be "a new class of drugs".[55] The first mRNA-based vaccines, such as the Pfizer–BioNTech and Moderna COVID-19 vaccines, received restricted authorization and were rolled out across the world during the COVID-19 pandemic.[56] The 2023 Nobel Prize in Physiology or Medicine was awarded to Katalin Karikó and Drew Weissman for the development of effective mRNA vaccines against COVID-19.[57][58][59] New approaches to modulating RNA levels as a therapeutic strategy include the use of antisense oligonucleotides, including for neurodevelopmental diseases associated with high mortality.[60]

History

Several molecular biology studies during the 1950s indicated that RNA played some kind of role in protein synthesis, but that role was not clearly understood. For instance, in one of the earliest reports, Jacques Monod and his team showed that RNA synthesis was necessary for protein synthesis, specifically during the production of the enzyme β-galactosidase in the bacterium E. coli.[61] Arthur Pardee also found similar RNA accumulation in 1954.[62] In 1953, Alfred Hershey, June Dixon, and Martha Chase described a certain cytosine-containing DNA (indicating it was RNA) that disappeared quickly after its synthesis in E. coli.[63] In hindsight, this may have been one of the first observations of the existence of mRNA but it was not recognized at the time as such.[64]

The idea of mRNA was first conceived by Sydney Brenner and Francis Crick on 15 April 1960 at King's College, Cambridge, while François Jacob was telling them about a recent experiment conducted by Arthur Pardee, himself, and Monod (the so-called PaJaMo experiment, which did not prove mRNA existed but suggested the possibility of its existence). With Crick's encouragement, Brenner and Jacob immediately set out to test this new hypothesis, and they contacted Matthew Meselson at the California Institute of Technology for assistance. During the summer of 1960, Brenner, Jacob, and Meselson conducted an experiment in Meselson's laboratory at Caltech which was the first to prove the existence of mRNA. That fall, Jacob and Monod coined the name "messenger RNA" and developed the first theoretical framework to explain its function.[64]

In February 1961, James Watson revealed that his Harvard-based research group had been right behind them with a series of experiments whose results pointed in roughly the same direction. Brenner and the others agreed to Watson's request to delay publication of their research findings. As a result, the Brenner and Watson articles were published simultaneously in the same issue of Nature in May 1961, while that same month, Jacob and Monod published their theoretical framework for mRNA in the Journal of Molecular Biology.[64]

from Grokipedia
Messenger RNA (mRNA) is a single-stranded ribonucleic acid (RNA) molecule that carries genetic information from deoxyribonucleic acid (DNA) to ribosomes, where it serves as a template for protein synthesis during translation. In eukaryotes, mRNA is transcribed from a gene's DNA template via RNA polymerase II and transported from the nucleus to the cytoplasm.[1] mRNA encodes the amino acid sequence of proteins using a series of three-nucleotide codons that specify particular amino acids or translation signals.[1] As a key intermediary in the central dogma of molecular biology, mRNA enables the expression of genetic information stored in DNA to produce functional proteins essential for cellular processes.[2]

The discovery of mRNA occurred in 1961, when Sydney Brenner, François Jacob, and Matthew Meselson demonstrated that it acts as an unstable, short-lived carrier of genetic information from DNA to protein-synthesizing ribosomes in bacteria.[3] This finding built on earlier hypotheses by Jacob and Monod and resolved how genetic instructions are transferred without direct DNA involvement in translation.[4]

In eukaryotes, mRNA production involves transcription in the nucleus followed by extensive processing of the initial transcript, known as pre-mRNA, to generate mature mRNA ready for export and translation.[5] Eukaryotic pre-mRNA processing includes three major steps: addition of a 5' cap (a 7-methylguanosine structure) to protect the mRNA and facilitate ribosome binding; splicing to remove non-coding introns and join coding exons; and cleavage and polyadenylation at the 3' end, adding a poly-A tail for stability and export.[6] The mature mRNA structure typically consists of a 5' untranslated region (UTR), the coding sequence, a 3' UTR, the 5' cap, and the poly-A tail, with the overall length varying from hundreds to thousands of nucleotides depending on the gene.[2] These modifications ensure mRNA stability, efficient nuclear export through nuclear pore complexes, and accurate translation, where ribosomes decode the mRNA sequence in coordination with transfer RNA (tRNA) molecules.[6]

Beyond its fundamental role in gene expression, mRNA has gained prominence in biotechnology, particularly in mRNA vaccines that instruct cells to produce viral proteins for immune response training, as seen in COVID-19 vaccines.[7] Dysregulation of mRNA processing or stability is implicated in various diseases, including cancer and neurodegenerative disorders, highlighting its critical regulatory functions.[8] Over 150,000 unique mRNAs have been identified in human cells, enabling the diversity of the proteome from a limited genome through mechanisms like alternative splicing.[2]

Introduction

Definition and Discovery

Messenger RNA (mRNA) is a single-stranded ribonucleic acid (RNA) molecule transcribed from a DNA template that serves as an intermediary carrying the genetic code to ribosomes for directing protein synthesis. In eukaryotes, transcription occurs in the nucleus, with mRNA exported to ribosomes in the cytoplasm; in prokaryotes, both transcription and translation take place in the cytoplasm.[5][9] This process aligns with the central dogma of molecular biology, which posits that genetic information flows from DNA to RNA to proteins.

The concept of mRNA emerged in 1961 when François Jacob and Jacques Monod proposed it as an unstable intermediary in bacterial gene expression, particularly in their studies of the lac operon, where it was envisioned as a short-lived RNA that carries genetic information from genes to ribosomes for rapid protein production. Their model explained the observed quick turnover of RNA in bacteria, contrasting with the stability of other RNA types, and laid the groundwork for understanding inducible gene systems.

This proposal was experimentally confirmed through pulse-labeling studies in the early 1960s, notably by Sydney Brenner, Jacob, and Matthew Meselson, who demonstrated that a small fraction of rapidly labeled, unstable RNA becomes associated with ribosomes during protein synthesis in bacteriophage-infected Escherichia coli.[10] These findings established mRNA's role as the specific carrier of genetic information, distinguishing it from transfer RNA (tRNA) and ribosomal RNA (rRNA), which primarily function in the translational apparatus rather than encoding protein sequences.[5]

Role in Gene Expression

Messenger RNA (mRNA) occupies a central position in the central dogma of molecular biology, which posits a unidirectional flow of genetic information from DNA to RNA to proteins. In this framework, mRNA is transcribed from DNA templates in the nucleus (in eukaryotes) or cytoplasm (in prokaryotes), capturing the genetic sequence as a single-stranded RNA molecule complementary to the template strand of DNA (and thus matching the coding strand, with uracil replacing thymine). This process, known as transcription, ensures that the information encoded in genes is transferred to a portable form that can direct protein synthesis. Once produced, mRNA serves as the template during translation, where ribosomes bind to it and decode its nucleotide triplets—codons—into a specific sequence of amino acids, thereby producing functional proteins essential for cellular processes.[11] A key distinction in mRNA function arises from its organization in different organisms. In eukaryotes, mRNAs are predominantly monocistronic, meaning each molecule encodes a single polypeptide chain, which facilitates independent regulation of individual proteins and aligns with the compartmentalized nature of eukaryotic cells. Conversely, prokaryotic mRNAs are often polycistronic, derived from operons that group multiple genes under a single promoter, allowing coordinated translation of several proteins from one mRNA transcript to support rapid responses to environmental changes, such as nutrient availability. This polycistronic strategy is exemplified in bacterial operons like the lac operon, where lactose metabolism enzymes are expressed together. In prokaryotes, mRNA is translated directly without the extensive processing seen in eukaryotes.[12] mRNA integrates deeply with broader cellular regulatory networks, serving as a focal point for control at both transcriptional and post-transcriptional levels. 
Transcriptional regulation modulates mRNA production rates through factors like promoters and enhancers, while post-transcriptional mechanisms— including mRNA degradation, splicing, and interactions with RNA-binding proteins—adjust its availability, stability, and translation efficiency to achieve precise spatiotemporal gene expression. These layers of regulation allow cells to adapt dynamically, for instance, by rapidly degrading unnecessary mRNAs during stress responses.[13] The functional role of mRNA in gene expression exhibits remarkable evolutionary conservation, underpinned by the near-universal genetic code that interprets its codons consistently across bacteria, archaea, and eukaryotes. This code, deciphered through pioneering experiments using synthetic mRNAs, assigns specific amino acids or stop signals to each of the 64 possible triplets, enabling seamless translation machinery compatibility across diverse life forms and highlighting mRNA's ancient origins in the last universal common ancestor. Minor variations in the code occur in certain organelles and organisms, but the core triplet-based decoding remains invariant, underscoring mRNA's fundamental conservation.
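
The flow from DNA to mRNA to protein described above can be illustrated with a short sketch. It assumes the standard genetic code, an input given as the coding (sense) DNA strand, and translation that starts at the first AUG; the function names are hypothetical and chosen only for this example.

```python
from itertools import product

# Standard genetic code: amino acids listed in U, C, A, G order for
# codon positions 1, 2 and 3 ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a coding-strand DNA sequence (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Decode codons from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the peptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = transcribe("ATGGCCAAGTAA")
print(mrna)             # AUGGCCAAGUAA
print(translate(mrna))  # MAK
```

The table covers all 64 triplets, three of which are stop signals; variant codes used by some organelles and organisms would need a different table.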

Structure

Core Components

Mature messenger RNA (mRNA) in eukaryotes exhibits a fundamental linear architecture consisting of a 5' cap, a coding sequence, and a 3' end, flanked by untranslated regions. This core structure ensures the mRNA's stability, export from the nucleus, and efficient translation into proteins. The 5' cap is added post-transcriptionally and consists of a 7-methylguanosine (m⁷G) moiety linked via a 5'-5' triphosphate bridge to the first nucleotide of the transcript, protecting the mRNA from exonucleolytic degradation and facilitating ribosome binding.[14] The coding sequence (CDS), also known as the open reading frame (ORF), spans from the start codon AUG to a stop codon (UAA, UAG, or UGA), directly encoding the amino acid sequence of the polypeptide.[15] Unlike DNA, mRNA incorporates uracil (U) in place of thymine (T) within its nucleotide composition of adenine (A), guanine (G), cytosine (C), and uracil, with eukaryotic mRNAs typically ranging from 500 to 10,000 nucleotides in length to accommodate the CDS and flanking elements.[16][17] At the 3' end, mRNA maturation involves endonucleolytic cleavage at a site downstream of the polyadenylation signal, most commonly the hexanucleotide AAUAAA in eukaryotes, followed by the addition of a poly(A) tail.[8] This cleavage and tailing process defines the mature 3' terminus, contributing to mRNA stability and translational efficiency. Additionally, mRNA achieves a closed-loop configuration through interactions between the 5' cap and the poly(A)-binding protein (PABP) at the 3' end, mediated by the scaffold protein eIF4G, which enhances mRNA circularization and promotes ribosome recycling for sustained translation.[18]
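
As a rough illustration of this linear layout, the following sketch splits a mature mRNA sequence into its 5' UTR, coding sequence, 3' UTR, and poly(A) tail. It is a simplification under stated assumptions: the cap is not represented, the first AUG is taken as the start codon, and every trailing A is counted as tail (so a 3' UTR that itself ends in A would be shortened). The function name `split_mrna` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str) -> dict:
    """Split a mature mRNA into 5' UTR, CDS, 3' UTR and poly(A) length."""
    # Assumption: the run of A residues at the 3' end is the poly(A) tail.
    tail_length = len(mrna) - len(mrna.rstrip("A"))
    body = mrna[: len(mrna) - tail_length]

    start = body.find("AUG")  # assumption: first AUG is the start codon
    if start == -1:
        raise ValueError("no AUG start codon found")

    # Walk the reading frame codon by codon until an in-frame stop codon.
    for i in range(start, len(body) - 2, 3):
        if body[i:i + 3] in STOP_CODONS:
            end = i + 3
            return {
                "five_prime_utr": body[:start],
                "cds": body[start:end],
                "three_prime_utr": body[end:],
                "poly_a_length": tail_length,
            }
    raise ValueError("no in-frame stop codon found")


example = "GGCACC" + "AUGGCCAAGUAA" + "GCUAAUAAAGCUC" + "A" * 20
parts = split_mrna(example)
print(parts["cds"])            # AUGGCCAAGUAA
print(parts["poly_a_length"])  # 20
```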

Untranslated Regions

Messenger RNA (mRNA) untranslated regions (UTRs) are non-coding segments flanking the coding sequence (CDS) that play crucial roles in regulating translation initiation, mRNA stability, and localization. The 5' UTR is located upstream of the start codon, while the 3' UTR is downstream of the stop codon; both contain sequence elements and structures that modulate gene expression without being translated into protein. The 5' UTR serves as the primary site for ribosome recruitment and initiation codon recognition. In prokaryotes, it typically harbors the Shine-Dalgarno sequence, a purine-rich motif approximately 5-10 nucleotides upstream of the AUG start codon, which base-pairs with the anti-Shine-Dalgarno sequence in the 16S rRNA to position the ribosome accurately. Prokaryotic 5' UTRs are generally short, averaging 20-30 nucleotides, reflecting the streamlined nature of bacterial translation. In eukaryotes, the start codon is embedded in the Kozak consensus sequence (e.g., GCC(A/G)CCAUGG), which enhances recognition by the scanning 40S ribosomal subunit. Eukaryotic 5' UTRs average 100-200 nucleotides in length and facilitate the scanning mechanism, in which the 43S pre-initiation complex binds near the 5' cap and moves downstream to identify the first suitable AUG codon. The 3' UTR exerts control over mRNA stability and translational efficiency through embedded regulatory motifs. It often includes AU-rich elements (AREs), sequences rich in adenine and uracil (e.g., AUUUA repeats), that bind proteins to promote or inhibit decay, thereby fine-tuning transcript half-life. Additionally, 3' UTRs serve as binding platforms for microRNAs (miRNAs): target sites complementary to the miRNA seed sequence direct the RNA-induced silencing complex (RISC) to repress translation or induce degradation. Eukaryotic 3' UTRs vary widely in length, typically ranging from 100 to 2000 nucleotides, with averages around 1000 nucleotides in humans, allowing for layered regulatory inputs. 
The poly-A tail, appended to the 3' end, interacts with elements in the adjacent 3' UTR to stabilize the mRNA and facilitate circularization during translation. Secondary structures, such as stem-loops formed by base-pairing within UTRs, significantly influence mRNA functionality. In the 5' UTR, stable hairpins can impede ribosomal scanning, reducing translation efficiency, while moderate structures may enhance initiation by positioning the ribosome. In the 3' UTR, stem-loops can shield or expose regulatory sites, affecting miRNA access or protein binding that modulates stability and decay rates. Prokaryotic UTRs are characteristically shorter and simpler, with fewer regulatory elements suited to rapid, constitutive expression in unicellular organisms. In contrast, eukaryotic UTRs are longer and more complex, incorporating diverse motifs for sophisticated post-transcriptional control that supports multicellular development and environmental responses.
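
The spacing rule for the Shine-Dalgarno sequence can be sketched as a simple search. The motif AGGAGG is used here as an illustrative consensus, and the gap is measured from the 3' end of the motif to the A of the start codon; both choices, like the function name, are assumptions made for this example rather than values taken from the cited sources.

```python
def find_shine_dalgarno(mrna, start_codon_index, motif="AGGAGG",
                        min_gap=5, max_gap=10):
    """Return (motif_index, gap) for a motif upstream of the start codon.

    `gap` counts the nucleotides between the end of the motif and the
    start codon. Returns None when no occurrence has a gap in range.
    """
    i = mrna.find(motif)
    while i != -1 and i + len(motif) <= start_codon_index:
        gap = start_codon_index - (i + len(motif))
        if min_gap <= gap <= max_gap:
            return i, gap
        i = mrna.find(motif, i + 1)  # try the next occurrence
    return None


leader = "UU" + "AGGAGG" + "UUUUUUU" + "AUGGCC"
print(find_shine_dalgarno(leader, leader.find("AUG")))  # (2, 7)
```

Real ribosome-binding sites tolerate partial matches to the consensus, so an exact-string search like this one would miss many functional sites.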

Modifications and Variants

Messenger RNA undergoes various post-transcriptional modifications that influence its stability, localization, and function in gene expression. One of the most prevalent internal modifications is N6-methyladenosine (m⁶A), which marks adenosine residues within the mRNA sequence and is the most abundant modification in eukaryotic mRNAs.[19] This modification is dynamically regulated by writer proteins such as METTL3 and erasers like FTO, affecting multiple aspects of mRNA metabolism, including alternative splicing through interactions with splicing factors and nuclear export via recognition by YTHDC1 protein.[19][20] In eukaryotic processing, m⁶A sites are enriched in 3' untranslated regions and near stop codons, contributing to fine-tuning of mRNA fate without altering the primary sequence.[19] Another key modification is the addition of a poly(A) tail at the 3' end, consisting of 50-250 adenine residues in mammalian mRNAs, which is enzymatically synthesized by poly(A) polymerase during nuclear processing.[21] This homopolymeric tail binds multiple copies of poly(A)-binding protein (PABP), enhancing mRNA stability by protecting against 3' exonucleolytic degradation and promoting translation efficiency through circularization of the mRNA via PABP-eIF4G interactions.[22] The length of the poly(A) tail is tightly controlled, with longer tails correlating with increased translational output and cytoplasmic persistence.[22] mRNA exists in distinct structural variants, primarily linear and circular forms. Linear mRNA, the canonical form, features 5' cap, coding sequence, and 3' poly(A) tail, rendering it susceptible to exonucleases but optimized for ribosomal translation. 
In contrast, circular RNA (circRNA) forms through back-splicing, in which a downstream splice donor joins an upstream splice acceptor, creating a covalently closed loop that resists degradation by exonucleases because it has no free ends.[23] Most circRNAs are derived from exonic sequences, though some retain intronic elements (EIcircRNAs) or arise purely from introns (ciRNAs), and they primarily function in post-transcriptional regulation, such as acting as miRNA sponges or modulating protein activity, rather than serving as templates for protein synthesis.[23][24] Beyond endogenous forms, synthetic mRNA variants have emerged, particularly in therapeutic applications. In vitro transcribed (IVT) mRNA mimics endogenous linear mRNA but is engineered with optimized untranslated regions and cap analogs for enhanced stability and control of immunogenicity, as seen in COVID-19 vaccines that encode the spike protein.[25] Circular synthetic mRNAs, produced via ligation or ribozyme-mediated strategies, offer advantages over linear IVT counterparts, including greater resistance to degradation and prolonged expression, positioning them as next-generation platforms for vaccines and gene therapies.[25] These variants highlight the versatility of mRNA engineering while preserving core functional principles of natural transcripts.[25]

Biosynthesis

Transcription Initiation and Elongation

In prokaryotic transcription, the sigma (σ) factor associates with the core RNA polymerase to form the holoenzyme, which specifically recognizes and binds to promoter regions on the DNA. The promoter typically features conserved sequences known as the -10 box (TATAAT consensus) and the -35 box (TTGACA consensus), located upstream of the transcription start site at +1.[26][27] Upon binding, the holoenzyme unwinds the DNA to form an open complex, initiating RNA synthesis at the +1 site by incorporating the first nucleotide, usually a purine.[26] The σ factor is then released, allowing the core polymerase to enter the elongation phase, where it synthesizes the RNA transcript at an average rate of approximately 50 nucleotides per second.[28] Eukaryotic transcription of messenger RNA (mRNA) precursors is carried out by RNA polymerase II (Pol II), which requires the assembly of a preinitiation complex (PIC) at the core promoter. The TATA-binding protein (TBP), a subunit of the transcription factor IID (TFIID) complex, binds to the TATA box, a core promoter element typically located 25-35 base pairs upstream of the transcription start site.[29][30] Additional general transcription factors (TFIIA, TFIIB, TFIIE, TFIIF, and TFIIH) join to recruit Pol II, while the Mediator complex bridges the PIC with gene-specific activators to stabilize assembly and facilitate promoter opening.[31] Elongation proceeds following phosphorylation of the C-terminal domain (CTD) of Pol II's largest subunit by TFIIH's kinase subunit, which releases Pol II from the promoter and promotes processive RNA chain extension.[32] Promoter elements dictate the specificity and efficiency of transcription initiation. 
The core promoter, encompassing sequences like the TATA box, initiator (Inr), and downstream promoter element (DPE) in eukaryotes—or the -10 and -35 boxes in prokaryotes—directly interacts with the transcription machinery to define the start site and directionality.[30][33] Enhancers, in contrast, are distal regulatory elements that loop to contact the core promoter via Mediator and other coactivators, enhancing transcription rates but not altering the primary initiation site.[30] Transcription directionality is established by the orientation of these elements relative to the antisense (template) strand, which is read 3' to 5' to synthesize the sense RNA strand 5' to 3'.[34][33] A key feature of initiation in both prokaryotes and eukaryotes is abortive initiation, where RNA polymerase repeatedly synthesizes and releases short RNA transcripts (typically 2-15 nucleotides) without clearing the promoter.[35][36] This non-productive cycling allows the enzyme to probe promoter conformation until stable promoter clearance occurs, transitioning to productive elongation; during synthesis, uracil is incorporated opposite adenine on the template strand.[35][37]
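
The strand polarity described above can be made concrete with a minimal sketch: the template strand is read 3' to 5' while the RNA grows 5' to 3', with uracil incorporated opposite adenine. The function name is hypothetical, and the template is assumed to be written in the conventional 5' to 3' orientation.

```python
# Base-pairing rules for RNA synthesis on a DNA template.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5to3: str) -> str:
    """Return the RNA (5' to 3') made from a template strand given 5' to 3'.

    Reversing the string models the polymerase reading the template
    from its 3' end toward its 5' end.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3.upper()))


print(transcribe_template("TTACTTGGCCAT"))  # AUGGCCAAGUAA
```

The result equals the coding strand with U in place of T, which is why the coding strand is also called the sense strand.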

Termination and Primary Transcript

In prokaryotes, transcription termination occurs through two primary mechanisms: Rho-independent and Rho-dependent. Rho-independent termination, also known as intrinsic termination, involves the formation of a stable RNA hairpin structure in the nascent transcript, followed by a run of uridine residues (U-run) that weakens the RNA-DNA hybrid, causing RNA polymerase to dissociate from the DNA template.[38] This process does not require additional protein factors and is driven solely by the sequence-specific folding of the RNA and its interaction with the polymerase.[39] In contrast, Rho-dependent termination relies on the Rho protein, a hexameric RNA helicase that binds to specific rut (Rho utilization) sites on the emerging RNA, translocates along the transcript in a 5' to 3' direction using ATP hydrolysis, and unwinds the transcription elongation complex, leading to polymerase release.[39] This mechanism is particularly important for terminating transcription at sites lacking strong intrinsic signals and helps prevent unwanted read-through into downstream genes.[40] In eukaryotes, transcription termination by RNA polymerase II (Pol II) is more complex and tightly linked to the processing of the primary transcript. 
Termination is triggered by the polyadenylation signal (typically AAUAAA) located in the 3' untranslated region of the pre-mRNA, which causes Pol II to pause approximately 1-2 kilobases downstream.[41] The cleavage and polyadenylation specificity factor (CPSF) complex then recognizes this signal and recruits endonucleases, such as CPSF-73, to cleave the RNA at the poly(A) site, separating the upstream pre-mRNA from the downstream fragment.[42] The 5'-3' exoribonuclease Xrn2 (also known as Rat1 in yeast) subsequently degrades the downstream cleaved RNA, acting as a "torpedo" that catches up to the paused Pol II, destabilizes the elongation complex, and promotes polymerase release through allosteric changes and dephosphorylation of the C-terminal domain.[43] This process ensures efficient termination and prevents the production of aberrant extended transcripts.[44] The primary transcript, often referred to as pre-mRNA or heterogeneous nuclear RNA (hnRNA) in eukaryotes, is the initial, unprocessed product of transcription that includes both exons and introns, along with extended 5' and 3' untranslated regions beyond the mature mRNA boundaries.[45] In prokaryotes, the primary transcript is typically mature mRNA without introns, but in eukaryotes, it encompasses the full gene sequence transcribed by Pol II, with exons representing the coding and regulatory segments (averaging 50-250 base pairs each) interspersed by introns that can span hundreds to thousands of base pairs.[46] Eukaryotic primary transcripts can reach lengths of up to 100 kilobases or more, reflecting the expansive intron content that constitutes about 95% of the total in many protein-coding genes.[47] These transcripts also feature temporary 5' extensions from promoter-proximal regions and 3' extensions downstream of the poly(A) site, which are later trimmed during processing.[48] Transcription termination is functionally coupled to pre-mRNA processing in eukaryotes to enhance efficiency and fidelity, 
with termination factors like CPSF recruiting processing machinery such as capping enzymes and splicing components during elongation.[49] This co-transcriptional integration ensures that 3' end cleavage facilitates Xrn2-mediated termination while simultaneously enabling polyadenylation and export signals, reducing the risk of defective transcripts.[50] In prokaryotes, termination more directly coordinates with translation initiation, but the eukaryotic coupling underscores the compartmentalized nature of gene expression.[44]

Processing

5' Capping and Export Signals

The 5' capping of messenger RNA (mRNA) occurs co-transcriptionally shortly after transcription initiation, typically when the nascent transcript reaches a length of 20-30 nucleotides, allowing the 5' end to emerge from the RNA polymerase II (Pol II) exit channel.[50] This process begins with the RNA triphosphatase removing the γ-phosphate from the 5' triphosphate end of the pre-mRNA, followed by the guanylyltransferase component of the capping enzyme (CE, also known as RNGTT in humans) transferring a guanosine monophosphate (GMP) moiety from GTP to form an unusual 5'-5' triphosphate linkage, resulting in GpppN at the 5' end.[51] Subsequent methylation steps involve the RNA guanine-7-methyltransferase (RNMT) adding a methyl group to the N7 position of the guanosine to produce the cap 0 structure (m7GpppN), while cap methyltransferases 1 and 2 (CMTR1 and CMTR2) catalyze 2'-O-ribose methylation of the first and second transcribed nucleotides, respectively, yielding the cap 1 (m7GpppNm) and cap 2 (m7GpppNmNm) structures that contribute to mRNA stability and function.[52] These enzymes associate directly with the phosphorylated C-terminal domain of Pol II and the paused elongation complex, ensuring efficient coupling of capping to transcription.[53] The primary functions of the 5' cap include protecting the mRNA from degradation by 5' to 3' exonucleases, such as Xrn1, thereby enhancing transcript stability during processing and export.[54] Additionally, the cap promotes translation initiation by serving as a binding site for the eukaryotic initiation factor 4E (eIF4E), which is part of the eIF4F complex that recruits the 40S ribosomal subunit to the mRNA 5' end, facilitating scanning to the start codon.[55] This cap-eIF4E interaction is critical for efficient ribosome recruitment and is modulated by phosphorylation of 4E-BP proteins, underscoring the cap's role in translational control.[56] In terms of export signals, the mature 5' cap is immediately recognized by the nuclear cap-binding complex (CBC), 
composed of CBP80 (NCBP1) and CBP20 (NCBP2), which binds the m7G structure with high affinity and shields it from exonucleases while recruiting the TREX (transcription-export) complex.[57] The CBC-TREX interaction, mediated by components like ALYREF, couples the capped mRNA to the nuclear pore complex for passage into the cytoplasm, ensuring that only properly capped transcripts are exported efficiently.[58] This cap-dependent signaling also briefly coordinates with splicing factors to promote intron removal, though the primary export linkage occurs via CBC.[59] Unlike eukaryotic mRNA, prokaryotic transcripts lack a 5' cap due to the absence of Pol II-like capping machinery, relying instead on direct binding of the 30S ribosomal subunit to the Shine-Dalgarno sequence upstream of the start codon for translation initiation without cap-mediated protection or recruitment.[6]

Splicing and Intron Removal

Splicing is a critical processing step in eukaryotic cells that removes non-coding introns from pre-mRNA and ligates the coding exons to produce mature mRNA. This process is carried out by the spliceosome, a large ribonucleoprotein complex composed of five small nuclear ribonucleoproteins (snRNPs: U1, U2, U4, U5, and U6, with U4 and U6 joining as a paired U4/U6 particle) and numerous associated proteins. The spliceosome assembles stepwise on the pre-mRNA, recognizing specific sequence motifs at intron boundaries and internal sites to ensure precise excision and joining. Spliceosome assembly begins with the recognition of the 5' splice site by the U1 snRNP, whose snRNA base-pairs with the sequence around the conserved GU dinucleotide at the exon-intron junction, adhering to the GU rule established from early sequence analyses of splice junctions.[60] Subsequently, the U2 snRNP binds the branch point sequence, typically located 20–50 nucleotides upstream of the 3' splice site and featuring a conserved adenine (A) residue within a YNCURAC consensus (where Y is pyrimidine, N any nucleotide, R purine), forming base pairs with U2 snRNA to stabilize the commitment complex. The 3' splice site is marked by an AG dinucleotide, also recognized through interactions involving U2 and later U5 snRNPs, completing the early recognition phase.[60] The splicing mechanism proceeds via two sequential transesterification reactions. 
In the first step, the 2'-OH group of the branch point adenine attacks the phosphodiester bond at the 5' splice site, cleaving the 5' exon and forming a lariat intermediate where the intron is looped via a 2'-5' phosphodiester bond.[61] The second transesterification involves the 3'-OH of the freed 5' exon attacking the 3' splice site, ligating the exons and releasing the lariat intron.[61] These reactions are catalyzed within the spliceosome's active site, with Prp8 serving as a central scaffold protein that positions substrates and coordinates catalysis across both steps.[62] Prp16, an ATPase associated with the U5 snRNP, drives conformational rearrangements and proofreading after the first step to ensure fidelity before the second transesterification. Alternative splicing allows a single pre-mRNA to generate multiple mRNA isoforms by varying exon inclusion, such as through exon skipping, mutually exclusive exons, or intron retention, thereby expanding proteome diversity. In humans, approximately 95% of multi-exon genes undergo alternative splicing, producing numerous isoforms that can differ in function, localization, or stability.[63] This regulation often involves sequence elements like exonic or intronic splicing enhancers/silencers and is influenced by splicing factors that modulate splice site choice during spliceosome assembly. In contrast to spliceosomal splicing, certain introns in organellar genomes, such as those in mitochondria and chloroplasts, can undergo self-splicing without proteins, relying on the RNA's intrinsic ribozyme activity. 
Group I introns, common in fungal and plant organelles, initiate splicing with an exogenous guanosine cofactor attacking the 5' splice site, followed by exon ligation, as first demonstrated in Tetrahymena rRNA. Group II introns, prevalent in bacterial and organellar genomes, mirror the spliceosomal pathway more closely by forming a lariat intermediate via branch point attack, with self-splicing observed in yeast mitochondrial introns. These self-splicing mechanisms highlight evolutionary links between ancient ribozymes and the modern spliceosome.
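
The net outcome of spliceosomal splicing (intron excision and exon ligation under the GU...AG rule) can be sketched as plain string manipulation. This is only a bookkeeping model: it takes intron coordinates as given, checks the boundary dinucleotides, and ignores the branch point, the lariat intermediate, and splice-site selection. The function name and coordinate convention (0-based, end-exclusive) are assumptions for the example.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) pairs and join the exons.

    Raises ValueError if an intron does not begin with GU and end with AG.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} violates the GU...AG rule")
        exons.append(pre_mrna[cursor:start])  # exon upstream of this intron
        cursor = end
    exons.append(pre_mrna[cursor:])  # final exon
    return "".join(exons)


exon1, intron, exon2 = "AUGGCC", "GUAAGUUUACUAACUUUUCAG", "AAGUAA"
pre_mrna = exon1 + intron + exon2
intron_span = (len(exon1), len(exon1) + len(intron))
print(splice(pre_mrna, [intron_span]))  # AUGGCCAAGUAA
```

Passing different sets of intron coordinates for the same pre-mRNA gives a crude picture of alternative splicing, since each set yields a different mature isoform.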

Polyadenylation and 3' End Formation

In eukaryotic mRNA processing, the polyadenylation signal, typically the hexanucleotide sequence AAUAAA located 10-30 nucleotides upstream of the cleavage site, is recognized by the cleavage and polyadenylation specificity factor (CPSF) complex.[64] Downstream of this signal, GU- or U-rich elements, situated approximately 20-30 nucleotides after the AAUAAA motif, are bound by the cleavage stimulation factor (CstF), which helps position the cleavage machinery.[64] The pre-mRNA is then cleaved endonucleolytically by the CPSF-associated endonuclease between these signals, generating the 3' end for subsequent polyadenylation.[8] Following cleavage, poly(A) polymerase (PAP) catalyzes the addition of a poly(A) tail, consisting of approximately 200-250 adenine residues, to the newly exposed 3' hydroxyl group.[65] The nuclear poly(A)-binding protein 1 (PABPN1) binds to the growing tail, stimulating PAP activity and ensuring controlled elongation until the optimal length is reached, after which it inhibits further addition to prevent over-adenylation.[65] This length regulation is critical, as tails shorter or longer than this range can impair mRNA function.[21] The poly(A) tail serves multiple essential functions in mRNA maturation and utilization. 
It promotes nuclear export by facilitating the recruitment of export adaptors such as ALYREF, which links the mRNA to the NXF1/NXT1 export receptor at the nuclear pore complex.[66] In the cytoplasm, the tail bound by cytoplasmic poly(A)-binding protein (PABP) enhances mRNA stability by shielding the 3' end from exonucleolytic degradation, thereby extending the mRNA's half-life. Additionally, PABP interacts with the translation initiation factor eIF4G, forming a closed-loop structure with the 5' cap that stimulates ribosome recruitment and enhances translation efficiency.[67] A notable variant occurs in replication-dependent histone mRNAs, which lack a poly(A) tail and instead terminate in a conserved stem-loop structure formed through a distinct processing pathway.[68] In this case, the U7 small nuclear ribonucleoprotein (snRNP) recognizes a specific binding site downstream of the stem-loop and directs endonucleolytic cleavage to generate the mature 3' end, which regulates histone mRNA stability in a cell cycle-dependent manner.[68]
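
The cleavage-and-polyadenylation sequence of events can be approximated in a few lines. In this sketch the cleavage site is placed a fixed number of nucleotides after the first AAUAAA hexamer; the default offset of 20 is simply a value inside the 10-30 nucleotide window mentioned above, and the default tail length of 200 is likewise illustrative. Downstream GU- or U-rich elements are not modeled, and the function name is hypothetical.

```python
def cleave_and_polyadenylate(pre_mrna, offset=20, tail_length=200,
                             signal="AAUAAA"):
    """Cut `offset` nt after the first poly(A) signal and append a tail."""
    signal_index = pre_mrna.find(signal)
    if signal_index == -1:
        raise ValueError("no polyadenylation signal found")
    cut = signal_index + len(signal) + offset
    if cut > len(pre_mrna):
        raise ValueError("transcript ends before the assumed cleavage site")
    # Everything downstream of the cut is discarded; the tail is untemplated.
    return pre_mrna[:cut] + "A" * tail_length


pre = "GGC" + "AAUAAA" + "C" * 40 + "GUGUGU"
mature = cleave_and_polyadenylate(pre, offset=20, tail_length=50)
print(len(mature))  # 79
```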

RNA Editing

RNA editing refers to post-transcriptional enzymatic modifications that alter the nucleotide sequence of messenger RNA (mRNA), thereby expanding the proteome and influencing gene expression without changing the genomic DNA.[69] These changes are catalyzed by deaminase enzymes and occur primarily in specific contexts to fine-tune protein function, stability, and regulatory interactions.[70] The most prevalent form of RNA editing in eukaryotes is adenosine-to-inosine (A-to-I) editing, mediated by adenosine deaminases acting on RNA (ADAR) enzymes, which deaminate adenosine residues to inosine; during translation, inosine is recognized as guanosine (G) by the ribosome.[71] ADAR1, ADAR2, and ADAR3 are the primary enzymes involved, with ADAR1 and ADAR2 being catalytically active; ADAR2 is particularly abundant in the brain, where it contributes to transcriptome diversity by editing neuronal mRNAs, such as those encoding glutamate receptors, to modulate synaptic plasticity and neurotransmitter signaling.[72] A-to-I editing often targets double-stranded RNA structures formed by inverted Alu repeats in primates, leading to synonymous or nonsynonymous codon changes that can affect protein isoforms.[73] In contrast, cytidine-to-uridine (C-to-U) editing is less common and primarily mediated by the APOBEC1 enzyme, which deaminates cytidine to uridine in specific mRNA targets.[74] A canonical example occurs in the apolipoprotein B (apoB) mRNA in the mammalian small intestine, where APOBEC1, in complex with cofactors like ACF, edits a CAA codon to UAA at position 6666, introducing a premature stop codon that truncates the protein to produce the shorter ApoB48 isoform essential for lipid transport, rather than the full-length ApoB100.[75] This editing is tissue-specific and requires mooring sequence elements downstream of the target cytidine for enzyme recruitment.[76] Genome-wide studies have identified thousands of RNA editing sites in the human transcriptome, with over 14,000 
A-to-I sites in more than 1,400 mRNAs reported early on, predominantly in Alu elements, though recent analyses reveal up to 189,000 cell-type-specific sites, particularly in the brain.[77][78] These edits influence various processes, including alternative splicing by altering splice site recognition, coding sequence changes that modify protein function, and modulation of microRNA binding sites to affect mRNA stability and translation.[70] For instance, A-to-I editing recodes ion channel subunits: editing at the Q/R site of the GluA2 glutamate receptor subunit replaces glutamine with arginine and thereby reduces the calcium permeability of the channel in neurons.[71] Regulation of RNA editing occurs at multiple levels, with ADAR enzymes localized differently: nuclear isoforms like ADAR1-p110 edit pre-mRNAs, potentially integrating with splicing machinery to influence exon inclusion, while cytoplasmic forms such as ADAR1-p150 target mature mRNAs or viral RNAs.[79] Dysregulation is linked to diseases; for example, mutations or downregulation of ADAR2 lead to inefficient editing of the GluA2 receptor Q/R site in amyotrophic lateral sclerosis (ALS), causing excitotoxicity in motor neurons and contributing to neurodegeneration.[80]
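
Both editing types can be illustrated at the level of a single codon. The sketch below assumes the standard genetic code and treats inosine (written here as "I") as guanosine during decoding, as described above; the helper names are hypothetical. The two examples mirror the apoB CAA-to-UAA edit and a CAG-to-CIG edit of the kind found at the GluA2 Q/R site.

```python
from itertools import product

# Standard genetic code in U, C, A, G order ("*" marks a stop codon).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}


def edit_base(mrna, position, expected, replacement):
    """Replace one base after checking that the target base is as expected."""
    if mrna[position] != expected:
        raise ValueError(f"expected {expected} at position {position}")
    return mrna[:position] + replacement + mrna[position + 1:]


def c_to_u(mrna, position):
    """Cytidine deamination (APOBEC1-type editing)."""
    return edit_base(mrna, position, "C", "U")


def a_to_i(mrna, position):
    """Adenosine deamination (ADAR-type editing); inosine is written as I."""
    return edit_base(mrna, position, "A", "I")


def decode_codon(codon):
    """Decode a codon, reading inosine as guanosine."""
    return CODON_TABLE[codon.replace("I", "G")]


print(decode_codon("CAA"), decode_codon(c_to_u("CAA", 0)))  # Q *
print(decode_codon("CAG"), decode_codon(a_to_i("CAG", 1)))  # Q R
```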

Translation

Initiation Complex Formation

In prokaryotes, translation initiation commences with the binding of the 30S ribosomal subunit to the messenger RNA (mRNA) at the Shine-Dalgarno (SD) sequence located in the 5' untranslated region (UTR), which base-pairs with the complementary anti-Shine-Dalgarno (ASD) sequence (CCUCC) near the 3' end of the 16S ribosomal RNA (rRNA).[81] This interaction aligns the start codon, typically AUG, in proximity to the ribosomal P site, ensuring accurate positioning for initiator tRNA binding.[82] The process is facilitated by three initiation factors: IF1, which occupies the A site to block non-initiator tRNAs and stabilize the 30S subunit; IF3, which promotes mRNA binding and prevents premature association with the 50S subunit to maintain fidelity; and IF2, a GTPase that delivers the initiator formylmethionyl-tRNA^fMet (fMet-tRNA^fMet) to the AUG codon in the P site.[83] Upon GTP hydrolysis by IF2, the 50S subunit joins to form the complete 70S initiation complex, releasing the initiation factors. 
In eukaryotes, initiation begins with the assembly of the 43S preinitiation complex (PIC), comprising the 40S ribosomal subunit associated with eukaryotic initiation factors (eIFs) eIF1, eIF1A, and eIF3, along with the ternary complex of eIF2-GTP-bound initiator methionyl-tRNA^i (Met-tRNA^i).[84] eIF2 specifically recognizes and stabilizes Met-tRNA^i in the ternary complex, delivering it to the 40S subunit's P site in a partially accommodated orientation.[85] The 43S PIC is then recruited to the mRNA's 5' cap structure (m^7GpppN) via the eIF4F complex, which includes the cap-binding protein eIF4E, the multifunctional scaffold eIF4G, and the ATP-dependent RNA helicase eIF4A; eIF4G bridges eIF4E and eIF3 to tether the ribosome to the mRNA.[86] From this cap-bound position, the 43S PIC scans the 5' UTR in a 5'-to-3' direction, unwinding secondary structures with eIF4A's helicase activity, until it identifies the start AUG codon.[87] Optimal recognition of the eukaryotic start codon depends on its surrounding sequence context, known as the Kozak consensus: GCCRCCAUGG, where R denotes a purine (A or G) at the -3 position relative to the AUG, and the +4 position is preferably G; this motif enhances initiation efficiency by stabilizing codon-anticodon pairing and PIC accommodation.[88] Mutations deviating from this consensus reduce translation accuracy and efficiency, as the -3 purine and +4 G positions interact directly with ribosomal elements and eIFs to promote GTP hydrolysis by eIF2 and release of eIFs.[88] The 5' UTR influences this scanning process by providing binding sites for regulatory factors that modulate ribosome movement. While most eukaryotic mRNAs rely on this cap-dependent scanning mechanism, certain viral mRNAs and cellular transcripts under stress conditions utilize internal ribosome entry sites (IRES) for cap-independent initiation. 
IRES elements, often complex RNA structures in the 5' UTR, directly recruit the 40S subunit and associated eIFs to an internal AUG without prior cap binding or scanning, enabling translation when cap-dependent pathways are inhibited, as first demonstrated in poliovirus RNA. This alternative mode supports viral replication and cellular adaptation to stressors like hypoxia.[89]
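
The two positions singled out in the Kozak consensus (a purine at -3 and a G at +4, counting the A of AUG as +1) can be checked with a short function. The three-level labels "strong", "adequate", and "weak" are a common shorthand adopted here as an assumption; the source text does not define them, and the function name is hypothetical.

```python
def kozak_context(mrna, aug_index):
    """Classify the start-codon context from the -3 and +4 positions."""
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG codon")
    # -3 is three nucleotides upstream of the A; +4 follows the G of AUG.
    minus3 = mrna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else ""
    matches = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[matches]


print(kozak_context("GCCACCAUGG", 6))  # strong
print(kozak_context("UUUUUUAUGC", 6))  # weak
```

A scanning model could combine this check with a left-to-right search for AUG codons, skipping weak sites with some probability, but that behavior is not implemented here.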

Elongation and Codon Decoding

During elongation, the ribosome moves along the mRNA in the 5' to 3' direction, adding amino acids to the growing polypeptide chain one at a time. This process begins after the formation of the initiation complex, where the initiator tRNA occupies the P site and the A site is empty. Each cycle of elongation involves decoding of the mRNA codon in the A site, formation of a peptide bond, and translocation of the mRNA and tRNAs relative to the ribosome. Codon-anticodon pairing occurs when the anticodon of an incoming aminoacyl-tRNA (aa-tRNA) base-pairs with the mRNA codon in the ribosomal A site. According to the wobble hypothesis, the third position of the codon allows for non-standard base pairing, enabling a single tRNA to recognize multiple synonymous codons and reducing the number of required tRNAs. This flexibility arises from modifications in the anticodon's first position (corresponding to the codon's third), such as inosine pairing with A, C, or U.[90] Selection of the cognate aa-tRNA for the A-site codon is facilitated by elongation factors. In prokaryotes, elongation factor Tu (EF-Tu) forms a ternary complex with GTP and aa-tRNA, delivering it to the A site where codon recognition induces GTP hydrolysis, releasing EF-Tu-GDP and allowing accommodation of the aa-tRNA. In eukaryotes, the homologous elongation factor 1A (eEF1A) performs an analogous role, binding GTP and aa-tRNA to promote accurate decoding via induced-fit conformational changes in the ribosome upon cognate pairing. GTP hydrolysis by eEF1A ensures fidelity through kinetic proofreading, rejecting near-cognate tRNAs.[91][92] Peptide bond formation is catalyzed by the peptidyl transferase center (PTC) in the large ribosomal subunit, which is composed entirely of ribosomal RNA (rRNA) acting as a ribozyme. 
The 23S rRNA in prokaryotes (or 28S rRNA in eukaryotes) positions the peptidyl-tRNA in the P site and the aa-tRNA in the A site, facilitating nucleophilic attack by the A-site amino group on the P-site ester bond without requiring protein catalysis. This rRNA-mediated reaction transfers the nascent peptide chain to the A-site aa-tRNA.[93] Following peptide bond formation, translocation shifts the deacylated tRNA to the E site, the peptidyl-tRNA to the P site, and advances the mRNA by one codon to expose the next codon in the A site. In prokaryotes, elongation factor G (EF-G), bound to GTP, binds the ribosome and promotes this movement; GTP hydrolysis by EF-G accelerates the conformational changes in the ribosome and tRNAs, resolving hybrid states and ensuring efficient translocation. The eukaryotic counterpart, elongation factor 2 (eEF2), operates similarly, using GTP hydrolysis to drive tRNA and mRNA movement within the 80S ribosome.[94] The speed of elongation varies between organisms and is influenced by codon usage. In prokaryotes, ribosomes typically incorporate 10-20 amino acids per second under optimal conditions. Eukaryotic translation is generally slower, at approximately 5-6 amino acids per second, with additional pauses at rare codons due to limited availability of corresponding tRNAs, which can regulate co-translational protein folding and quality control.[95][96][97]
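The wobble pairing and elongation rates described above can be illustrated with a short sketch. It is a simplification under stated assumptions: the pairing table follows Crick's classic wobble rules (only the inosine row comes from the text; the other rows are the textbook rules and ignore the anticodon modifications that narrow or widen pairing in real tRNAs), and the function names are hypothetical.

```python
# Watson-Crick partners used for the two non-wobble positions (RNA alphabet).
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Classic wobble rules (assumed, simplified): anticodon 5' base -> codon 3' bases
# it can read. "I" stands for inosine.
WOBBLE = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"A", "C", "U"},
}


def anticodon_reads(anticodon: str, codon: str) -> bool:
    """Return True if the anticodon (written 5'->3') can decode the codon (5'->3')."""
    if len(anticodon) != 3 or len(codon) != 3:
        raise ValueError("codon and anticodon must each be three bases long")
    # Pairing is antiparallel: codon positions 1, 2, 3 face anticodon positions 3, 2, 1.
    first_ok = WATSON_CRICK.get(anticodon[2]) == codon[0]
    second_ok = WATSON_CRICK.get(anticodon[1]) == codon[1]
    wobble_ok = codon[2] in WOBBLE.get(anticodon[0], set())
    return first_ok and second_ok and wobble_ok


def codons_read_by(anticodon: str) -> list[str]:
    """List every codon that the given anticodon can decode under the table above."""
    bases = "ACGU"
    return sorted(
        a + b + c
        for a in bases
        for b in bases
        for c in bases
        if anticodon_reads(anticodon, a + b + c)
    )


def elongation_seconds(n_codons: int, rate_aa_per_s: float) -> float:
    """Idealized time to translate n_codons at a constant rate, ignoring pauses."""
    return n_codons / rate_aa_per_s
```

Under these rules an anticodon 5'-IGC-3' reads GCU, GCC, and GCA but not GCG, and a hypothetical 300-codon open reading frame translated at a constant 15 amino acids per second would take about 20 seconds; at the slower eukaryotic rates quoted above it would take roughly three times as long.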

Termination and Ribosome Release

Translation termination occurs when the ribosome encounters one of three stop codons (UAA, UAG, or UGA) in the mRNA, signaling the end of protein synthesis and triggering the release of the completed polypeptide chain.[98] In prokaryotes, RF1 recognizes UAA and UAG, while RF2 recognizes UAA and UGA; both promote hydrolysis of the ester bond linking the nascent peptide to the tRNA in the P site.[99] In eukaryotes, eRF1 decodes all three stop codons and promotes the hydrolysis, functioning in a ternary complex with GTP-bound eRF3, a GTPase that enhances termination efficiency.[100] GTP hydrolysis by RF3 or eRF3 promotes the dissociation of the class I release factors (RF1/RF2 or eRF1) from the ribosome, ensuring rapid progression to the next phase.[101]

Following peptide release, the post-termination ribosomal complex must be disassembled to recycle components for new rounds of translation. In eukaryotes and archaea, the ATP-binding cassette protein ABCE1 plays a central role in splitting the ribosome into its small and large subunits (40S and 60S in eukaryotes), facilitating the release of the deacylated tRNA and mRNA; bacteria lack ABCE1 and instead rely on the ribosome recycling factor (RRF) together with EF-G to separate the 30S and 50S subunits.[102] In eukaryotes, this process is assisted by initiation factors such as eIF1 and eIF1A, which help in subunit separation and prevent premature reinitiation on the same mRNA.[103] The freed mRNA can then be recycled for additional translation cycles or marked for decay, depending on cellular conditions.[104]

Although stop codons generally halt translation, certain mechanisms allow read-through in specific contexts. Suppressor tRNAs, whose (usually mutant) anticodons are complementary to stop codons, can occasionally insert an amino acid and allow translation to continue, though this is rare.[105] A notable exception is the recoding of UGA as selenocysteine in selenoproteins, where a selenocysteine insertion sequence (SECIS) element, located in the 3' untranslated region in eukaryotes and immediately downstream of the UGA codon in bacteria, recruits selenocysteyl-tRNA^Sec and a dedicated elongation factor (SelB in bacteria, eEFSec in eukaryotes) to decode UGA without terminating translation.[106] Such programmed recoding is essential for incorporating this rare amino acid and exemplifies how mRNA context can override standard termination signals.[107]
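As a rough illustration of the stop-codon assignments above, the following sketch maps each stop codon to the class I release factor(s) named in the text and finds the first in-frame stop in a reading frame. The function names and the `recode_uga` flag are hypothetical; in particular, the flag treats every UGA as selenocysteine, which is a deliberate oversimplification, since real recoding depends on a SECIS element and dedicated factors.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Bacterial class I release factors and the stop codons each recognizes.
BACTERIAL_FACTORS = {
    "RF1": {"UAA", "UAG"},
    "RF2": {"UAA", "UGA"},
}


def class_one_factors(codon: str, eukaryote: bool = False) -> list[str]:
    """Return the class I release factor(s) that recognize a codon ([] for sense codons)."""
    if codon not in STOP_CODONS:
        return []
    if eukaryote:
        return ["eRF1"]  # eRF1 decodes all three stop codons
    return sorted(name for name, codons in BACTERIAL_FACTORS.items() if codon in codons)


def first_stop(orf: str, recode_uga: bool = False):
    """Return (codon_index, codon) for the first in-frame stop, reading from position 0.

    With recode_uga=True every UGA is read through (a crude stand-in for
    selenocysteine recoding). Returns None if no stop codon is found.
    """
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon in STOP_CODONS:
            if codon == "UGA" and recode_uga:
                continue
            return i // 3, codon
    return None
```

For the toy sequence `AUGGCUUGAGCAUAA`, translation would stop at the UGA in the third codon by default, but would continue to the UAA in the fifth codon when UGA is recoded.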

Localization and Stability

Nuclear Export Mechanisms

In eukaryotic cells, the nuclear export of messenger RNA (mRNA) is a tightly regulated process that ensures only mature, properly processed transcripts are transported from the nucleus to the cytoplasm for translation. This translocation occurs through nuclear pore complexes (NPCs), large protein assemblies embedded in the nuclear envelope, and involves the formation of export-competent messenger ribonucleoprotein (mRNP) particles. The process is essential for gene expression, as it separates transcription in the nucleus from translation in the cytoplasm, preventing premature translation of immature mRNAs.[108]

A key player in this pathway is the TREX (transcription-export) complex, a conserved multisubunit assembly that couples mRNA transcription and processing to nuclear export. In yeast and mammals, the TREX complex, which includes the THO subcomplex and the RNA helicase Sub2 (UAP56 in humans), is recruited to nascent mRNA during transcription elongation and splicing. This recruitment facilitates the loading of the primary mRNA export receptor, NXF1 (Mex67 in yeast) bound to NXT1 (Mtr2 in yeast), onto the mRNP, directing it to the nuclear basket of the NPC for translocation. The THO component of TREX prevents R-loop formation during transcription, ensuring smooth handover to export factors, while Sub2 unwinds secondary structures to promote NXF1 binding. Mutations in TREX components disrupt mRNA export, leading to nuclear accumulation of transcripts and cellular defects.[109][110][111]

Directionality of mRNA export is achieved independently of the classical Ran-GTP gradient that drives most nuclear transport, relying instead on asymmetric localization of factors and ATP-dependent remodeling at the NPC. Although the Ran-GTP/GDP gradient maintains overall nuclear-cytoplasmic asymmetry, NXF1-NXT1-mediated translocation of mRNPs does not directly require Ran-GTP. Instead, the DEAD-box ATPase Dbp5 (DDX19 in humans), anchored to the cytoplasmic fibrils of the NPC via Nup159 (Nup214 in humans), uses ATP hydrolysis to remodel mRNP complexes upon arrival at the cytoplasmic side, releasing mature mRNA into the cytoplasm and allowing export factors to recycle back to the nucleus. This Dbp5 cycle, stimulated by Gle1 and inositol hexakisphosphate (IP6), ensures unidirectional transport and prevents back-diffusion of mRNPs.[112][113][114]

Quality control during export is mediated by the exon junction complex (EJC), a multiprotein assembly deposited 20-24 nucleotides upstream of exon-exon junctions by the splicing machinery. The EJC, consisting of core components eIF4A3, MAGOH, Y14, and MLN51, marks spliced mRNAs as export-competent and distinguishes them from unspliced or aberrantly processed transcripts, which are retained in the nucleus. EJCs recruit TREX and NXF1, enhancing export efficiency, and also flag mRNAs for post-export surveillance, such as nonsense-mediated decay (NMD) if premature stop codons are detected. This mechanism helps ensure that only high-quality mRNAs proceed to translation.[115][116]

The 5' cap and poly(A) tail, added during processing, also facilitate export by serving as binding sites for adaptor proteins such as CBP80 and PABPN1, which indirectly link to NXF1 and promote mRNP remodeling. Prokaryotes lack a nucleus and therefore have no nuclear export step; instead, transcription and translation are directly coupled in the cytoplasm, with ribosomes binding nascent mRNA as it emerges from RNA polymerase.[117]

Cytoplasmic Trafficking and Localization

Once in the cytoplasm, messenger RNAs (mRNAs) are assembled into messenger ribonucleoprotein (mRNP) complexes that facilitate their trafficking and localization to specific subcellular sites, enabling spatially restricted protein synthesis. This process is crucial for cellular asymmetry, such as in polarized cells like neurons and oocytes. Localization signals, often termed "zipcodes," are primarily located in the 3' untranslated region (3' UTR) of mRNAs and serve as recognition motifs for RNA-binding proteins (RBPs) that direct mRNPs to target destinations. For instance, the β-actin mRNA contains a 54-nucleotide zipcode in its 3' UTR that binds the RBP ZBP1 (zipcode-binding protein 1), which mediates transport to neuronal dendrites, supporting actin cytoskeleton dynamics at synaptic sites.[118] These interactions ensure that mRNAs are packaged into transport-competent mRNPs shortly after nuclear export.

Directed transport of mRNPs relies on motor proteins that move along the cytoskeleton, particularly microtubules. Kinesin motors, such as kinesin-1, drive plus-end-directed transport toward the cell periphery, while dynein powers minus-end-directed movement toward the microtubule-organizing center. In asymmetric distribution, these motors coordinate to position mRNAs; for example, in Drosophila oocytes, dynein transports gurken mRNA to the anterior-dorsal region for eggshell patterning, while kinesin-1 relocates oskar mRNA to the posterior pole for germline specification. This motor-driven mechanism is essential for long-distance trafficking in large cells, where mRNPs form granules visible by microscopy and associate with microtubules via adaptor proteins.

mRNP granules play key roles in cytoplasmic regulation and storage during trafficking. Processing bodies (P-bodies) sequester mRNAs for translational repression or decay, acting as hubs for mRNA surveillance and quality control without directly driving localization. Stress granules, induced by cellular stress, temporarily store mRNAs by halting translation, allowing rapid resumption upon stress relief; they often dock with P-bodies, facilitating mRNA exchange and contributing to spatiotemporal control in the cytoplasm. In contrast to directed transport, shorter mRNAs typically rely on passive diffusion for local positioning, whereas longer or structurally complex mRNAs favor active, motor-mediated delivery to overcome cytoplasmic barriers. This dichotomy ensures efficient resource allocation, with diffusion sufficing for uniform distribution and directed mechanisms enabling precise asymmetry.[119]

Degradation

Prokaryotic mRNA Decay Pathways

In prokaryotes, particularly bacteria like Escherichia coli, mRNA decay is a rapid process that ensures quick adaptation to environmental changes, with an average mRNA half-life of approximately 3-7 minutes under exponential growth conditions.[120] This turnover is primarily mediated by a combination of endonucleolytic and exonucleolytic activities, often coupled to translation, and contrasts with the longer-lived eukaryotic mRNAs. The core machinery includes ribonucleases such as RNase E, polynucleotide phosphorylase (PNPase), and RNase II, which collectively degrade mRNA from internal sites or the ends.[121]

A major initiation pathway involves endonucleolytic cleavage by RNase E, a key enzyme in the RNA degradosome complex, which targets unstructured regions, stem-loops, or monosome-bound mRNAs.[122] RNase E preferentially cleaves at A/U-rich sites downstream of the 5' end, often in a translation-independent manner, generating fragments that are subsequently susceptible to exonucleolytic attack.[123] In polycistronic mRNAs, common in bacteria, such cleavages can differentially destabilize individual cistrons, allowing coordinated yet modular gene expression.[121]

Following endonucleolytic cuts or direct 3' end processing, degradation proceeds via 3'-5' exonucleases like PNPase and RNase II, which require an accessible single-stranded 3' end. Unlike eukaryotes, prokaryotic mRNAs lack extensive poly(A) tails; instead, limited polyadenylation by poly(A) polymerase I (PAP I) adds short A-tails that give these exonucleases a foothold for processive degradation.[124] PNPase, a phosphorolytic enzyme, degrades from the 3' end using inorganic phosphate as a co-substrate, while RNase II hydrolyzes phosphodiester bonds but stalls at stem-loops.[121]

An alternative, 5'-end-dependent decay pathway begins with the RNA pyrophosphohydrolase RppH, which converts the 5'-triphosphate end of primary transcripts to a monophosphate, priming the mRNA for attack by ribonucleases that prefer monophosphorylated 5' ends. This RppH-mediated conversion, sometimes likened to eukaryotic decapping, is often translation-coupled, as ribosome protection hinders access, and is followed by enhanced endonucleolytic cleavage primarily by RNase E, with subsequent 3'-5' exonucleolytic degradation by enzymes such as PNPase.[125] The efficiency of this pathway depends on 5' end accessibility and can be modulated by upstream open reading frames or secondary structures.[126]

mRNA stability in bacteria is further regulated by small regulatory RNAs (sRNAs) that base-pair with target mRNAs, often facilitated by the chaperone protein Hfq, leading to enhanced recruitment of RNases like RNase E for accelerated decay.[127] For instance, Hfq-sRNA complexes can expose cleavage sites or block translation, thereby promoting endonucleolytic initiation and shortening mRNA lifespan in response to stress.[128] This post-transcriptional control layer allows fine-tuned regulation without altering transcription rates.[121]
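To make the half-life figures above concrete, the sketch below treats mRNA decay as a simple first-order process. That kinetic model is an assumption made for illustration (the text quotes half-lives but does not specify a model), and the function names and the synthesis rate used in the note are hypothetical.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order decay constant k (per minute) for a given half-life: k = ln(2) / t_half."""
    return math.log(2) / half_life_min


def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA population left after t_min minutes with no new synthesis."""
    return 0.5 ** (t_min / half_life_min)


def steady_state_level(synthesis_per_min: float, half_life_min: float) -> float:
    """Steady-state copy number when constant synthesis balances first-order decay."""
    return synthesis_per_min / decay_constant(half_life_min)
```

With a 5-minute half-life, for example, a quarter of a transcript population would remain 10 minutes after transcription stops, and a gene transcribed at a hypothetical 2 copies per minute would settle at roughly 14 copies per cell. Short half-lives therefore let bacterial transcript levels track changes in transcription within minutes.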

Eukaryotic mRNA Turnover Processes

In eukaryotic cells, mRNA turnover is a tightly regulated process that determines transcript stability and gene expression levels, primarily occurring in the cytoplasm through a deadenylation-dependent pathway that contrasts with the more rapid, translation-coupled decay seen in prokaryotes. This basal degradation pathway ensures the removal of mRNAs after their functional lifespan, recycling nucleotides and preventing accumulation of potentially harmful transcripts.

The initial and rate-limiting step in most eukaryotic mRNA decay is deadenylation, in which the poly(A) tail is progressively shortened by deadenylase complexes. The CCR4-NOT complex, a major multi-subunit deadenylase, plays a central role: it is recruited to the mRNA via interactions with poly(A)-binding proteins (PABPs) and removes adenylate residues through its catalytic subunits Ccr4 and Caf1 (also known as Pop2).[129] This process typically reduces the poly(A) tail length from over 200 nucleotides to a stub of 10-20 adenines, which destabilizes the mRNA and triggers subsequent decay steps.

Shortening of the poly(A) tail promotes decapping, the hydrolysis of the 5' cap structure (m7GpppN) by the Dcp1/Dcp2 heterodimeric enzyme complex. Dcp2 provides the catalytic activity, while Dcp1 acts as a cofactor that enhances decapping efficiency and recruits other decay factors.[130] Once the cap is removed, the mRNA body becomes accessible to the 5'-3' exoribonuclease Xrn1, which rapidly and processively degrades the transcript from the 5' end.[130] This decapping-dependent 5'-3' pathway accounts for the majority of bulk mRNA turnover in eukaryotes.

In parallel or as an alternative route, particularly for aberrant or unadenylated mRNAs, the RNA exosome complex mediates 3'-5' exonucleolytic degradation. The cytoplasmic exosome, assisted by the cofactor Ski7, targets non-polyadenylated or prematurely deadenylated transcripts for surveillance and decay, ensuring quality control of defective mRNAs.[131] In the nucleus, the exosome-associated exonuclease Rrp6 (EXOSC10 in humans) contributes to the processing and degradation of aberrant transcripts before export, further supporting mRNA surveillance.[132]

Eukaryotic mRNA half-lives vary widely, ranging from a few minutes to many hours and, in some cases, days, allowing for fine-tuned control of protein synthesis. Factors such as codon bias (optimal codons correlate with increased stability) and mRNA secondary structure in the coding sequence influence decay rates by modulating translation efficiency and accessibility to decay factors.[133][134] For instance, mRNAs enriched in optimal codons exhibit longer half-lives, while structured regions can protect against rapid degradation.[135]

Regulatory Decay Mechanisms

Messenger RNA (mRNA) degradation serves as a critical regulatory mechanism to fine-tune gene expression by targeting specific transcripts for rapid decay under physiological conditions. These pathways, distinct from constitutive turnover, respond to sequence features or cellular signals to selectively eliminate mRNAs, thereby controlling protein levels in processes like development, stress response, and immune regulation. Key examples include surveillance systems that detect aberrant transcripts and RNA interference pathways that silence endogenous or viral genes.

Nonsense-mediated decay (NMD) is a quality control pathway that targets mRNAs containing premature termination codons (PTCs) for degradation, preventing the production of truncated proteins. In eukaryotes, NMD recognizes PTCs located more than about 50 nucleotides upstream of an exon-exon junction, where an exon junction complex (EJC) deposited during splicing remains bound downstream of the terminating ribosome. The UPF1 RNA helicase, along with UPF2 and UPF3, forms a complex that interacts with the EJC; UPF2 and UPF3 bridge UPF1 to the EJC, stimulating UPF1's helicase activity to unwind the mRNA and recruit decay factors.[136][137]

AU-rich elements (AREs), often found in the 3' untranslated regions (UTRs) of mRNAs encoding cytokines and proto-oncogenes, mediate rapid decay to limit inflammatory responses. The zinc finger protein tristetraprolin (TTP) binds directly to these AREs, such as in tumor necrosis factor-alpha (TNF-α) mRNA, and recruits the deadenylation machinery to shorten the poly(A) tail, thereby promoting decapping and exonucleolytic degradation. This TTP-ARE interaction is regulated by phosphorylation, which modulates TTP's binding affinity and decay-promoting activity.[138][139]

MicroRNAs (miRNAs) regulate gene expression post-transcriptionally by guiding the RNA-induced silencing complex (RISC), which includes Argonaute proteins, to partially complementary sites in the 3' UTR of target mRNAs. Binding of Argonaute-loaded miRISC to the 3' UTR recruits GW182 (also known as TNRC6), which interacts with deadenylation complexes like CCR4-NOT to trigger poly(A) tail removal, followed by decapping and 5'-to-3' exonucleolytic decay; in animals this decay is typically accompanied by translational repression. This mechanism silences hundreds of genes involved in development and disease.[140][141]

Small interfering RNAs (siRNAs) mediate precise mRNA silencing through RISC in both plants and animals, primarily for antiviral defense and endogenous gene regulation. In plants, siRNAs derived from viral double-stranded RNA direct Argonaute proteins in RISC to cleave complementary viral or endogenous transcripts via perfect base-pairing. In animals, siRNAs contribute to antiviral responses by targeting viral genomes and also silence endogenous transposons or repetitive elements, enhancing genome stability.[142][143]
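The positional rule for nonsense-mediated decay described above lends itself to a small worked sketch. It is a heuristic under stated assumptions, not a predictor of actual NMD: distance is measured from the end of the stop codon to the last exon-exon junction (conventions differ between studies), the 50-nucleotide threshold is taken from the text, and the function names are hypothetical.

```python
def exon_junction_positions(exon_lengths: list[int]) -> list[int]:
    """Positions of exon-exon junctions in the spliced mRNA, as nucleotides from the 5' end."""
    positions = []
    total = 0
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        total += length
        positions.append(total)
    return positions


def predicts_nmd(stop_codon_start: int, exon_lengths: list[int], threshold: int = 50) -> bool:
    """Apply the '50-nucleotide rule' to a stop codon.

    stop_codon_start is the 0-based index of the stop codon's first nucleotide
    in the spliced mRNA. Returns True if the stop codon ends more than
    `threshold` nucleotides upstream of the last exon-exon junction.
    """
    junctions = exon_junction_positions(exon_lengths)
    if not junctions:
        return False  # intronless transcript: no junction, so no EJC-based signal
    stop_end = stop_codon_start + 3  # index just past the stop codon
    return junctions[-1] - stop_end > threshold
```

For a hypothetical transcript with exons of 200, 150, and 300 nucleotides (junctions at positions 200 and 350), a stop codon starting at position 100 would be flagged as premature, whereas one starting at position 320, or anywhere in the last exon, would not.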

Regulation and Functions

Post-Transcriptional Regulation

Post-transcriptional regulation of messenger RNA (mRNA) encompasses mechanisms that fine-tune gene expression after transcription, including spatial localization that controls where and when translation occurs. mRNA localization to specific subcellular compartments enables precise spatial regulation of protein synthesis, ensuring proteins are produced at the right time and place to support cellular functions such as development and polarity. For instance, in animal embryos, maternally deposited mRNAs are localized and translationally repressed until fertilization, allowing coordinated activation to drive early developmental processes like axis formation in Drosophila or cell fate specification in vertebrates. This spatial control restricts translation to targeted sites, preventing ectopic protein production and enhancing efficiency in resource-limited environments.[144]

RNA-binding proteins (RBPs) play a central role in post-transcriptional regulation by modulating mRNA stability and translation through interactions with untranslated regions (UTRs). The RBP HuR binds to AU-rich elements (AREs) in the 3' UTRs of target mRNAs, promoting their stabilization and increasing protein output, as seen in the regulation of inflammatory cytokines like TNF-α, where HuR competes with destabilizing factors to extend mRNA half-life. In contrast, tristetraprolin (TTP) recognizes similar AREs to recruit decay machinery, accelerating mRNA degradation and suppressing excessive immune responses; for example, by destabilizing cytokine mRNAs such as TNF-α, TTP acts as a feedback inhibitor of inflammation, maintaining homeostasis by preventing overproduction. These antagonistic actions of HuR and TTP exemplify how RBPs achieve dynamic control over mRNA fate via UTR sequences.[145][146]

Biomolecular phase separation further contributes to localized mRNA regulation by forming membraneless condensates that compartmentalize mRNA-ribonucleoprotein (mRNP) complexes. RNAs and RBPs drive liquid-liquid phase separation to create dynamic droplets enriched in specific mRNAs, which sequester transcripts for localized control of translation and processing, as observed in cytoplasmic granules that buffer mRNA stoichiometries and restrict access to ribosomes. In human embryonic stem cells, for instance, FXR1-containing condensates spatially organize mRNPs to influence differentiation by concentrating regulatory RNAs and proteins. These condensates provide a scaffold for efficient, insulated reactions, enhancing spatiotemporal precision in gene expression.[147][148]

Feedback loops involving mRNAs that encode regulators of their own processing represent another layer of autoregulation, ensuring balanced expression of splicing factors and other RBPs. Splicing factors often bind their own pre-mRNAs to promote alternative splicing events that introduce premature stop codons, triggering nonsense-mediated decay (NMD) to autoregulate levels and prevent toxic accumulation, as demonstrated in networks involving SR proteins and hnRNPs. For example, RBPMS, a master regulator of smooth muscle splicing, engages in such loops to maintain homeostasis during cellular differentiation. These auto-regulatory mechanisms, including positive feedback with transcription factors, robustly coordinate post-transcriptional events during development.[149][150][151]

Non-Coding and Emerging Roles

Circular RNAs (circRNAs) derived from mRNA loci represent a major class of non-coding transcripts with regulatory functions distinct from protein synthesis. These circRNAs are produced via back-splicing of pre-mRNA exons, resulting in stable, closed-loop structures resistant to exonuclease degradation. A key example is ciRS-7 (also known as CDR1as), transcribed antisense to the CDR1 gene, which functions primarily as a microRNA (miRNA) sponge. ciRS-7 harbors more than 70 conserved binding sites for miR-7, sequestering the miRNA and derepressing its targets, such as those involved in neuronal function; this was first identified through high-throughput sequencing and functional assays in human and mouse brain tissues. Such sponging activity exemplifies how circRNAs can modulate post-transcriptional gene regulation without being translated into proteins.[152]

While predominantly non-coding, certain circRNAs exhibit protein-coding potential, challenging traditional classifications. Translation occurs via cap-independent mechanisms, including internal ribosome entry sites (IRES) or N6-methyladenosine (m6A) modifications that recruit ribosomes to the circular structure. For instance, circ-ZNF609, derived from the ZNF609 mRNA locus, encodes a short protein that promotes myoblast proliferation and differentiation during muscle development, as demonstrated by ribosome profiling and knockout studies in mouse models. This capability has been observed in a subset of circRNAs, where the encoded peptides regulate cellular processes independently of their linear counterparts. These diverse roles all trace back to back-splicing of linear pre-mRNA precursors.

Linear mRNAs also contribute non-coding functions by serving as scaffolds for protein complexes in signaling pathways, organizing ribonucleoprotein complexes for efficient cellular responses and extending mRNA utility beyond coding.

Emerging evidence positions mRNA export as a cellular stress sensor. Under conditions like heat shock, global mRNA export is inhibited through the inactivation and dissociation of export adaptors and guard proteins from the export receptor NXF1 (also known as TAP), for example via phosphorylation of Nab2 by the MAP kinase Slt2 in yeast, retaining bulk transcripts in the nucleus while permitting selective export of stress-inducible mRNAs (e.g., those encoding heat shock proteins). This adaptive response prioritizes survival gene expression and was elucidated through studies on nuclear retention dynamics in yeast and mammalian cells.[153][154] During viral infections, export can be modulated differently, often through viral interference with export factors.

mRNAs further participate in phase-separated organelles, membraneless compartments formed by liquid-liquid phase separation (LLPS). In stress granules and processing bodies (P-bodies), mRNAs act as scaffolds or modulators, recruiting RNA-binding proteins like G3BP1 to drive condensate assembly and sequester stalled translation initiation complexes. Specific mRNA secondary structures influence LLPS specificity, as shown in polyglutamine-driven systems where mRNA length and sequence dictate partitioning into droplets. This role enhances mRNA stability and regulates translation under stress, with implications for neurodegeneration.[155][156]

Distinguishing mRNAs from long non-coding RNAs (lncRNAs) relies on coding potential: mRNAs contain a coding sequence (CDS) typically encoding proteins of at least 100 amino acids, enabling ribosomal translation, whereas lncRNAs (>200 nucleotides) lack substantial open reading frames (ORFs) and primarily exert regulatory effects. Despite this, functional overlap exists, as some mRNAs perform lncRNA-like roles (e.g., scaffolding) without relying on their CDS, underscoring convergent evolutionary adaptations in RNA functionality.[157]
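The coding-potential criterion above can be sketched as a crude open-reading-frame scan. This is only a heuristic under stated assumptions: the 100-codon cutoff is taken from the text, only AUG-initiated ORFs that end in a stop codon on the given strand are counted, and the function names are hypothetical. Real annotation pipelines also weigh conservation, codon usage, and ribosome profiling evidence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_orf_codons(rna: str) -> int:
    """Length in codons (start included, stop excluded) of the longest AUG-to-stop ORF.

    Scans the three forward reading frames; ORFs lacking an in-frame stop are ignored.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def looks_protein_coding(rna: str, min_codons: int = 100) -> bool:
    """Heuristic: treat a transcript as coding if its longest ORF reaches min_codons."""
    return longest_orf_codons(rna) >= min_codons
```

A transcript whose longest ORF spans 121 codons would pass this filter, while one with only an 11-codon ORF would be classed as non-coding; short functional peptides encoded by nominally non-coding RNAs are exactly the cases such a cutoff misses.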

History and Applications

Historical Milestones

The concept of messenger RNA (mRNA) as an intermediary carrier of genetic information from DNA to protein synthesis was formally proposed in 1961 by François Jacob and Jacques Monod, who described it within their operon model of gene regulation in Escherichia coli, suggesting that mRNA serves as a transient template dictated by structural genes and regulated by operator regions. This theoretical framework was experimentally supported later that year by Sydney Brenner, Jacob, and Matthew Meselson, and independently by a team led by James Watson, whose experiments demonstrated the existence of a short-lived, rapidly turning-over RNA species in bacteria that associates with ribosomes to direct protein synthesis.

In the 1960s, the discovery of heterogeneous nuclear RNA (hnRNA) revealed large, rapidly labeled nuclear transcripts in eukaryotic cells, identified by Sheldon Penman and colleagues as polydisperse RNAs with sizes up to 100 kb, serving as precursors to mature mRNA. In 1977, Phillip A. Sharp and Richard J. Roberts independently uncovered split genes and RNA splicing while studying adenovirus transcripts, showing that non-coding introns are removed from pre-mRNA to join the coding exons into a continuous sequence, a finding that earned them the 1993 Nobel Prize in Physiology or Medicine. Their work demonstrated that eukaryotic genes are discontinuous, with splicing enabling diverse protein isoforms from single genes.[158]

The 1970s also saw the elucidation of mRNA modifications essential for stability and processing. Aaron J. Shatkin and Yasuhiro Furuichi identified the 5' cap structure in 1975, a 7-methylguanosine linked via a 5'-5' triphosphate bridge to the mRNA's first nucleotide, initially observed in reovirus mRNA and later confirmed in eukaryotic cellular mRNAs to protect against exonucleases and facilitate translation initiation. James E. Darnell and coworkers discovered the poly(A) tail in 1971, a 3' addition of 100–250 adenine residues to most eukaryotic mRNAs, and by the 1980s its roles in nuclear export, translation enhancement, and mRNA stability had been established through studies on HeLa cell transcripts.

During the 1990s and 2000s, advances in sequencing technologies enabled the cataloging of alternative splicing patterns, with B. R. Graveley and others compiling comprehensive databases from expressed sequence tags (ESTs), revealing that over 60% of human genes undergo alternative splicing to generate proteomic diversity, as detailed in early genome-wide analyses from the Human Genome Project era. The discovery of microRNAs (miRNAs) as key post-transcriptional regulators stemmed from Victor Ambros's identification of lin-4 in 1993, and RNA-mediated silencing gained mechanistic insight through Andrew Z. Fire and Craig C. Mello's 1998 experiments in C. elegans, which showed that double-stranded RNA triggers sequence-specific mRNA degradation via RNA interference (RNAi), for which they received the 2006 Nobel Prize in Physiology or Medicine.

In the 2010s, the epitranscriptome emerged as a dynamic layer of mRNA regulation, with N6-methyladenosine (m6A) modifications mapped transcriptome-wide; key studies by Dan Dominissini, Chuan He, and Samie R. Jaffrey in 2012–2013 identified m6A as the most abundant internal mRNA modification, influencing splicing, export, and decay through writer (e.g., METTL3), reader (e.g., YTHDF2), and eraser (e.g., FTO) proteins. Concurrently, CRISPR-based tools expanded to RNA targeting, with the characterization of Cas13 by Omar O. Abudayyeh, Feng Zhang, and colleagues in 2016–2017 enabling programmable cleavage and base editing of mRNAs without altering the genome, advancing targeted transcript modulation in eukaryotic systems.

Biotechnological and Therapeutic Uses

Messenger RNA (mRNA) has emerged as a versatile platform in biotechnology and therapeutics, enabling rapid production of antigens and proteins for vaccines and treatments. The most prominent application is in vaccines, where synthetic mRNA instructs cells to produce viral proteins, triggering immune responses without using live pathogens. This approach accelerated during the COVID-19 pandemic, with mRNA vaccines from Pfizer-BioNTech and Moderna receiving emergency authorization in late 2020. By 2025, over 13 billion doses of COVID-19 vaccines, including a significant number of mRNA-based doses, had been administered globally, significantly reducing severe illness and hospitalizations.[159] Self-amplifying mRNA vaccines, which encode replicase enzymes to amplify antigen production within cells, offer potential for lower dosing and longer-lasting immunity; for instance, ARCT-154 received authorization in Japan in 2023 and demonstrated superior persistence compared to conventional mRNA vaccines in phase 3 trials completed by 2025. By 2025, mRNA platforms had expanded to other respiratory viruses, with Moderna's mRNA-1345 receiving FDA approval for expanded use in preventing RSV lower respiratory tract disease in adults aged 18–59 at increased risk. Moderna's mRNA-1010 seasonal influenza vaccine also reported positive phase 3 relative vaccine efficacy data against influenza A and B strains in mid-2025, paving the way for potential regulatory approval.

In therapeutic applications, mRNA enables targeted protein expression for disease treatment. Personalized cancer immunotherapies represent a key advance, with BioNTech developing individualized mRNA vaccines that encode neoantigens derived from patient tumor mutations to stimulate T-cell responses. For example, autogene cevumeran (BNT122) has advanced to phase 2 trials for pancreatic and other cancers, showing durable T-cell activation in three-year follow-up data from phase 1 studies. mRNA also facilitates protein replacement therapies, particularly for ischemic conditions. AZD8601, an mRNA encoding VEGF-A delivered via intramyocardial injection, has been evaluated in clinical trials for patients with heart failure undergoing coronary artery bypass grafting (CABG), promoting angiogenesis to reduce myocardial ischemia in preclinical models and early human studies, with potential applications for refractory angina. These therapies leverage mRNA's transient expression to avoid long-term risks associated with gene therapy vectors.

mRNA also serves as a critical tool in research, where it is produced via in vitro transcription (IVT) for various applications. IVT mRNA encoding reporter proteins like luciferase or GFP is widely used in assays to monitor translation efficiency, mRNA stability, and cellular responses in high-throughput screens. In genome editing, IVT mRNA encoding the Cas9 nuclease is co-delivered with guide RNAs as an alternative to pre-assembled ribonucleoproteins, enabling precise editing with only transient nuclease expression and no genomic integration of the nuclease gene, though chemical modifications are often required to mitigate innate immune activation via RIG-I pathways. Synthetic biology employs mRNA circuits for engineering cellular behaviors, such as inducible protein expression in response to RNA-binding proteins or microRNAs, facilitating the construction of logic gates and metabolic pathways in mammalian cells.

Delivery and stability challenges have driven innovations essential to mRNA's success. Lipid nanoparticles (LNPs) encapsulate mRNA, shielding it from degradation and promoting endosomal escape for cytosolic delivery, as validated in the lipid formulations of approved COVID-19 vaccines. Incorporation of modified nucleosides, such as pseudouridine, further enhances performance by reducing recognition by Toll-like receptors and RIG-I, thereby minimizing inflammatory responses while boosting translation yields up to tenfold in human cells. These modifications, combined with optimized 5' caps and poly(A) tails in IVT processes, have enabled mRNA's transition from research reagent to scalable therapeutic modality.

References
