RNA splicing
RNA splicing is a process in molecular biology where a newly-made precursor messenger RNA (pre-mRNA) transcript is transformed into a mature messenger RNA (mRNA). It works by removing all the introns (non-coding regions of RNA) and splicing back together exons (coding regions). For nuclear-encoded genes, splicing occurs in the nucleus either during or immediately after transcription. For those eukaryotic genes that contain introns, splicing is usually needed to create an mRNA molecule that can be translated into protein. For many eukaryotic introns, splicing occurs in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs). Self-splicing introns also exist: these are ribozymes that can catalyze their own excision from their parent RNA molecule. Transcription, splicing and translation are together steps of gene expression, the flow of information described by the central dogma of molecular biology.

Splicing pathways
Several methods of RNA splicing occur in nature; the type of splicing depends on the structure of the spliced intron and the catalysts required for splicing to occur.
Spliceosomal complex
Introns
The word intron is derived from the terms intragenic region[1] and intracistron,[2] that is, a segment of DNA that is located between two exons of a gene. The term intron refers to both the DNA sequence within a gene and the corresponding sequence in the unprocessed RNA transcript. As part of the RNA processing pathway, introns are removed by RNA splicing either shortly after or concurrent with transcription.[3] Introns are found in the genes of most organisms and many viruses. They can be located in a wide range of genes, including those that generate proteins, ribosomal RNA (rRNA), and transfer RNA (tRNA).[4]
Within introns, a donor site (5' end of the intron), a branch site (near the 3' end of the intron) and an acceptor site (3' end of the intron) are required for splicing. The splice donor site includes an almost invariant sequence GU at the 5' end of the intron, within a larger, less highly conserved region. The splice acceptor site at the 3' end of the intron terminates the intron with an almost invariant AG sequence. Upstream (5'-ward) from the AG there is a region high in pyrimidines (C and U), or polypyrimidine tract. Further upstream from the polypyrimidine tract is the branchpoint, which includes an adenine nucleotide involved in lariat formation.[5][6] The consensus sequence for an intron (in IUPAC nucleic acid notation) is: G-G-[cut]-G-U-R-A-G-U (donor site) ... intron sequence ... Y-U-R-A-C (branch sequence 20-50 nucleotides upstream of acceptor site) ... Y-rich-N-C-A-G-[cut]-G (acceptor site).[7] However, it is noted that the specific sequence of intronic splicing elements and the number of nucleotides between the branchpoint and the nearest 3' acceptor site affect splice site selection.[8][9] Also, point mutations in the underlying DNA or errors during transcription can activate a cryptic splice site in part of the transcript that usually is not spliced. This results in a mature messenger RNA with a missing section of an exon. In this way, a point mutation, which might otherwise affect only a single amino acid, can manifest as a deletion or truncation in the final protein.[citation needed]
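
The three required elements described above (a GU donor, a Y-U-R-A-C branch sequence, and an AG acceptor) can be illustrated with a small scanner. This is only a sketch under simplifying assumptions: the function name `find_candidate_introns`, the exact-match regular expressions, and the choice to measure the 20-50 nucleotide window from the branchpoint A to the A of the acceptor AG are all illustrative, and real splice site prediction relies on weighted models rather than exact motifs.

```python
import re


def find_candidate_introns(rna, min_gap=20, max_gap=50):
    """Return (donor, branch_a, acceptor_end) index triples for candidate introns.

    donor        -- index of the G of the 5' GU dinucleotide
    branch_a     -- index of the branchpoint adenosine in Y-U-R-A-C
    acceptor_end -- index of the first nucleotide after the 3' AG
    """
    rna = rna.upper().replace("T", "U")
    # Lookaheads let overlapping motifs be reported as well.
    donors = [m.start() for m in re.finditer(r"(?=GU)", rna)]
    acceptors = [m.start() for m in re.finditer(r"(?=AG)", rna)]
    # Y = C or U, R = A or G; the branch A is the fourth base of the motif.
    branches = [m.start() + 3 for m in re.finditer(r"(?=[CU]U[AG]AC)", rna)]

    hits = []
    for ag in acceptors:
        for a in branches:
            # Assumed window: branch A lies 20-50 nt upstream of the AG.
            if not (min_gap <= ag - a <= max_gap):
                continue
            for d in donors:
                # The donor GU must lie entirely upstream of the branch motif.
                if d + 2 <= a - 3:
                    hits.append((d, a, ag + 2))
    return hits
```

Every GU upstream of a valid branch/acceptor pair is reported, so the output is a list of candidates rather than a prediction of which sites are actually used.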

Formation and activity
Splicing is catalyzed by the spliceosome, a large RNA-protein complex composed of five small nuclear ribonucleoproteins (snRNPs). Assembly and activity of the spliceosome occur during transcription of the pre-mRNA. The RNA components of snRNPs interact with the intron and are involved in catalysis. Two types of spliceosomes have been identified (major and minor), which contain different snRNPs.
- The major spliceosome splices introns containing GU at the 5' splice site and AG at the 3' splice site. It is composed of the U1, U2, U4, U5, and U6 snRNPs and is active in the nucleus. In addition, a number of proteins including U2 small nuclear RNA auxiliary factor 1 (U2AF35), U2AF2 (U2AF65)[10] and SF1 are required for the assembly of the spliceosome.[6][11] The spliceosome forms different complexes during the splicing process:[12]
  - Complex E
    - The U1 snRNP binds to the GU sequence at the 5' splice site of an intron;
    - Splicing factor 1 binds to the intron branch point sequence;
    - U2AF1 binds at the 3' splice site of the intron;
    - U2AF2 binds to the polypyrimidine tract;[13]
  - Complex A (pre-spliceosome)
    - The U2 snRNP displaces SF1 and binds to the branch point sequence, and ATP is hydrolyzed;
  - Complex B (pre-catalytic spliceosome)
    - The U5/U4/U6 snRNP trimer binds, and the U5 snRNP binds exons at the 5' site, with U6 binding to U2;
  - Complex B*
    - The U1 snRNP is released, U5 shifts from exon to intron, and U6 binds at the 5' splice site;
  - Complex C (catalytic spliceosome)
    - U4 is released and U6/U2 catalyzes transesterification: the 5' end of the intron is joined to the branchpoint A within the intron, the 5' site is cleaved, and a lariat forms; U5 binds the exon at the 3' splice site;
  - Complex C* (post-spliceosomal complex)
    - U2/U5/U6 remain bound to the lariat, and the 3' site is cleaved and exons are ligated using ATP hydrolysis. The spliced RNA is released, the lariat is released and degraded,[14] and the snRNPs are recycled.
- This type of splicing is termed canonical splicing or the lariat pathway, and it accounts for more than 99% of splicing. By contrast, when the intronic flanking sequences do not follow the GU-AG rule, noncanonical splicing is said to occur (see "minor spliceosome" below).[15]
- The minor spliceosome is very similar to the major spliceosome, but instead it splices out rare introns with different splice site sequences. While the minor and major spliceosomes contain the same U5 snRNP, the minor spliceosome has different but functionally analogous snRNPs for U1, U2, U4, and U6, which are respectively called U11, U12, U4atac, and U6atac.[16]
Recursive splicing
In most cases, splicing removes introns as single units from precursor mRNA transcripts. However, in some cases, especially in mRNAs with very long introns, splicing happens in steps: part of an intron is removed first, and the remaining intron is spliced out in a following step. This was first found in the Ultrabithorax (Ubx) gene of the fruit fly, Drosophila melanogaster, and a few other Drosophila genes, but cases in humans have been reported as well.[17][18]
Trans-splicing
Trans-splicing is a form of splicing that removes introns or outrons, and joins two exons that are not within the same RNA transcript.[19] Trans-splicing can occur between two different endogenous pre-mRNAs, or between an endogenous pre-mRNA and an exogenous (for example, viral) or artificial RNA.[20]
Self-splicing
Self-splicing occurs for rare introns that form a ribozyme, performing the functions of the spliceosome by RNA alone. There are three kinds of self-splicing introns: Group I, Group II and Group III. Group I and II introns perform splicing similar to the spliceosome without requiring any protein. This similarity suggests that Group I and II introns may be evolutionarily related to the spliceosome. Self-splicing may also be very ancient, and may have existed in an RNA world present before protein.[citation needed]
Two transesterifications characterize the mechanism in which group I introns are spliced:[citation needed]
- 3'OH of a free guanine nucleoside or a nucleotide cofactor (GMP, GDP, GTP) attacks phosphate at the 5' splice site.
- 3'OH of the 5' exon becomes a nucleophile and the second transesterification results in the joining of the two exons.
The mechanism in which group II introns are spliced (two transesterification reactions) is as follows:
- The 2'OH of a specific adenosine in the intron (also known as the "branchpoint") attacks the 5' splice site, thereby forming the lariat.
- The 3'OH of the 5' exon triggers the second transesterification at the 3' splice site, thereby joining the exons together.
Both group I and II introns utilize two magnesium ions in the catalytic core to catalyze the splicing reaction, the same catalytic mechanism used by the spliceosome.[21] Detailed structural characterization reveals that group II intron bears significant similarities to the spliceosome in terms of branchpoint adenosine recognition and structural dynamics during the two steps of transesterification.[22]
tRNA splicing
tRNA (also tRNA-like) splicing is another rare form of splicing that usually occurs in tRNA. The splicing reaction involves a different biochemistry than the spliceosomal and self-splicing pathways.
In the yeast Saccharomyces cerevisiae, a yeast tRNA splicing endonuclease heterotetramer, composed of TSEN54, TSEN2, TSEN34, and TSEN15, cleaves pre-tRNA at two sites in the acceptor loop to form a 5'-half tRNA, terminating at a 2',3'-cyclic phosphodiester group, and a 3'-half tRNA, terminating at a 5'-hydroxyl group, along with a discarded intron.[23] Yeast tRNA kinase then phosphorylates the 5'-hydroxyl group using adenosine triphosphate. Yeast tRNA cyclic phosphodiesterase cleaves the cyclic phosphodiester group to form a 2'-phosphorylated 3' end. Yeast tRNA ligase adds an adenosine monophosphate group to the 5' end of the 3'-half and joins the two halves together.[24] NAD-dependent 2'-phosphotransferase then removes the 2'-phosphate group.[25][26]
Evolution
Splicing occurs in all three domains of life; however, the extent and types of splicing can be very different between the major divisions. Eukaryotes splice many protein-coding messenger RNAs and some non-coding RNAs. Prokaryotes, on the other hand, splice non-coding RNAs. Another important difference between these two groups of organisms is that prokaryotes completely lack the spliceosomal pathway.
Because spliceosomal introns are not conserved in all species, there is debate concerning when spliceosomal splicing evolved. Two models have been proposed: the intron late and intron early models (see intron evolution).
| Splicing type | Eukaryotes | Prokaryotes |
|---|---|---|
| Spliceosomal | + | − |
| Self-splicing | + | + |
| tRNA | + | + |
Biochemical mechanism
Spliceosomal splicing and self-splicing involve a two-step biochemical process. Both steps involve transesterification reactions that occur between RNA nucleotides. tRNA splicing, however, is an exception and does not occur by transesterification.[27]
Spliceosomal splicing and self-splicing proceed through two sequential transesterification reactions. First, the 2'OH of a specific branchpoint nucleotide within the intron, defined during spliceosome assembly, performs a nucleophilic attack on the first nucleotide of the intron at the 5' splice site, forming the lariat intermediate. Second, the 3'OH of the released 5' exon performs a nucleophilic attack at the first nucleotide following the last nucleotide of the intron at the 3' splice site, thus joining the exons and releasing the intron lariat.[28]
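
The bookkeeping of these two steps can be mimicked on a plain string. The sketch below is a toy model, not a chemical simulation: the function name `splice`, the index conventions, and the "loop"/"tail" representation of the lariat are assumptions made for illustration.

```python
def splice(pre_mrna, donor, branch, acceptor_end):
    """Model the two transesterification steps on a sequence string.

    donor        -- index of the first intron nucleotide (5' splice site)
    branch       -- index of the branchpoint nucleotide inside the intron
    acceptor_end -- index of the first nucleotide after the intron
    Returns (mature_rna, lariat), where lariat holds the looped part of the
    intron (5' end through the branchpoint) and the linear 3' tail.
    """
    if not donor < branch < acceptor_end <= len(pre_mrna):
        raise ValueError("expected donor < branch < acceptor_end <= length")

    # Step 1: the branchpoint 2'-OH attacks the 5' splice site. The 5' exon
    # is released, and the intron's 5' end is now tied to the branchpoint.
    exon5 = pre_mrna[:donor]
    intermediate = pre_mrna[donor:]  # lariat intron still attached to 3' exon

    # Step 2: the 3'-OH of the 5' exon attacks the 3' splice site, joining
    # the exons and freeing the intron lariat.
    intron_len = acceptor_end - donor
    intron, exon3 = intermediate[:intron_len], intermediate[intron_len:]
    loop_len = branch - donor + 1
    lariat = {"loop": intron[:loop_len], "tail": intron[loop_len:]}
    return exon5 + exon3, lariat
```

For a transcript such as `"AAA" + "GUCCACCAG" + "UUU"` with the branchpoint at index 7, the call `splice(seq, 3, 7, 12)` joins the two flanking exons and reports which part of the intron is closed into the loop.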
Alternative splicing
In many cases, the splicing process can create a range of unique proteins by varying the exon composition of the same mRNA. This phenomenon is called alternative splicing. Alternative splicing can occur in many ways. Exons can be extended or skipped, or introns can be retained. It is estimated that 95% of transcripts from multiexon genes undergo alternative splicing, some instances of which occur in a tissue-specific manner and/or under specific cellular conditions.[29] High-throughput mRNA sequencing technology can help quantify the expression levels of alternatively spliced isoforms. Differential expression levels across tissues and cell lineages have allowed computational approaches to be developed to predict the functions of these isoforms.[30][31] Given this complexity, alternative splicing of pre-mRNA transcripts is regulated by a system of trans-acting proteins (activators and repressors) that bind to cis-acting sites or "elements" (enhancers and silencers) on the pre-mRNA transcript itself. These proteins and their respective binding elements promote or reduce the usage of a particular splice site. The binding specificity comes from the sequence and structure of the cis-elements; for example, HIV-1 has many donor and acceptor splice sites. Among the various splice sites, ssA7, which is a 3' acceptor site, folds into three stem loop structures, i.e. Intronic splicing silencer (ISS), Exonic splicing enhancer (ESE), and Exonic splicing silencer (ESSE3). The solution structure of the Intronic splicing silencer and its interaction with the host protein hnRNPA1 give insight into specific recognition.[32] However, adding to the complexity of alternative splicing, the effects of regulatory factors are often position-dependent. 
For example, a splicing factor that serves as a splicing activator when bound to an intronic enhancer element may serve as a repressor when bound to its splicing element in the context of an exon, and vice versa.[33] In addition to the position-dependent effects of enhancer and silencer elements, the location of the branchpoint (i.e., distance upstream of the nearest 3' acceptor site) also affects splicing.[8] The secondary structure of the pre-mRNA transcript also plays a role in regulating splicing, such as by bringing together splicing elements or by masking a sequence that would otherwise serve as a binding element for a splicing factor.[34][35]
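
To make the idea of varying exon composition concrete, the following sketch enumerates the isoforms produced when some exons are optional (cassette exons). It covers exon skipping only, not exon extension or intron retention, and the function name `enumerate_isoforms` and the toy exon names are hypothetical.

```python
from itertools import product


def enumerate_isoforms(exons, cassette):
    """Enumerate isoforms produced by skipping any subset of cassette exons.

    exons    -- ordered list of (name, sequence) pairs
    cassette -- set of exon names that may be skipped
    Returns a dict mapping an isoform label such as "E1-E3" to its sequence.
    """
    optional = [name for name, _ in exons if name in cassette]
    isoforms = {}
    # Each cassette exon is independently included (True) or skipped (False).
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        # Constitutive exons are always kept; exon order is never changed.
        kept = [(n, s) for n, s in exons if included.get(n, True)]
        label = "-".join(n for n, _ in kept)
        isoforms[label] = "".join(s for _, s in kept)
    return isoforms
```

With n cassette exons this yields 2^n isoforms, which shows how a few optional exons can already generate many distinct transcripts; regulation by enhancers and silencers determines which of them are actually produced.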
Role of nuclear speckles in RNA splicing
Pre-mRNA splicing takes place throughout the nucleus, and once mature mRNA is generated, it is transported to the cytoplasm for translation. In both plant and animal cells, nuclear speckles are regions with high concentrations of splicing factors. These speckles were once thought to be mere storage centers for splicing factors. However, it is now understood that nuclear speckles help concentrate splicing factors near genes that are physically located close to them. Genes located farther from speckles can still be transcribed and spliced, but their splicing is less efficient compared to those closer to speckles. Cells can vary the genomic positions of genes relative to nuclear speckles as a mechanism to modulate the expression of genes via splicing.[36]
Role of splicing/alternative splicing in HIV-integration
The process of splicing is linked with HIV integration, as HIV-1 targets highly spliced genes.[37]
Splicing response to DNA damage
DNA damage affects splicing factors by altering their post-translational modification, localization, expression and activity.[38] Furthermore, DNA damage often disrupts splicing by interfering with its coupling to transcription. DNA damage also has an impact on the splicing and alternative splicing of genes intimately associated with DNA repair.[38] For instance, DNA damage modulates the alternative splicing of the DNA repair genes Brca1 and Ercc1.
Experimental manipulation of splicing
Splicing events can be experimentally altered[39][40] by binding steric-blocking antisense oligos, such as Morpholinos or peptide nucleic acids, to snRNP binding sites, to the branchpoint nucleotide that closes the lariat,[41] or to splice-regulatory element binding sites.[42]
The use of antisense oligonucleotides to modulate splicing has shown great promise as a therapeutic strategy for a variety of genetic diseases caused by splicing defects.[43]
Recent studies have shown that RNA splicing can be regulated by a variety of epigenetic modifications, including DNA methylation and histone modifications.[44]
Splicing errors and variation
It has been suggested that one third of all disease-causing mutations affect splicing.[33] Common errors include:
- Mutation of a splice site resulting in loss of function of that site. This results in exposure of a premature stop codon, loss of an exon, or inclusion of an intron.
- Mutation of a splice site reducing specificity. May result in variation in the splice location, causing insertion or deletion of amino acids, or most likely, a disruption of the reading frame.
- Displacement of a splice site, leading to inclusion or exclusion of more RNA than expected, resulting in longer or shorter exons.
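
The link between exon loss and reading-frame disruption in the list above can be checked with simple arithmetic: skipping an exon whose length is not a multiple of three shifts the frame of everything downstream, which often exposes a premature stop codon. The sketch below uses hypothetical helper names and a made-up three-exon coding sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def skipping_consequence(exon_lengths, skipped):
    """Classify the loss of one exon by its effect on the reading frame."""
    if exon_lengths[skipped] % 3 == 0:
        return "in-frame deletion"
    return "frameshift"


def first_stop(cds):
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy coding exons (assumed for illustration): lengths 6, 4 and 11.
exons = ["AUGGCU", "GCAG", "UGAUCGUUUAA"]
normal = "".join(exons)
skipped = exons[0] + exons[2]  # the middle exon is lost
```

In this toy case `first_stop(normal)` finds the stop at the end of the transcript, while `first_stop(skipped)` finds one much earlier, the kind of premature termination codon that nonsense-mediated decay is expected to detect.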
Although many splicing errors are safeguarded by a cellular quality control mechanism termed nonsense-mediated mRNA decay (NMD),[45] a number of splicing-related diseases also exist, as suggested above.[46]
Allelic differences in mRNA splicing are likely to be a common and important source of phenotypic diversity at the molecular level, in addition to their contribution to genetic disease susceptibility. Indeed, genome-wide studies in humans have identified a range of genes that are subject to allele-specific splicing.
In plants, variation for flooding stress tolerance correlated with stress-induced alternative splicing of transcripts associated with gluconeogenesis and other processes.[47]
Protein splicing
In addition to RNA, proteins can undergo splicing. Although the biomolecular mechanisms are different, the principle is the same: parts of the protein, called inteins instead of introns, are removed. The remaining parts, called exteins instead of exons, are fused together. Protein splicing has been observed in a wide range of organisms, including bacteria, archaea, plants, yeast and humans.[48]
Splicing and genesis of circRNAs
The existence of backsplicing was first suggested in 2012.[49] Backsplicing explains the genesis of circular RNAs, which result from the exact junction between the 3' boundary of an exon and the 5' boundary of an exon located upstream.[50] In these exonic circular RNAs, the junction is a classic 3'-5' link.
The exclusion of intronic sequences during splicing can also leave traces, in the form of circular RNAs.[51] In some cases, the intronic lariat is not destroyed and the circular part remains as a lariat-derived circRNA.[52] In these lariat-derived circular RNAs, the junction is a 2'-5' link.
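
The defining feature of an exonic circular RNA, a junction that joins the end of a downstream exon to the start of an upstream one, can be sketched as follows. The function name `backsplice_junction`, the `flank` parameter and the toy exons are assumptions for illustration; the sketch ignores any introns retained in the circle.

```python
def backsplice_junction(exons, donor_exon, acceptor_exon, flank=5):
    """Build the circle and junction sequence produced by a backsplice.

    exons         -- ordered list of exon sequences
    donor_exon    -- index of the exon whose 3' boundary is joined
    acceptor_exon -- index of the upstream exon whose 5' boundary is joined
    Returns (circle, junction): the exon sequence contained in the circle,
    written starting at the acceptor exon, and the sequence spanning the
    backsplice junction (flank nucleotides on each side).
    """
    if acceptor_exon > donor_exon:
        raise ValueError("a backsplice joins a downstream 3' end to an upstream 5' end")
    circle = "".join(exons[acceptor_exon:donor_exon + 1])
    # The order is inverted relative to the genome: downstream end first.
    junction = exons[donor_exon][-flank:] + exons[acceptor_exon][:flank]
    return circle, junction
```

Because the junction sequence is out of genomic order, it is normally absent from the linear transcript, which is why reads spanning such junctions are commonly used as evidence for circular RNAs.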
References
- ^ Gilbert W (February 1978). "Why genes in pieces?". Nature. 271 (5645): 501. Bibcode:1978Natur.271..501G. doi:10.1038/271501a0. PMID 622185. S2CID 4216649.
- ^ Tonegawa S, Maxam AM, Tizard R, Bernard O, Gilbert W (March 1978). "Sequence of a mouse germ-line gene for a variable region of an immunoglobulin light chain". Proceedings of the National Academy of Sciences of the United States of America. 75 (3): 1485–1489. Bibcode:1978PNAS...75.1485T. doi:10.1073/pnas.75.3.1485. PMC 411497. PMID 418414.
- ^ Tilgner H, Knowles DG, Johnson R, Davis CA, Chakrabortty S, Djebali S, et al. (September 2012). "Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs". Genome Research. 22 (9): 1616–1625. doi:10.1101/gr.134445.111. PMC 3431479. PMID 22955974.
- ^ Roy SW, Gilbert W (March 2006). "The evolution of spliceosomal introns: patterns, puzzles and progress". Nature Reviews. Genetics. 7 (3): 211–221. doi:10.1038/nrg1807. PMID 16485020. S2CID 33672491.
- ^ Clancy S (2008). "RNA Splicing: Introns, Exons and Spliceosome". Nature Education. 1 (1): 31. Archived from the original on 15 March 2011. Retrieved 31 March 2011.
- ^ a b Black DL (June 2003). "Mechanisms of alternative pre-messenger RNA splicing". Annual Review of Biochemistry. 72 (1): 291–336. doi:10.1146/annurev.biochem.72.121801.161720. PMID 12626338. S2CID 23576288.
- ^ "Molecular Biology of the Cell". 2012 Journal Citation Reports. Web of Science (Science ed.). Thomson Reuters. 2013.
- ^ a b Taggart AJ, DeSimone AM, Shih JS, Filloux ME, Fairbrother WG (June 2012). "Large-scale mapping of branchpoints in human pre-mRNA transcripts in vivo". Nature Structural & Molecular Biology. 19 (7): 719–721. doi:10.1038/nsmb.2327. PMC 3465671. PMID 22705790.
- ^ Corvelo A, Hallegger M, Smith CW, Eyras E (November 2010). Meyer IM (ed.). "Genome-wide association between branch point properties and alternative splicing". PLOS Computational Biology. 6 (11) e1001016. Bibcode:2010PLSCB...6E1016C. doi:10.1371/journal.pcbi.1001016. PMC 2991248. PMID 21124863.
- ^ Graveley BR, Hertel KJ, Maniatis T (June 2001). "The role of U2AF35 and U2AF65 in enhancer-dependent splicing". RNA. 7 (6) S1355838201010317: 806–818. doi:10.1017/s1355838201010317. PMC 1370132. PMID 11421359. Archived from the original on 2018-11-20. Retrieved 2014-12-17.
- ^ Matlin AJ, Clark F, Smith CW (May 2005). "Understanding alternative splicing: towards a cellular code". Nature Reviews. Molecular Cell Biology. 6 (5): 386–398. doi:10.1038/nrm1645. PMID 15956978. S2CID 14883495.
- ^ Matera AG, Wang Z (February 2014). "A day in the life of the spliceosome". Nature Reviews. Molecular Cell Biology. 15 (2): 108–121. doi:10.1038/nrm3742. PMC 4060434. PMID 24452469.
- ^ Guth S, Valcárcel J (December 2000). "Kinetic role for mammalian SF1/BBP in spliceosome assembly and function after polypyrimidine tract recognition by U2AF". The Journal of Biological Chemistry. 275 (48): 38059–38066. doi:10.1074/jbc.M001483200. PMID 10954700.
- ^ Cheng Z, Menees TM (December 2011). "RNA splicing and debranching viewed through analysis of RNA lariats". Molecular Genetics and Genomics. 286 (5–6): 395–410. doi:10.1007/s00438-011-0635-y. PMID 22065066. S2CID 846297.
- ^ Ng B, Yang F, Huston DP, Yan Y, Yang Y, Xiong Z, et al. (December 2004). "Increased noncanonical splicing of autoantigen transcripts provides the structural basis for expression of untolerized epitopes". The Journal of Allergy and Clinical Immunology. 114 (6): 1463–1470. doi:10.1016/j.jaci.2004.09.006. PMC 3902068. PMID 15577853.
- ^ Patel AA, Steitz JA (December 2003). "Splicing double: insights from the second spliceosome". Nature Reviews. Molecular Cell Biology. 4 (12): 960–970. doi:10.1038/nrm1259. PMID 14685174. S2CID 21816910.
- ^ Sibley CR, Emmett W, Blazquez L, Faro A, Haberman N, Briese M, et al. (May 2015). "Recursive splicing in long vertebrate genes". Nature. 521 (7552): 371–375. Bibcode:2015Natur.521..371S. doi:10.1038/nature14466. PMC 4471124. PMID 25970246.
- ^ Duff MO, Olson S, Wei X, Garrett SC, Osman A, Bolisetty M, et al. (May 2015). "Genome-wide identification of zero nucleotide recursive splicing in Drosophila". Nature. 521 (7552): 376–379. Bibcode:2015Natur.521..376D. doi:10.1038/nature14475. PMC 4529404. PMID 25970244.
- ^ Di Segni G, Gastaldi S, Tocchini-Valentini GP (May 2008). "Cis- and trans-splicing of mRNAs mediated by tRNA sequences in eukaryotic cells". Proceedings of the National Academy of Sciences of the United States of America. 105 (19): 6864–6869. Bibcode:2008PNAS..105.6864D. doi:10.1073/pnas.0800420105. JSTOR 25461891. PMC 2383978. PMID 18458335.
- ^ Eul J, Patzel V (November 2013). "Homologous SV40 RNA trans-splicing: a new mechanism for diversification of viral sequences and phenotypes". RNA Biology. 10 (11): 1689–1699. doi:10.4161/rna.26707. PMC 3907479. PMID 24178438.
- ^ Steitz, T. A.; Steitz, J. A. (1993-07-15). "A general two-metal-ion mechanism for catalytic RNA". Proceedings of the National Academy of Sciences of the United States of America. 90 (14): 6498–6502. Bibcode:1993PNAS...90.6498S. doi:10.1073/pnas.90.14.6498. ISSN 0027-8424. PMC 46959. PMID 8341661.
- ^ Xu, Ling; Liu, Tianshuo; Chung, Kevin; Pyle, Anna Marie (2023-12-21). "Structural insights into intron catalysis and dynamics during splicing". Nature. 624 (7992): 682–688. Bibcode:2023Natur.624..682X. doi:10.1038/s41586-023-06746-6. ISSN 0028-0836. PMC 10733145. PMID 37993708.
- ^ Trotta CR, Miao F, Arn EA, Stevens SW, Ho CK, Rauhut R, Abelson JN (June 1997). "The yeast tRNA splicing endonuclease: a tetrameric enzyme with two active site subunits homologous to the archaeal tRNA endonucleases". Cell. 89 (6): 849–858. doi:10.1016/S0092-8674(00)80270-6. PMID 9200603. S2CID 16055381.
- ^ Westaway SK, Phizicky EM, Abelson J (March 1988). "Structure and function of the yeast tRNA ligase gene". The Journal of Biological Chemistry. 263 (7): 3171–3176. doi:10.1016/S0021-9258(18)69050-7. PMID 3277966.
- ^ Paushkin SV, Patel M, Furia BS, Peltz SW, Trotta CR (April 2004). "Identification of a human endonuclease complex reveals a link between tRNA splicing and pre-mRNA 3' end formation". Cell. 117 (3): 311–321. doi:10.1016/S0092-8674(04)00342-3. PMID 15109492. S2CID 16049289.
- ^ Soma A (1 April 2014). "Circularly permuted tRNA genes: their expression and implications for their physiological relevance and development". Frontiers in Genetics. 5: 63. doi:10.3389/fgene.2014.00063. PMC 3978253. PMID 24744771.
- ^ Abelson J, Trotta CR, Li H (May 1998). "tRNA splicing". The Journal of Biological Chemistry. 273 (21): 12685–12688. doi:10.1074/jbc.273.21.12685. PMID 9582290.
- ^ Fica SM, Tuttle N, Novak T, Li NS, Lu J, Koodathingal P, et al. (November 2013). "RNA catalyses nuclear pre-mRNA splicing". Nature. 503 (7475): 229–234. Bibcode:2013Natur.503..229F. doi:10.1038/nature12734. PMC 4666680. PMID 24196718.
- ^ Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ (December 2008). "Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing". Nature Genetics. 40 (12): 1413–1415. doi:10.1038/ng.259. PMID 18978789. S2CID 9228930.
- ^ Eksi R, Li HD, Menon R, Wen Y, Omenn GS, Kretzler M, Guan Y (Nov 2013). "Systematically differentiating functions for alternatively spliced isoforms through integrating RNA-seq data". PLOS Computational Biology. 9 (11) e1003314. Bibcode:2013PLSCB...9E3314E. doi:10.1371/journal.pcbi.1003314. PMC 3820534. PMID 24244129.
- ^ Li HD, Menon R, Omenn GS, Guan Y (August 2014). "The emerging era of genomic data integration for analyzing splice isoform function". Trends in Genetics. 30 (8): 340–347. doi:10.1016/j.tig.2014.05.005. PMC 4112133. PMID 24951248.
- ^ Jain N, Morgan CE, Rife BD, Salemi M, Tolbert BS (January 2016). "Solution Structure of the HIV-1 Intron Splicing Silencer and Its Interactions with the UP1 Domain of Heterogeneous Nuclear Ribonucleoprotein (hnRNP) A1". The Journal of Biological Chemistry. 291 (5): 2331–2344. doi:10.1074/jbc.M115.674564. PMC 4732216. PMID 26607354.
- ^ a b Lim KH, Ferraris L, Filloux ME, Raphael BJ, Fairbrother WG (July 2011). "Using positional distribution to identify splicing elements and predict pre-mRNA processing defects in human genes". Proceedings of the National Academy of Sciences of the United States of America. 108 (27): 11093–11098. Bibcode:2011PNAS..10811093H. doi:10.1073/pnas.1101135108. PMC 3131313. PMID 21685335.
- ^ Warf MB, Berglund JA (March 2010). "Role of RNA structure in regulating pre-mRNA splicing". Trends in Biochemical Sciences. 35 (3): 169–178. doi:10.1016/j.tibs.2009.10.004. PMC 2834840. PMID 19959365.
- ^ Reid DC, Chang BL, Gunderson SI, Alpert L, Thompson WA, Fairbrother WG (December 2009). "Next-generation SELEX identifies sequence and structural determinants of splicing factor binding in human pre-mRNA sequence". RNA. 15 (12): 2385–2397. doi:10.1261/rna.1821809. PMC 2779669. PMID 19861426.
- ^ Bhat P, Chow A, Emert B, et al. (May 2024). "Genome organization around nuclear speckles drives mRNA splicing efficiency". Nature. 629 (8014): 1165–1173. Bibcode:2024Natur.629.1165B. doi:10.1038/s41586-024-07429-6. PMC 11164319. PMID 38720076.
- ^ Singh PK, Plumb MR, Ferris AL, Iben JR, Wu X, Fadel HJ, et al. (November 2015). "LEDGF/p75 interacts with mRNA splicing factors and targets HIV-1 integration to highly spliced genes". Genes & Development. 29 (21): 2287–2297. doi:10.1101/gad.267609.115. PMC 4647561. PMID 26545813.
- ^ a b Shkreta L, Chabot B (October 2015). "The RNA Splicing Response to DNA Damage". Biomolecules. 5 (4): 2935–2977. doi:10.3390/biom5042935. PMC 4693264. PMID 26529031.
- ^ Draper BW, Morcos PA, Kimmel CB (July 2001). "Inhibition of zebrafish fgf8 pre-mRNA splicing with morpholino oligos: a quantifiable method for gene knockdown". Genesis. 30 (3): 154–156. doi:10.1002/gene.1053. PMID 11477696. S2CID 32270393.
- ^ Sazani P, Kang SH, Maier MA, Wei C, Dillman J, Summerton J, et al. (October 2001). "Nuclear antisense effects of neutral, anionic and cationic oligonucleotide analogs". Nucleic Acids Research. 29 (19): 3965–3974. doi:10.1093/nar/29.19.3965. PMC 60237. PMID 11574678.
- ^ Morcos PA (June 2007). "Achieving targeted and quantifiable alteration of mRNA splicing with Morpholino oligos". Biochemical and Biophysical Research Communications. 358 (2): 521–527. Bibcode:2007BBRC..358..521M. doi:10.1016/j.bbrc.2007.04.172. PMID 17493584.
- ^ Bruno IG, Jin W, Cote GJ (October 2004). "Correction of aberrant FGFR1 alternative RNA splicing through targeting of intronic regulatory elements". Human Molecular Genetics. 13 (20): 2409–2420. doi:10.1093/hmg/ddh272. PMID 15333583.
- ^ Fu XD, Ares M (October 2014). "Context-dependent control of alternative splicing by RNA-binding proteins". Nature Reviews. Genetics. 15 (10): 689–701. doi:10.1038/nrg3778. PMC 4440546. PMID 25112293.
- ^ Fu XD, Ares M (October 2014). "Context-dependent control of alternative splicing by RNA-binding proteins". Nature Reviews. Genetics. 15 (10): 689–701. doi:10.1038/nrg3778. PMC 4440546. PMID 25112293.
- ^ Danckwardt S, Neu-Yilik G, Thermann R, Frede U, Hentze MW, Kulozik AE (March 2002). "Abnormally spliced beta-globin mRNAs: a single point mutation generates transcripts sensitive and insensitive to nonsense-mediated mRNA decay". Blood. 99 (5): 1811–1816. doi:10.1182/blood.V99.5.1811. PMID 11861299. S2CID 17128174.
- ^ Ward AJ, Cooper TA (January 2010). "The pathobiology of splicing". The Journal of Pathology. 220 (2): 152–163. doi:10.1002/path.2649. PMC 2855871. PMID 19918805.
- ^ van Veen H, Vashisht D, Akman M, Girke T, Mustroph A, Reinen E, et al. (October 2016). "Transcriptomes of Eight Arabidopsis thaliana Accessions Reveal Core Conserved, Genotype- and Organ-Specific Responses to Flooding Stress". Plant Physiology. 172 (2): 668–689. doi:10.1104/pp.16.00472. PMC 5047075. PMID 27208254.
- ^ Hanada K, Yang JC (June 2005). "Novel biochemistry: post-translational protein splicing and other lessons from the school of antigen processing". Journal of Molecular Medicine. 83 (6): 420–428. doi:10.1007/s00109-005-0652-6. PMID 15759099. S2CID 37698110.
- ^ Salzman J, Gawad C, Wang PL, et al. Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLoS One 2012;7(2):e30733.
- ^ Jeck WR, Sorrentino JA, Wang K, et al. Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA 2013;19(2):141-57.
- ^ Zhang Y, Zhang XO, Chen T, et al. Circular intronic long noncoding RNAs. Molecular cell 2013;51(6):792-806.
- ^ Talhouarne GJ and Gall JG. Lariat intronic RNAs in the cytoplasm of Xenopus tropicalis oocytes. RNA 2014;20(9):1476-87.
External links
- Virtual Cell Animation Collection: mRNA Splicing
- RNA Splicing at the U.S. National Library of Medicine Medical Subject Headings (MeSH)
RNA splicing
Overview
Definition and process
RNA splicing is a critical post-transcriptional modification process in which non-coding sequences known as introns are removed from a precursor messenger RNA (pre-mRNA) transcript, and the remaining coding sequences, called exons, are precisely joined together to produce a mature mRNA ready for translation into protein.[1] This process ensures that only the functional coding information is retained, allowing for the accurate expression of genes in eukaryotic cells.[11]
The basic mechanism of RNA splicing involves two sequential transesterification reactions. First, specific splice sites are recognized: the 5' splice site typically begins with a GU dinucleotide, the branch point features an adenosine residue located 20–50 nucleotides upstream of the 3' splice site, and the 3' splice site ends with an AG dinucleotide. In the initial step, the 2'-OH group of the branch point adenosine performs a nucleophilic attack on the phosphodiester bond at the 5' splice site, cleaving the 5' exon and forming a lariat intermediate where the intron is looped via a 2'-5' phosphodiester bond to the branch point.[12] The second step involves the 3'-OH of the freed 5' exon attacking the phosphodiester bond at the 3' splice site, ligating the two exons and releasing the intron lariat.[13]
RNA splicing is a universal process observed across eukaryotes, archaea, and certain bacteria, though its prevalence and machinery vary. In higher eukaryotes like humans, introns are present in over 97% of protein-coding genes, often comprising the majority of transcript length and making splicing indispensable for proper gene expression.[14] Unlike other RNA processing events such as 5' capping or 3' polyadenylation, which primarily stabilize the mRNA, splicing directly alters the coding sequence by selecting and joining exons, thereby determining the final protein isoform.[15]
Historical discovery
The discovery of RNA splicing began in 1977 when researchers independently identified discontinuous gene structures in adenovirus, revealing that eukaryotic genes are composed of interrupted coding sequences separated by non-coding regions, later termed introns. Phillip A. Sharp's team at the Massachusetts Institute of Technology used electron microscopy to visualize hybrid molecules of adenovirus mRNA annealed to viral DNA, showing that mRNA sequences were spliced from separate genomic segments.[16] Concurrently, Richard J. Roberts's group at Cold Spring Harbor Laboratory employed similar techniques to map cytoplasmic poly(A)+ RNA transcripts from adenovirus type 2, demonstrating collinear but non-contiguous alignment with the genome, thus establishing the concept of split genes.[17] This breakthrough challenged the prevailing view of continuous genes and laid the foundation for understanding pre-mRNA processing; Sharp and Roberts were awarded the 1993 Nobel Prize in Physiology or Medicine for their contributions.[18]
In the early 1980s, investigations into the machinery of splicing advanced significantly. Michael R. Lerner and Joan A. Steitz at Yale University proposed that small nuclear ribonucleoproteins (snRNPs), recently identified as abundant nuclear particles containing small nuclear RNAs (snRNAs), play a central role in pre-mRNA splicing, based on their ability to bind specifically to intron sequences via immunoprecipitation assays with autoimmune sera. This work introduced the spliceosome as a dynamic ribonucleoprotein complex mediating splicing in higher eukaryotes. Independently, Thomas R. Cech's laboratory at the University of Colorado discovered self-splicing in the ribosomal RNA intron of Tetrahymena thermophila, where the RNA itself catalyzed its excision without protein assistance, providing the first evidence of RNA's enzymatic activity and expanding the catalytic repertoire of RNA molecules.[19] Cech and Sidney Altman, who identified catalytic RNA in RNase P, shared the 1989 Nobel Prize in Chemistry for these discoveries.[20]
During the 1990s and 2000s, detailed mapping of spliceosome components emerged, particularly in humans, through proteomic and genetic approaches. Comprehensive analyses identified over 300 proteins associated with the human spliceosome, including many novel factors involved in assembly and catalysis, using tandem affinity purification and mass spectrometry on yeast and human models. The Human Genome Project further highlighted splicing's prevalence, estimating that alternative splicing affects approximately 60% of human genes based on expressed sequence tag (EST) alignments to the draft genome, underscoring its role in proteomic diversity.
In the 2010s, structural biology revolutionized splicing research with cryo-electron microscopy (cryo-EM) providing atomic-level insights into spliceosome dynamics. Kiyoshi Nagai's group at the MRC Laboratory of Molecular Biology resolved the structure of the yeast spliceosome immediately after the branching step at 3.8 Å resolution, revealing key conformational changes and interactions in the catalytic core.
Entering the 2020s, long-read sequencing technologies, such as Pacific Biosciences and Oxford Nanopore, unveiled unprecedented splicing complexity in human transcriptomes, identifying thousands of novel isoforms and tissue-specific events that short-read methods overlooked, thus refining estimates of splicing diversity across cell types.[21] In 2024, researchers published the first comprehensive blueprint of the human spliceosome, identifying its core composition of approximately 150 proteins with specialized regulatory functions, further advancing insights into splicing mechanisms and potential therapeutic targets.[22]
Types of Splicing Pathways
Spliceosomal splicing
Spliceosomal splicing is the predominant mechanism for intron removal from nuclear pre-mRNA in eukaryotic cells, carried out by the spliceosome, a large ribonucleoprotein complex that assembles de novo on each intron. This process ensures the production of mature mRNA by excising non-coding introns and ligating coding exons, with the major spliceosome handling the vast majority of introns in most eukaryotes.[23]
The spliceosome comprises small nuclear ribonucleoproteins (snRNPs) and numerous associated proteins, enabling precise recognition and catalysis. The major spliceosome includes four key snRNPs: U1, U2, U4/U6 (a di-snRNP), and U5, each containing a uridine-rich small nuclear RNA (snRNA) bound to specific proteins. These components recognize conserved splice site sequences at intron boundaries and facilitate the splicing reaction.[23] In contrast, the minor spliceosome processes a small subset of atypical U12-dependent introns using analogous but distinct snRNPs: U11, U12, U4atac/U6atac, and U5.[24]
Assembly of the spliceosome proceeds through a series of dynamic, stepwise complexes on the pre-mRNA substrate. It begins with the commitment complex (E complex), where U1 snRNP binds the 5' splice site and U2 auxiliary factors associate with the branch point sequence, followed by U2 snRNP binding to form the pre-spliceosome (A complex).[25] The tri-snRNP (U4/U6·U5) then joins to create the pre-catalytic B complex, which rearranges to the activated B* complex and ultimately the C complex for intron excision. This ordered recruitment ensures fidelity, with rearrangements driven by ATP-dependent helicases and protein factors.[25]
Two primary models describe how splice sites are recognized during assembly: intron definition and exon definition. In the intron definition model, prevalent in organisms with short introns like yeast, the spliceosome initially pairs the 5' and 3' splice sites across the intron.[26] Conversely, the exon definition model, common in vertebrates with longer introns, involves initial recognition across the exon, where U1 and U2 snRNPs bind opposing splice sites flanking the exon, facilitating cross-exon interactions before intron removal.[27] These models reflect adaptations to genomic architecture, with consensus sequences at splice sites playing a brief role in initial binding.[26]
Most spliceosomal introns are U2-dependent, recognized by the major spliceosome, while U12-dependent introns, comprising about 0.35% of human introns, require the minor spliceosome and often feature AU-AC termini instead of the typical GU-AG.[24] In the human genome, introns average around 3 kb in length, vastly exceeding the typical exon size of about 145 nucleotides, which contributes to the complexity of accurate splicing.[28]
Trans-splicing represents a specialized variant of spliceosomal splicing in certain eukaryotes, where a short leader sequence from one RNA molecule is joined to the 5' end of an independent pre-mRNA exon, rather than ligating exons from the same transcript.[29] This process, mediated by similar snRNPs as cis-splicing, occurs prominently in trypanosomes, where it adds a spliced leader to all mRNAs to resolve polycistronic transcripts, and in Caenorhabditis elegans, affecting about 70% of genes to add either SL1 or SL2 leaders.[30] Though rare in vertebrates, it highlights the spliceosome's versatility.[29]
For exceptionally long introns, recursive splicing provides a mechanism to subdivide removal into multiple steps, using internal "ratchet" sites that mimic 3' splice sites. In Drosophila melanogaster, where introns can exceed 50 kb, this stepwise process enhances splicing accuracy by iteratively excising portions, as seen in the 74-kb ultrabithorax intron.[31] Recursive sites are enriched and conserved in long introns, preventing aberrant splicing and maintaining efficiency.[32]
Self-splicing
Self-splicing refers to a form of RNA splicing in which the intron excises itself from the precursor RNA through ribozyme activity, independent of protein enzymes. This process was first demonstrated in 1982 with the ribosomal RNA precursor from the ciliate Tetrahymena thermophila, where the 413-nucleotide intervening sequence (IVS) was shown to autocatalytically excise and circularize under in vitro conditions mimicking physiological ionic strength, without requiring additional factors beyond a guanosine cofactor.[33]
Group I introns are the most extensively studied class of self-splicing introns, characterized by a conserved secondary structure featuring paired helices and an internal guide sequence that aligns the 5' splice site with the 3' hydroxyl of a guanosine cofactor. These introns are predominantly found in organellar genomes (mitochondria and chloroplasts), ribosomal RNA genes of protists and fungi, and bacteriophage genomes, with prokaryotic origins suggesting horizontal transfer to eukaryotic organelles. The splicing mechanism proceeds via two transesterification reactions: first, an exogenous guanosine (or GTP/GMP) attacks the 5' splice site, cleaving the upstream exon and attaching to the intron's 5' end; second, the newly freed 3' hydroxyl of the upstream exon attacks the 3' splice site, forming the ligated exons and releasing the linear intron, which can subsequently circularize through attack by its own 3'-terminal guanosine. This guanosine-dependent pathway requires divalent metal ions like Mg²⁺ for catalysis and is highly efficient in vitro, with rate constants approaching physiological speeds.[34][35]
Group II introns, another major class of self-splicing elements, are structurally more complex with six helical domains and exhibit a branching mechanism analogous to that of spliceosomal introns, forming a lariat intermediate. These introns are common in mitochondrial and chloroplast genomes of fungi, plants, and algae, as well as in bacterial genomes, where they often encode a multifunctional reverse transcriptase-like protein that promotes their mobility as retroelements. Splicing initiates with the 2' hydroxyl of a bulged adenosine (branch point) within domain VI attacking the 5' splice site, generating a lariat intron and freeing the upstream exon; the second transesterification then joins the exons and releases the lariat intron, again facilitated by Mg²⁺ ions in the active site. Unlike Group I, Group II introns can splice in the absence of exogenous cofactors, though some rely on maturase proteins encoded within the intron for stability in vivo.[36][12]
Self-splicing introns of both groups have prokaryotic origins. More than 42,000 Group I introns had been identified as of 2025, and known Group II introns number in the thousands, primarily in bacterial and organellar contexts, reflecting a sporadic distribution and horizontal mobility that contributed to their spread into eukaryotic lineages.[37] Group I and II introns are evolutionarily linked to the emergence of spliceosomal splicing through shared catalytic cores.[35][38]
tRNA and minor spliceosomal splicing
tRNA splicing occurs in eukaryotes and archaea, where introns are typically located within the anticodon loop of pre-tRNA transcripts. These introns are removed through a protein-dependent pathway involving distinct enzymatic steps, contrasting with self-splicing mechanisms in bacteria. The process begins with site-specific cleavage by a heterotetrameric tRNA splicing endonuclease complex, composed of subunits homologous to the yeast Sen proteins (Sen2, Sen15, Sen34, and Sen54), which recognizes structural features of the pre-tRNA rather than sequence alone.[39] In yeast, the endonuclease generates exons with 5'-hydroxyl and 2',3'-cyclic phosphate termini, leaving the intron as a linear fragment.[40] The subsequent ligation step seals the exons using a multifunctional ligase, such as Trl1 in yeast, which first opens the 2',3'-cyclic phosphate to a 2'-phosphate intermediate before forming the standard 3'-5' phosphodiester bond. This pathway ensures the production of mature tRNA capable of participating in translation, with the cyclic phosphate intermediate being a hallmark of the eukaryotic and archaeal tRNA splicing mechanism. A well-studied example is the intron in the yeast tRNATyr gene (SUP6), where removal is essential not only for maturation but also for proper post-transcriptional modification of the tRNA.[41]
In addition to tRNA processing, minor spliceosomal splicing handles a rare class of nuclear pre-mRNA introns known as U12-type, which constitute approximately 0.4% of human introns and often feature AU-AC terminal dinucleotides instead of the typical GU-AG.[42] The minor spliceosome employs specialized small nuclear ribonucleoproteins (snRNPs): U11/U12 and U4atac/U6atac, along with the shared U5 snRNP, to recognize and excise these introns through a process analogous to but distinct from major spliceosomal activity.[43] These U12-type introns are enriched in genes expressed in neurons, suggesting specialized roles in neural function and development.[44] Representative examples include AT-AC introns in human genes such as ATR, which encodes a key DNA damage response kinase and relies on minor spliceosome components for accurate isoform production.[45]
Biochemical Mechanisms
Splice site recognition and consensus sequences
In eukaryotic pre-mRNA splicing, splice site recognition begins with the identification of conserved sequence motifs at the exon-intron boundaries and within introns, which serve as docking sites for small nuclear ribonucleoproteins (snRNPs) and auxiliary factors. These motifs ensure precise excision of introns and ligation of exons, with deviations from consensus often requiring additional regulatory elements for efficient processing. The core signals include the 5' splice site (5' SS), branch point sequence (BPS), and 3' splice site (3' SS), each exhibiting species-specific consensus patterns derived from extensive sequence analyses and modern computational tools such as position weight matrices (PWMs) and databases like U12DB.[46]
The 5' SS is defined by a nearly invariant GU dinucleotide at the start of the intron, forming part of a broader consensus sequence such as CAG|GURAGU in mammals, where the exon lies to the left of the vertical bar, the bar denotes the cleavage point, and R represents a purine. This GU motif, first identified in viral and cellular genes, is essential for base-pairing with the 5' end of U1 snRNA, initiating spliceosome assembly. Upstream of the 5' SS, sequences resembling polypyrimidine tracts can influence recognition in certain contexts, though they are more prominently associated with the 3' SS. Mutations altering the GU dinucleotide abolish splicing, underscoring its critical role.
The BPS, located approximately 20-50 nucleotides upstream of the 3' SS, features a conserved adenosine residue that acts as the nucleophile in the first transesterification step, forming a lariat intermediate. In mammals, the BPS consensus is YNCURAC (Y = pyrimidine, N = any nucleotide, R = purine; the A is the branch point adenosine), a motif identified through mutational analysis of rabbit β-globin pre-mRNA. This sequence binds SF1/mBBP and facilitates U2 snRNP association, with the distance to the 3' SS influencing efficiency; optimal spacing enhances lariat formation.[47]
The 3' SS consists of an AG dinucleotide immediately downstream of a polypyrimidine tract (Py tract), typically 12-20 uridine/cytidine-rich nucleotides that promote U2 auxiliary factor (U2AF) binding. This Py-AG arrangement, conserved across metazoans, was established in early intron sequencing studies and is crucial for defining the acceptor site, with the Py tract compensating for weak AG contexts by recruiting U2AF65. The scanning model posits that U2AF searches downstream from the BPS for the first suitable AG, ensuring accurate cleavage.
Splice site recognition is modulated by cis-regulatory elements, including exonic splicing enhancers (ESEs) and intronic splicing enhancers (ISEs), which bind serine/arginine-rich (SR) proteins to stabilize core site interactions, particularly for suboptimal sequences. Conversely, exonic splicing silencers (ESSs) and intronic splicing silencers (ISSs) recruit heterogeneous nuclear ribonucleoproteins (hnRNPs), such as hnRNP A1, to repress usage. ESE motifs, often purine-rich (e.g., GAR repeats), were first characterized in the immunoglobulin μ chain gene and promote exon inclusion by recruiting SR proteins like SF2/ASF. ISEs and ISSs, identified through systematic screens, similarly influence site choice; for instance, G-rich ISEs bind hnRNP F/H to enhance upstream exon definition. These elements are essential for fine-tuning splicing fidelity.
Variations in splice site consensus exist, notably in U12-type introns, a minor class (roughly 0.35-0.4% of human introns) processed by the minor spliceosome and often featuring AU-AC dinucleotides instead of GU-AG. These were discovered through computational analysis of divergent 5' SS sequences and exhibit an extended 5' SS consensus such as /RTATCCTTT, with higher conservation due to their rarity.[48] U12-type introns also have a distinct BPS (UCCUUAAC).
Weak splice sites, deviating significantly from consensus (e.g., non-GU 5' SS), depend on auxiliary factors like SR proteins binding ESEs/ISEs to compensate for poor base-pairing with snRNAs, as demonstrated in mutagenesis studies of β-globin introns. Such enhancements can increase splicing efficiency by 10- to 100-fold for suboptimal sites.
| Splice Site | Consensus Motif (Mammals) | Key Features | Binding Factor |
|---|---|---|---|
| 5' SS | CAG\|GURAGU | GU dinucleotide invariant; R = purine | U1 snRNP |
| BPS | YNCURAC (20-50 nt upstream of 3' SS) | A = branch adenosine; Y = pyrimidine | SF1, U2 snRNP |
| 3' SS | (YnC)AG\| | Py tract (YnC, n=12-20); AG invariant | U2AF |
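The consensus motifs in the table can be turned into a rough pattern search. The sketch below is only an illustration under stated assumptions, not a real splice-site predictor: it translates the IUPAC motifs GURAGU and YNCURAC into regular expressions, accepts an AG as a candidate 3' SS only if a branch adenosine lies 20-50 nt upstream and the preceding 12 nt are mostly pyrimidines, and then joins the flanking exons. The function names, the 75% pyrimidine threshold, and the toy transcript are all hypothetical choices made for this example; tools used in practice score sites with position weight matrices rather than exact matches.

```python
import re

# IUPAC codes used by the consensus motifs in the table (RNA alphabet).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as 'GURAGU' into a regex fragment."""
    return "".join(IUPAC[base] for base in motif)


def find_all(pattern: str, rna: str) -> list[int]:
    """Start positions (0-based) of all matches, including overlapping ones."""
    return [m.start() for m in re.finditer(f"(?={pattern})", rna)]


def candidate_introns(rna: str, min_gap: int = 20, max_gap: int = 50,
                      py_len: int = 12, min_py: float = 0.75) -> list[tuple[int, int]]:
    """Return (start, end) spans of candidate introns, end exclusive.

    Assumptions of this sketch: exact consensus matches only, and an
    arbitrary pyrimidine-fraction threshold (min_py) for the Py tract.
    """
    donors = find_all(iupac_to_regex("GURAGU"), rna)           # intronic side of the 5' SS
    # The branch adenosine is the sixth base (offset 5) of YNCURAC.
    branch_as = [s + 5 for s in find_all(iupac_to_regex("YNCURAC"), rna)]
    acceptors = find_all("AG", rna)                            # candidate 3' SS dinucleotides

    introns = []
    for d in donors:
        for a in acceptors:
            tract = rna[max(0, a - py_len):a]
            py_ok = (len(tract) == py_len and
                     sum(base in "CU" for base in tract) / py_len >= min_py)
            bp_ok = any(d < b and min_gap <= a - b <= max_gap for b in branch_as)
            if a > d and py_ok and bp_ok:
                introns.append((d, a + 2))                     # include the AG itself
    return introns


def splice(rna: str, start: int, end: int) -> str:
    """Remove the intron rna[start:end] and join the flanking exons."""
    return rna[:start] + rna[end:]


# Toy transcript: exon 1 + intron (donor, filler, branch site, Py tract, AG) + exon 2.
exon1, exon2 = "AUGGCUCAG", "GUCCUAA"
intron = "GUAAGU" + "CU" * 5 + "CACUGAC" + "CU" * 10 + "AG"
pre_mrna = exon1 + intron + exon2

spans = candidate_introns(pre_mrna)
print(spans)                       # [(9, 54)]
print(splice(pre_mrna, *spans[0])) # AUGGCUCAGGUCCUAA
```

In the toy transcript the AG inside the donor motif itself (GUAAGU) is rejected because no branch adenosine lies upstream of it and its flanking sequence is not pyrimidine-rich, which mirrors in a simplified way why the Py tract and branch point spacing matter for acceptor choice.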
