RNA polymerase
from Wikipedia

DNA-directed RNA polymerase
RNA polymerase hetero 27-mer, human
Identifiers
EC no.: 2.7.7.6
CAS no.: 9014-24-8
RNA polymerase (purple) unwinding the DNA double helix. It uses one strand (darker orange) as a template to create the single-stranded messenger RNA (green).

In molecular biology, RNA polymerase (abbreviated RNAP or RNApol), or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that catalyzes the chemical reactions that synthesize RNA from a DNA template.

Using its intrinsic helicase activity, RNAP locally opens the double-stranded DNA so that one strand of the exposed nucleotides can be used as a template for the synthesis of RNA, a process called transcription. A transcription factor and its associated transcription mediator complex must be attached to a DNA binding site called a promoter region before RNAP can initiate the DNA unwinding at that position. RNAP not only initiates RNA transcription; it also guides the nucleotides into position, facilitates attachment and elongation, and has intrinsic proofreading, replacement, and termination-recognition capabilities. In eukaryotes, RNAP can build chains as long as 2.4 million nucleotides.

RNAP produces RNA that, functionally, is either for protein coding, i.e. messenger RNA (mRNA); or non-coding (so-called "RNA genes"). Examples of four functional types of RNA genes are:

Transfer RNA (tRNA)
Transfers specific amino acids to growing polypeptide chains at the ribosomal site of protein synthesis during translation;
Ribosomal RNA (rRNA)
Incorporates into ribosomes;
Micro RNA (miRNA)
Regulates gene activity, including through RNA silencing; and
Catalytic RNA (ribozyme)
Functions as an enzymatically active RNA molecule.

RNA polymerase is essential to life, and is found in all living organisms and many viruses. Depending on the organism, an RNA polymerase can be a protein complex (multi-subunit RNAP) or consist of only one subunit (single-subunit RNAP, ssRNAP), each representing an independent lineage. The former is found in bacteria, archaea, and eukaryotes alike, sharing a similar core structure and mechanism.[1] The latter is found in phages as well as eukaryotic chloroplasts and mitochondria, and is related to modern DNA polymerases.[2] Eukaryotic and archaeal RNAPs have more subunits than bacterial ones do, and are controlled differently.

Bacteria and archaea only have one RNA polymerase. Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA:

  1. RNA polymerase I synthesizes a pre-rRNA 45S (35S in yeast), which matures and will form the major RNA sections of the ribosome.
  2. RNA polymerase II synthesizes precursors of mRNAs and most snRNA and microRNAs.
  3. RNA polymerase III synthesizes tRNAs, rRNA 5S and other small RNAs found in the nucleus and cytosol.
  4. RNA polymerase IV and V found in plants are less understood; they make siRNA. In addition to the ssRNAPs, chloroplasts also encode and use a bacteria-like RNAP.

Structure

T. aquaticus RNA polymerase core (PDB: 1HQM​).
Yeast RNA polymerase II core (PDB: 1WCM​).
Homologous subunits are colored the same:[1]
  orange: α1/RPB3,
  yellow: α2/RPB11,
  wheat: β/RPB2,
  red: β′/RPB1,
  pink: ω/RPB6.

The 2006 Nobel Prize in Chemistry was awarded to Roger D. Kornberg for creating detailed molecular images of RNA polymerase during various stages of the transcription process.[3][4]

In most prokaryotes, a single RNA polymerase species transcribes all types of RNA. RNA polymerase "core" from E. coli consists of five subunits: two alpha (α) subunits of 36 kDa, a beta (β) subunit of 150 kDa, a beta prime subunit (β′) of 155 kDa, and a small omega (ω) subunit. A sigma (σ) factor binds to the core, forming the holoenzyme. After transcription starts, the factor can unbind and let the core enzyme proceed with its work.[5][6] The core RNA polymerase complex forms a "crab claw" or "clamp-jaw" structure with an internal channel running along the full length.[7] Eukaryotic and archaeal RNA polymerases have a similar core structure and work in a similar manner, although they have many extra subunits.[8]
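As a rough consistency check on the subunit masses quoted above, the following Python sketch sums them. The ω and σ⁷⁰ masses (about 10 kDa and 70 kDa) are assumptions supplied for illustration, since this paragraph gives no figure for either; the function and constant names are likewise hypothetical.

```python
# Subunit masses in kDa for the E. coli core enzyme, as quoted in the text.
CORE_SUBUNITS_KDA = {"alpha": 36.0, "beta": 150.0, "beta_prime": 155.0}
COPIES = {"alpha": 2, "beta": 1, "beta_prime": 1}

# Assumed approximate masses (not stated in the paragraph above).
OMEGA_KDA = 10.0
SIGMA70_KDA = 70.0


def core_mass_kda(include_omega: bool = True) -> float:
    """Sum the core subunit masses, optionally adding the assumed omega mass."""
    total = sum(CORE_SUBUNITS_KDA[name] * COPIES[name] for name in CORE_SUBUNITS_KDA)
    return total + (OMEGA_KDA if include_omega else 0.0)


def holoenzyme_mass_kda() -> float:
    """Core enzyme plus one (assumed 70 kDa) sigma factor."""
    return core_mass_kda() + SIGMA70_KDA


if __name__ == "__main__":
    print("core without omega:", core_mass_kda(include_omega=False), "kDa")
    print("core with omega:   ", core_mass_kda(), "kDa")
    print("holoenzyme:        ", holoenzyme_mass_kda(), "kDa")
```

Under these assumptions the totals come out a little under 400 kDa for the core and a little over 450 kDa for the holoenzyme, in line with the approximate figures quoted elsewhere in the article.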

All RNAPs contain metal cofactors, in particular zinc and magnesium cations which aid in the transcription process.[9][10]

Function

An electron-micrograph of DNA strands decorated by hundreds of RNAP molecules too small to be resolved. Each RNAP is transcribing an RNA strand, which can be seen branching off from the DNA. "Begin" indicates the 3′ end of the DNA, where RNAP initiates transcription; "End" indicates the 5′ end, where the longer RNA molecules are completely transcribed.

Control of the process of gene transcription affects patterns of gene expression and, thereby, allows a cell to adapt to a changing environment, perform specialized roles within an organism, and maintain basic metabolic processes necessary for survival. Therefore, it is hardly surprising that the activity of RNAP is long, complex, and highly regulated. In Escherichia coli bacteria, more than 100 transcription factors have been identified, which modify the activity of RNAP.[11]

RNAP can initiate transcription at specific DNA sequences known as promoters. It then produces an RNA chain, which is complementary to the template DNA strand. The process of adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP will preferentially release its RNA transcript at specific DNA sequences encoded at the end of genes, which are known as terminators.

Products of RNAP include:

RNAP accomplishes de novo synthesis. It is able to do this because specific interactions with the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP includes helicase activity, therefore no separate enzyme is needed to unwind DNA.

Action


Initiation


RNA polymerase binding in bacteria involves the sigma factor recognizing the core promoter region containing the −35 and −10 elements (located before the beginning of sequence to be transcribed) and also, at some promoters, the α subunit C-terminal domain recognizing promoter upstream elements.[12] There are multiple interchangeable sigma factors, each of which recognizes a distinct set of promoters. For example, in E. coli, σ70 is expressed under normal conditions and recognizes promoters for genes required under normal conditions ("housekeeping genes"), while σ32 recognizes promoters for genes required at high temperatures ("heat-shock genes"). In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together. The RNA polymerase-promoter closed complex is usually referred to as the "transcription preinitiation complex."[13][14]

After binding to the DNA, the RNA polymerase switches from a closed complex to an open complex. This change involves the separation of the DNA strands to form an unwound section of DNA of approximately 13 bp, referred to as the "transcription bubble". Supercoiling plays an important part in polymerase activity because of the unwinding and rewinding of DNA. Because regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils. Regions behind RNAP are rewound and negative supercoils are present.[14]

Promoter escape


RNA polymerase then starts to synthesize the initial DNA-RNA heteroduplex, with ribonucleotides base-paired to the template DNA strand according to Watson-Crick base-pairing interactions. As noted above, RNA polymerase makes contacts with the promoter region. However, these stabilizing contacts inhibit the enzyme's ability to access DNA further downstream and thus the synthesis of the full-length product. To continue RNA synthesis, RNA polymerase must escape the promoter. It must maintain promoter contacts while unwinding more downstream DNA for synthesis, "scrunching" more downstream DNA into the initiation complex.[15] During the promoter escape transition, RNA polymerase is considered a "stressed intermediate." Thermodynamically, the stress accumulates from the DNA-unwinding and DNA-compaction activities. Once the DNA-RNA heteroduplex is long enough (~10 bp), RNA polymerase releases its upstream contacts and effectively achieves the promoter escape transition into the elongation phase. The heteroduplex at the active center stabilizes the elongation complex.

However, promoter escape is not the only outcome. RNA polymerase can also relieve the stress by releasing its downstream contacts, arresting transcription. The paused transcribing complex has two options: (1) release the nascent transcript and begin anew at the promoter or (2) reestablish a new 3′-OH on the nascent transcript at the active site via RNA polymerase's catalytic activity and recommence DNA scrunching to achieve promoter escape. Abortive initiation, the unproductive cycling of RNA polymerase before the promoter escape transition, results in short RNA fragments of around 9 nucleotides in a process known as abortive transcription. The extent of abortive initiation depends on the presence of transcription factors and the strength of the promoter contacts.[16]

Elongation

RNA Polymerase II Transcription: the process of transcript elongation facilitated by disassembly of nucleosomes.
RNAP from T. aquaticus pictured during elongation. Portions of the enzyme were made transparent so as to make the path of RNA and DNA more clear. The magnesium ion (yellow) is located at the enzyme active site.

The 17-bp transcriptional complex has an 8-bp DNA-RNA hybrid, that is, 8 base-pairs involve the RNA transcript bound to the DNA template strand.[17] As transcription progresses, ribonucleotides are added to the 3′ end of the RNA transcript and the RNAP complex moves along the DNA. The characteristic elongation rates in prokaryotes and eukaryotes are about 10–100 nts/sec.[18]
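To give a feel for what these rates mean, the short Python sketch below estimates how long one polymerase would need to transcribe the 2.4-million-nucleotide dystrophin gene mentioned earlier. It assumes a constant elongation rate with no pausing or backtracking, which is a simplification; the function name is hypothetical.

```python
def transcription_time_seconds(length_nt: int, rate_nt_per_s: float) -> float:
    """Time to synthesize a transcript at a constant elongation rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return length_nt / rate_nt_per_s


if __name__ == "__main__":
    dystrophin_nt = 2_400_000  # transcript length quoted in the text
    for rate in (10, 100):     # bounds of the quoted 10-100 nt/s range
        hours = transcription_time_seconds(dystrophin_nt, rate) / 3600
        print(f"{rate:>3} nt/s -> {hours:.1f} hours")
```

Even at the upper end of the quoted range, a single pass over a gene of this size takes several hours under these assumptions.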

Aspartyl (asp) residues in the RNAP will hold on to Mg²⁺ ions, which will, in turn, coordinate the phosphates of the ribonucleotides. The first Mg²⁺ will hold on to the α-phosphate of the NTP to be added. This allows the nucleophilic attack of the 3′-OH from the RNA transcript, adding another NTP to the chain. The second Mg²⁺ will hold on to the pyrophosphate of the NTP.[19] The overall reaction equation is:

(NMP)ₙ + NTP → (NMP)ₙ₊₁ + PPᵢ

Fidelity


Unlike the proofreading mechanisms of DNA polymerase, those of RNAP have only recently been investigated. Proofreading begins with separation of the mis-incorporated nucleotide from the DNA template. This pauses transcription. The polymerase then backtracks by one position and cleaves the dinucleotide that contains the mismatched nucleotide. In RNA polymerase this occurs at the same active site used for polymerization, and is therefore markedly different from DNA polymerase, where proofreading occurs at a distinct nuclease active site.[20]

The overall error rate is around 10⁻⁴ to 10⁻⁶.[21]
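A quick way to interpret these error rates is to ask how many mistakes to expect in a transcript of a given length. The Python sketch below does this under the simplifying assumption that errors occur independently at every position with the same probability; the function names are hypothetical.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected number of misincorporated nucleotides in one transcript."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that a transcript contains no errors at all,
    assuming independent errors at each position."""
    return (1.0 - error_rate) ** length_nt


if __name__ == "__main__":
    for rate in (1e-4, 1e-6):  # bounds of the range quoted in the text
        for length in (1_000, 100_000):
            print(
                f"rate={rate:g}, length={length}: "
                f"expected errors={expected_errors(length, rate):.3g}, "
                f"P(error-free)={prob_error_free(length, rate):.3f}"
            )
```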

Termination


In bacteria, termination of RNA transcription can be rho-dependent or rho-independent. The former relies on the rho factor, which destabilizes the DNA-RNA heteroduplex and causes RNA release.[22] The latter, also known as intrinsic termination, relies on a palindromic region of DNA. Transcribing the region causes the formation of a "hairpin" structure from the RNA transcript looping and binding upon itself. This hairpin structure is often rich in G-C base-pairs, making it more stable than the DNA-RNA hybrid itself. As a result, the 8 bp DNA-RNA hybrid in the transcription complex shifts to a 4 bp hybrid. These last 4 base pairs are weak A-U base pairs, and the entire RNA transcript will fall off the DNA.[23]
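The intrinsic-termination signal described above (a G-C-rich hairpin followed by a short run of U residues) can be sketched as a naive pattern search. This is an illustrative toy, not a real terminator predictor: the stem length, loop-length range, and G-C threshold are assumptions chosen for the example, only perfect stems are accepted, and the input is assumed to be an uppercase RNA string over A, C, G, U.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, loop_lens=range(3, 9),
                               u_run=4, min_gc=0.5):
    """Return (start, end) index pairs of perfect hairpins followed by a U-run.

    stem_len, loop_lens and min_gc are illustrative assumptions; u_run=4
    mirrors the four weak A-U pairs mentioned in the text.
    """
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left = rna[start:start + stem_len]
        gc_fraction = sum(base in "GC" for base in left) / stem_len
        if gc_fraction < min_gc:
            continue  # stem not G-C rich enough
        target = reverse_complement(left)
        for loop_len in loop_lens:
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            tail = rna[right_end:right_end + u_run]
            if len(tail) < u_run:
                continue  # not enough sequence left for the U-run
            if rna[right_start:right_end] == target and tail == "U" * u_run:
                hits.append((start, right_end + u_run))
    return hits


if __name__ == "__main__":
    example = "AA" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUU" + "AC"
    print(find_intrinsic_terminators(example))
```

Real terminators tolerate mismatches and bulges in the stem, so practical tools score candidate hairpins thermodynamically rather than requiring an exact inverted repeat.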

Transcription termination in eukaryotes is less well understood than in bacteria, but involves cleavage of the new transcript followed by template-independent addition of adenines at its new 3′ end, in a process called polyadenylation.[24]

Other organisms


Given that DNA and RNA polymerases both carry out template-dependent nucleotide polymerization, it might be expected that the two types of enzymes would be structurally related. However, X-ray crystallographic studies of both types of enzymes reveal that, other than containing a critical Mg²⁺ ion at the catalytic site, they are virtually unrelated to each other; indeed, template-dependent nucleotide polymerizing enzymes seem to have arisen independently twice during the early evolution of cells. One lineage led to the modern DNA polymerases and reverse transcriptases, as well as to a few single-subunit RNA polymerases (ssRNAP) from phages and organelles.[2] The other multi-subunit RNAP lineage formed all of the modern cellular RNA polymerases.[25][1]

Bacteria


In bacteria, the same enzyme catalyzes the synthesis of mRNA and non-coding RNA (ncRNA).

RNAP is a large molecule. The core enzyme has five subunits (~ 400 kDa):[26]

β′
The β′ subunit is the largest subunit, and is encoded by the rpoC gene.[27] The β′ subunit contains part of the active center responsible for RNA synthesis and contains some of the determinants for non-sequence-specific interactions with DNA and nascent RNA. It is split into two subunits in Cyanobacteria and chloroplasts.[28]
β
The β subunit is the second-largest subunit, and is encoded by the rpoB gene. The β subunit contains the rest of the active center responsible for RNA synthesis and contains the rest of the determinants for non-sequence-specific interactions with DNA and nascent RNA.
α (αI and αII)
Two copies of the α subunit, the third-largest subunit, are present in a molecule of RNAP: αI and αII (one and two). Each α subunit contains two domains: αNTD (N-terminal domain) and αCTD (C-terminal domain). αNTD contains determinants for assembly of RNAP. αCTD contains determinants for interaction with promoter DNA, making non-sequence-specific interactions at most promoters and sequence-specific interactions at upstream-element-containing promoters, and contains determinants for interactions with regulatory factors.
ω
The ω subunit is the smallest subunit. The ω subunit facilitates assembly of RNAP and stabilizes assembled RNAP.[29]

In order to bind promoters, RNAP core associates with the transcription initiation factor sigma (σ) to form RNA polymerase holoenzyme. Sigma reduces the affinity of RNAP for nonspecific DNA while increasing specificity for promoters, allowing transcription to initiate at correct sites. The complete holoenzyme therefore has 6 subunits: β′βαI and αIIωσ (~450 kDa).

Eukaryotes

Structure of eukaryotic RNA polymerase II (light blue) in complex with α-amanitin (red), a strong poison found in death cap mushrooms that targets this vital enzyme

Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA. All are structurally and mechanistically related to each other and to bacterial RNAP:

  1. RNA polymerase I synthesizes a pre-rRNA 45S (35S in yeast), which matures into 28S, 18S and 5.8S rRNAs, which will form the major RNA sections of the ribosome.[30]
  2. RNA polymerase II synthesizes precursors of mRNAs and most snRNA and microRNAs.[31] This is the most studied type, and, due to the high level of control required over transcription, a range of transcription factors are required for its binding to promoters.
  3. RNA polymerase III synthesizes tRNAs, rRNA 5S and other small RNAs found in the nucleus and cytosol.[32]
  4. RNA polymerase IV synthesizes siRNA in plants.[33]
  5. RNA polymerase V synthesizes RNAs involved in siRNA-directed heterochromatin formation in plants.[34]

Eukaryotic chloroplasts contain a multi-subunit RNAP ("PEP, plastid-encoded polymerase"). Due to its bacterial origin, the organization of PEP resembles that of current bacterial RNA polymerases: It is encoded by the RPOA, RPOB, RPOC1 and RPOC2 genes on the plastome, which as proteins form the core subunits of PEP, respectively named α, β, β′ and β″.[35] Similar to the RNA polymerase in E. coli, PEP requires the presence of sigma (σ) factors for the recognition of its promoters, containing the -10 and -35 motifs.[36] Despite the many commonalities between plant organellar and bacterial RNA polymerases and their structure, PEP additionally requires the association of a number of nuclear encoded proteins, termed PAPs (PEP-associated proteins), which form essential components that are closely associated with the PEP complex in plants. Initially, a group consisting of 10 PAPs was identified through biochemical methods, which was later extended to 12 PAPs.[37][38]

Chloroplasts also contain a second, structurally and mechanistically unrelated, single-subunit RNAP ("nucleus-encoded polymerase, NEP"). Eukaryotic mitochondria use POLRMT (human), a nucleus-encoded single-subunit RNAP.[2] Such phage-like polymerases are referred to as RpoT in plants.[39]

Archaea


Archaea have a single type of RNAP, responsible for the synthesis of all RNA. Archaeal RNAP is structurally and mechanistically similar to bacterial RNAP and eukaryotic nuclear RNAP I-V, and is especially closely structurally and mechanistically related to eukaryotic nuclear RNAP II.[8][40] The history of the discovery of the archaeal RNA polymerase is quite recent. The first analysis of the RNAP of an archaeon was performed in 1971, when the RNAP from the extreme halophile Halobacterium cutirubrum was isolated and purified.[41] Crystal structures of RNAPs from Sulfolobus solfataricus and Sulfolobus shibatae set the total number of identified archaeal subunits at thirteen.[8][42]

Archaea have the subunit corresponding to eukaryotic Rpb1 split into two. There is no homolog to eukaryotic Rpb9 (POLR2I) in the S. shibatae complex, although TFS (TFIIS homolog) has been proposed as one based on similarity. There is an additional subunit dubbed Rpo13; together with Rpo5 it occupies a space filled by an insertion found in bacterial β′ subunits (1,377–1,420 in Taq).[8] An earlier, lower-resolution study on the S. solfataricus structure did not find Rpo13 and only assigned the space to Rpo5/Rpb5. Rpo3 is notable in that it is an iron–sulfur protein. The RNAP I/III subunit AC40 found in some eukaryotes shares similar sequences,[42] but does not bind iron.[43] This domain, in either case, serves a structural function.[44]

Archaeal RNAP subunits previously used an "RpoX" nomenclature in which each subunit is assigned a letter in a way unrelated to any other system.[1] In 2009, a new nomenclature based on the eukaryotic Pol II subunit "Rpb" numbering was proposed.[8]

Viruses

T7 RNA polymerase producing a mRNA (green) from a DNA template. The protein is shown as a purple ribbon (PDB: 1MSW​)

Orthopoxviruses and some other nucleocytoplasmic large DNA viruses synthesize RNA using a virally encoded multi-subunit RNAP. They are most similar to eukaryotic RNAPs, with some subunits minified or removed.[45] Exactly which RNAP they are most similar to is a topic of debate.[46] Most other viruses that synthesize RNA use unrelated mechanics.

Many viruses use a single-subunit DNA-dependent RNAP (ssRNAP) that is structurally and mechanistically related to the single-subunit RNAP of eukaryotic chloroplasts (RpoT) and mitochondria (POLRMT) and, more distantly, to DNA polymerases and reverse transcriptases. Perhaps the most widely studied such single-subunit RNAP is bacteriophage T7 RNA polymerase. ssRNAPs cannot proofread.[2]

B. subtilis prophage SPβ uses YonO, a homolog of the β+β′ subunits of msRNAPs to form a monomeric (both barrels on the same chain) RNAP distinct from the usual "right hand" ssRNAP. It probably diverged very long ago from the canonical five-unit msRNAP, before the time of the last universal common ancestor.[47][48]

Other viruses use an RNA-dependent RNAP (an RNAP that employs RNA as a template instead of DNA). This occurs in negative strand RNA viruses and dsRNA viruses, both of which exist for a portion of their life cycle as double-stranded RNA. However, some positive strand RNA viruses, such as poliovirus, also contain RNA-dependent RNAP.[49]

History


RNAP was discovered independently by Sam Weiss, Audrey Stevens, and Jerard Hurwitz in 1960.[50] By this time, one half of the 1959 Nobel Prize in Medicine had been awarded to Severo Ochoa for the discovery of what was believed to be RNAP,[51] but instead turned out to be polynucleotide phosphorylase.

Purification


RNA polymerase can be isolated in the following ways:

And also combinations of the above techniques.

from Grokipedia
RNA polymerase, also known as DNA-directed RNA polymerase, is a multi-subunit enzyme that catalyzes the transcription of DNA into RNA, serving as the primary machinery for gene expression in all cellular organisms and many viruses. This process involves reading a DNA template strand in the 3' to 5' direction while synthesizing a complementary RNA strand in the 5' to 3' direction, using nucleoside triphosphates as substrates. RNA polymerases produce various RNA types, including messenger RNA (mRNA) for protein synthesis, ribosomal RNA (rRNA) for ribosome assembly, and transfer RNA (tRNA) for translation, thereby linking genetic information to cellular function.

In prokaryotes, such as bacteria, a single type of RNA polymerase handles all transcription, consisting of a core enzyme with five subunits: two alpha (α), one beta (β), one beta prime (β'), and one omega (ω) subunit, which forms a crab-claw-like structure with a central cleft for DNA and RNA binding. This core associates with a sigma (σ) factor for promoter recognition and initiation, enabling the enzyme to locate specific start sites on DNA. The mechanism proceeds through nucleotide addition cycles, where the enzyme binds an incoming NTP, incorporates it via phosphodiester bond formation at a magnesium ion-containing active site, releases pyrophosphate, and translocates along the DNA, achieving high fidelity with an error rate of about 1 per 10,000 nucleotides through proofreading capabilities.

Eukaryotes possess three distinct nuclear RNA polymerases, each specialized for different RNA classes: RNA polymerase I (Pol I) transcribes most rRNAs in the nucleolus, RNA polymerase II (Pol II) synthesizes mRNA and some non-coding RNAs, and RNA polymerase III (Pol III) produces tRNAs, 5S rRNA, and other small RNAs. Pol II, the most studied, is a 12-subunit complex with a conserved core resembling the bacterial enzyme but featuring additional subunits and a unique C-terminal domain (CTD) on its largest subunit (RPB1), consisting of heptapeptide repeats (YSPTSPS) that undergo phosphorylation to regulate transcription phases. Initiation in eukaryotes requires a preinitiation complex assembled with general transcription factors like TFIID, TFIIB, TFIIE, TFIIF, and TFIIH, where TFIIH's XPB subunit unwinds DNA at the promoter using ATP hydrolysis. Elongation and termination are further modulated by factors such as Mediator for activator communication, DSIF and NELF for pausing, and P-TEFb for release via CTD phosphorylation.

Archaea employ a single RNA polymerase more akin to eukaryotic Pol II in complexity, highlighting evolutionary conservation across domains of life. Beyond the nucleus, eukaryotic mitochondria and chloroplasts contain specialized single-subunit RNA polymerases related to bacteriophage enzymes, underscoring the enzyme's ancient origins and adaptability. Regulation of RNA polymerase activity is crucial for cellular responses, involving sigma factors in bacteria for specificity and extensive post-translational modifications in eukaryotes to coordinate with chromatin structure and signaling pathways. Dysregulation of these processes can lead to diseases, but the core fidelity and dynamic mechanisms ensure accurate information flow from DNA to RNA.

Introduction and General Properties

Definition and Biological Role

RNA polymerase is a multi-subunit enzyme that catalyzes the synthesis of RNA molecules from a DNA template through the formation of phosphodiester bonds between ribonucleotides. This process, known as transcription, is the first step in gene expression, where genetic information encoded in DNA is copied into RNA to direct cellular functions. In its biological role, RNA polymerase is indispensable for producing various RNA types, including messenger RNA (mRNA) for protein synthesis, ribosomal RNA (rRNA) and transfer RNA (tRNA) for the translation machinery, and non-coding RNAs that regulate gene expression and cellular processes. By enabling the conversion of DNA sequences into functional RNA transcripts, it plays a central role in cellular regulation, development, and response to environmental cues across all living organisms.

The enzyme performs template-directed polymerization, using nucleoside triphosphates (NTPs: ATP, UTP, GTP, and CTP) as building blocks to extend the RNA chain in the 5' to 3' direction, antiparallel to the DNA template strand read from 3' to 5'. The reaction is driven by the hydrolysis of pyrophosphate released from each incorporated NTP, providing the energy for bond formation, while fidelity is maintained through Watson-Crick base-pairing rules (A-U and G-C) between the template DNA and incoming ribonucleotides.

RNA polymerase is universally present in all domains of life (Bacteria, Archaea, and Eukarya) as well as in some viruses, underscoring its evolutionary conservation as a fundamental component of the central dogma of molecular biology. RNA polymerase catalyzes transcription and has no direct role in translation, the separate process of protein synthesis performed by ribosomes on mRNA, which initiates at the start codon AUG.
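The base-pairing and strand-direction rules described here can be illustrated with a minimal Python sketch. It takes a template strand written 3'→5' and returns the RNA written 5'→3'; because the two strands are antiparallel, reading the template in that orientation lets the complement be built position by position. The function name is hypothetical, and the sketch ignores promoters, terminators, and processing.

```python
# DNA template base -> RNA base incorporated opposite it.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') for a DNA template written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


if __name__ == "__main__":
    # Template 3'-TACGGT-5' yields RNA 5'-AUGCCA-3'.
    print(transcribe("TACGGT"))
```

If the template is instead supplied in the conventional 5'→3' orientation, it would need to be reversed before being passed to this function.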

Classification of RNA Polymerases

RNA polymerases are broadly classified into DNA-dependent and RNA-dependent types based on their template specificity. DNA-dependent RNA polymerases (DdRPs), which synthesize RNA using a DNA template, predominate in cellular organisms across bacteria, archaea, and eukaryotes, enabling the transcription of genomic DNA into various RNA species. In contrast, RNA-dependent RNA polymerases (RdRPs) are less common in cells but are essential for RNA viruses, where they replicate and transcribe RNA genomes without a DNA intermediate.

Within DNA-dependent polymerases, a key distinction exists between single-subunit and multi-subunit enzymes. Single-subunit DdRPs, such as those from bacteriophage T7, consist of a single polypeptide chain homologous to DNA polymerases and function independently without additional subunits for core activity, making them simpler and often found in viruses. Multi-subunit DdRPs, comprising 10–17 subunits organized into core and accessory components, are the standard in cellular organisms, where their greater complexity supports regulation; these evolved early and are conserved across domains, underscoring their predominance in bacteria, archaea, and eukaryotes.

In bacteria, a single multi-subunit polymerase handles all transcription, with specificity conferred by interchangeable sigma (σ) factors that associate with the core to recognize diverse promoters and direct the synthesis of mRNA, rRNA, and tRNA. This σ-dependent system allows bacteria to transcribe all RNA classes using one polymerase type, adapting to environmental cues through σ factor competition. Archaeal RNA polymerases are also single multi-subunit enzymes per genome, structurally resembling eukaryotic RNA polymerase II (Pol II) in their core and subunit composition (11–13 subunits), but with simpler regulation via fewer transcription factors, reflecting an intermediate between bacterial and eukaryotic systems.

Eukaryotes possess three distinct nuclear multi-subunit RNA polymerases, each specialized for specific RNA classes: RNA polymerase I (Pol I) primarily transcribes large ribosomal RNAs (rRNAs, such as 18S, 5.8S, and 28S) in the nucleolus to support ribosome biogenesis; RNA polymerase II (Pol II) synthesizes messenger RNAs (mRNAs) from protein-coding genes, as well as small nuclear RNAs (snRNAs) and microRNAs essential for splicing and gene regulation; and RNA polymerase III (Pol III) produces transfer RNAs (tRNAs), 5S rRNA, and other small RNAs critical for translation and cellular processes. Additionally, eukaryotic organelles feature specialized variants: mitochondrial RNA polymerases are single-subunit enzymes akin to T7-like viral polymerases, while chloroplasts in plants use bacterial-like multi-subunit polymerases. Plants further possess unique nuclear RNA polymerases IV (Pol IV) and V (Pol V), which are Pol II-related and function in RNA-directed DNA methylation and siRNA-mediated gene silencing, producing precursors for 24-nucleotide siRNAs that target heterochromatin formation.

Viral RNA polymerases exhibit remarkable diversity, mirroring host and independent strategies. Many DNA viruses, like T7 bacteriophage, employ single-subunit DdRPs for efficient, host-independent transcription, while RNA viruses predominantly rely on RdRPs (single-subunit enzymes with a conserved right-hand fold featuring motifs for nucleotide binding and catalysis) to replicate positive-sense, negative-sense, or double-stranded RNA genomes. Some large DNA viruses, such as poxviruses, encode multi-subunit DdRPs resembling cellular ones.

Multi-subunit RNA polymerases in bacteria, archaea, and eukaryotes share a common evolutionary origin, tracing back to a last universal common ancestor (LUCA) with a two-barrel catalytic core (double-psi β-barrel domain) that unified transcription machinery before domain divergence, as evidenced by conserved subunits like β/β' in bacteria and their homologs in archaea and eukaryotes; subsequent expansions, such as additional subunits in eukaryotes, arose from gene duplications and fusions post-LUCA. This shared ancestry highlights the ancient innovation of multi-subunit architecture for robust cellular transcription, distinct from the convergent evolution of single-subunit viral enzymes.

Molecular Structure

Core Enzyme Architecture

The core enzyme of RNA polymerase exhibits a highly conserved architecture across bacteria, archaea, and eukaryotes, characterized by a crab-claw-like shape that accommodates the DNA template and nascent RNA hybrid within a central cleft. This structure features two major lobes, the "jaw" and the "clamp", separated by a narrow channel approximately 25 Å wide, with the active site positioned at the base of the cleft for nucleotide addition. In bacterial RNA polymerase, the overall dimensions are roughly 150 Å × 115 Å × 100 Å, enabling the enzyme to encircle and process double-stranded DNA while maintaining processivity.

At the heart of this architecture lies the conserved catalytic site, where two Mg²⁺ ions are coordinated by aspartate residues, typically from motifs A (DFDGD) in the β' subunit (or homolog) and C (NAFDWDD) in the β subunit, to facilitate phosphodiester bond formation between the incoming nucleoside triphosphate and the RNA 3' end. This two-metal-ion mechanism, essential for polymerase activity, positions the α-phosphate of the NTP for nucleophilic attack by the 3'-OH of the nascent RNA, ensuring fidelity and efficiency in synthesis. The active-site pocket, formed primarily by the β and β' lobes, creates a sterically constrained environment that selects for correct base pairing and excludes mismatched nucleotides.

In bacteria, the core enzyme consists of five subunits: two α subunits (α_I and α_II), β, β', and the small ω subunit, with a total molecular weight of about 390 kDa. The β and β' subunits form the bulk of the crab-claw structure, creating the main DNA-binding channel and catalytic cleft, while the α subunits dimerize to scaffold assembly and interact with upstream regulatory elements. The ω subunit, though small (~10 kDa), stabilizes the β' clamp domain and aids in core integrity during folding and transcription. Eukaryotic counterparts are larger and more complex; for instance, RNA polymerase II (Pol II) comprises 12 subunits (RPB1–12), where RPB1 and RPB2 are homologs of bacterial β' and β, respectively, and the clamp domain in RPB1 enhances stability by gripping the DNA-RNA hybrid. Additional subunits like RPB3–RPB6 and RPB8–RPB12 form structural extensions, increasing the overall mass to ~550 kDa while preserving the core scaffold.

Key structural motifs, including the jaw (downstream DNA gripper), lid (RNA exit channel regulator), and funnel (nucleotide entry portal), are conserved with variations in size and accessory elements across domains of life, facilitating substrate handling and preventing backtracking. The jaw domain, formed by RPB5 and RPB9 in Pol II (or equivalents), contacts downstream DNA, while the funnel channels NTPs to the active site, and the lid loop modulates RNA displacement. These motifs ensure coordinated movement during elongation. Cryo-EM studies since the 2010s have revealed dynamic conformations, such as open (pre-initiation) and closed (elongating) states of the clamp, highlighting conformational flexibility that allows the enzyme to transition between DNA loading and tight hybrid stabilization without dissociating.

Accessory Subunits and Factors

In bacteria, sigma factors serve as accessory subunits that confer promoter specificity on the core RNA polymerase, enabling recognition of distinct promoter elements such as the -10 (TATAAT) and -35 (TTGACA) boxes. The primary sigma factor, σ⁷⁰, directs transcription of housekeeping genes essential for normal cellular functions, while alternative sigma factors, such as σˢ for stationary-phase and osmotic stress responses or σᴱ for extracytoplasmic stress, activate specific gene sets under environmental challenges like heat shock or nutrient limitation. These factors bind reversibly to the core enzyme, forming the holoenzyme that carries out promoter-directed initiation. In eukaryotes, general transcription factors (GTFs) such as TFIIA through TFIIH assemble with RNA polymerase II (Pol II) to form the preinitiation complex at promoters, particularly those containing TATA or initiator elements. TFIIA stabilizes TBP binding to the TATA box, TFIIB links the complex to Pol II and selects the start site, TFIIE recruits TFIIH for promoter melting, TFIIF escorts Pol II to the promoter, and TFIIH unwinds DNA via its helicase subunits while phosphorylating the Pol II CTD. The Mediator complex, a multi-subunit coactivator, further modulates Pol II activity by bridging enhancers and promoters, stabilizing the preinitiation complex, and promoting Pol II recruitment and reinitiation through conformational changes and interactions with transcription factors. Archaeal RNA polymerase relies on transcription factor B (TFB), homologous to eukaryotic TFIIB, and TATA-box binding protein (TBP) for promoter recognition and initiation; the two bind cooperatively to TATA and BRE elements to position DNA for melting. These two factors suffice for basal transcription, making the archaeal system resemble a simplified version of the eukaryotic machinery that needs no additional factors.
In viruses such as herpes simplex virus, the viral protein VP16 acts as an accessory factor, recruiting host factors such as HCF-1 and Oct-1 to immediate-early promoters and thereby activating viral transcription through interactions with TBP and TFIIB. Additional subunits enhance polymerase function across domains; in eukaryotes, the RPB4/7 heterodimer forms a stalk on Pol II that restricts clamp opening, aids initiation by positioning promoter DNA, and facilitates recycling via CTD dephosphorylation by Fcp1. In bacteria, Nus factors (NusA, NusB, NusE, NusG) associate transiently to promote antitermination, stabilizing elongation complexes and suppressing pauses by binding RNA and modulating polymerase translocation. Holoenzyme assembly involves dynamic binding of these factors to core surfaces, with σ factors typically dissociating after 9–12 nucleotides of RNA synthesis during promoter clearance, leaving an elongation-competent core. Structurally, σ factors interact with core domains such as the β-flap tip, which anchors σ region 4.2 for recognition of the -35 element, while σ region 2 contacts the -10 element and promotes DNA melting.

Transcription Mechanism

Initiation and Promoter Recognition

Initiation of transcription begins with the specific recognition of promoter sequences in DNA by RNA polymerase holoenzymes, which assemble the pre-initiation complex (PIC) to position the enzyme at the transcription start site (TSS). The TSS marks the beginning of RNA synthesis and is determined by promoter sequences; it is distinct from the start codon (AUG), which initiates translation of the mRNA during protein synthesis. In bacteria, the primary sigma factor σ⁷⁰ of the RNA polymerase holoenzyme identifies conserved promoter motifs, including the -35 box (consensus sequence TTGACA) and the -10 box (consensus TATAAT), located upstream of the TSS. These elements are contacted by distinct domains of the σ subunit: region 4 binds the -35 box via helix-turn-helix interactions, while region 2.4 recognizes the -10 box, facilitating initial DNA binding. In eukaryotes, RNA polymerase II (Pol II) relies on general transcription factors within the PIC; the TATA box (consensus TATAAA), about 30 bp upstream of the TSS, is bound by TBP (TATA-binding protein) in TFIID, while the BRE (TFIIB recognition element) upstream of the TATA box and the Inr (initiator) element spanning the TSS (consensus YYANWYY, where Y = pyrimidine, N = any base, W = A or T) are recognized by TFIIB and TFIID subunits, respectively, to orient Pol II precisely at the start site. The binding process proceeds through sequential conformational changes. The holoenzyme first forms a closed complex with double-stranded promoter DNA, in which σ or the general transcription factors make sequence-specific contacts without unwinding. Isomerization then occurs, driven by interactions that bend and distort the DNA, leading to the open complex in which ~14 base pairs of DNA are melted to form a single-stranded transcription bubble around the TSS. This melting exposes the template strand for initial nucleotide pairing, with the bubble stabilized by the polymerase cleft and σ or TFIIB positioning the first nucleotides.
In bacteria, σ⁷⁰ region 3.2 and the β' jaw clamp the DNA, while in eukaryotes, TFIIB's B-reader and B-linker domains insert into the bubble to guide the template strand into the active site. Following open complex formation, abortive initiation ensues, characterized by repeated synthesis and release of short transcripts (typically 2–10 nucleotides) without polymerase progression beyond the promoter. During this phase, the polymerase adds nucleotides using NTP substrates but fails to escape because its contacts with the promoter persist, resulting in NTP consumption without net chain extension or clearance from the promoter. Promoter escape requires extension of the nascent RNA to roughly 9–12 nucleotides, which triggers a conformational shift (e.g., σ⁷⁰ release in bacteria or TFIIB displacement in eukaryotes) and stabilizes the elongating complex. In bacteria, no NTP hydrolysis beyond substrate incorporation is required at this stage, whereas eukaryotic open complex formation depends on the ATP-dependent activity of TFIIH for melting. Fidelity during initiation is maintained primarily through Watson-Crick base pairing at the active site, where the polymerase selects the NTP complementary to the template base in the initial transcribed region. The trigger loop in the polymerase active site closes to enforce geometric constraints that favor correct base pairing, discriminating against mismatches with error rates around 10⁻⁴ to 10⁻⁵ per nucleotide. This selection mechanism ensures accurate TSS specification and initial RNA sequence, with σ or TFIIB further enhancing specificity by positioning the DNA correctly.
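As a rough illustration of consensus-based promoter recognition, the following Python sketch scores candidate bacterial promoters by counting matches to the -35 (TTGACA) and -10 (TATAAT) hexamers quoted above. The function names and the 16–18 bp spacer window between the two hexamers are assumptions made for this example (the spacer length is not given in this article), and simple match counting stands in for the weighted models that real promoter-prediction tools use.

```python
MINUS_35 = "TTGACA"  # -35 box consensus (from the text)
MINUS_10 = "TATAAT"  # -10 box consensus (from the text)


def matches(site, consensus):
    """Count positions at which a candidate site agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def best_promoter(seq, spacers=(16, 17, 18)):
    """Return (score, pos35, pos10, spacer) for the best candidate, or None.

    score is the number of matching bases over both hexamers (maximum 12).
    Positions are 0-based indices into seq. The spacer window is an
    assumption of this sketch, not a value taken from the article.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - len(MINUS_35) + 1):
        score35 = matches(seq[i:i + 6], MINUS_35)
        for spacer in spacers:
            j = i + 6 + spacer  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = score35 + matches(seq[j:j + 6], MINUS_10)
            # Keep the first candidate that reaches the highest score.
            if best is None or score > best[0]:
                best = (score, i, j, spacer)
    return best


if __name__ == "__main__":
    demo = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CCCCCCA"
    print(best_promoter(demo))
```

A perfect match to both hexamers scores 12; natural promoters usually deviate from the consensus at several positions, so a threshold below 12 would be needed in practice.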

Elongation and Processivity

During the elongation phase of transcription, RNA polymerase (RNAP) synthesizes the RNA chain in a highly processive manner following the initial unstable steps of initiation. The core of this process is the nucleotide addition cycle, which consists of three main steps: nucleoside triphosphate (NTP) binding in the post-translocated state, phosphodiester bond formation catalyzed by a two-metal-ion mechanism that releases pyrophosphate (PPi), and translocation of the RNAP along the DNA template to reposition the RNA 3' end for the next cycle. This cycle enables continuous RNA extension without dissociation, with the enzyme maintaining a stable transcription elongation complex (TEC) that incorporates nucleotides at the RNA 3' terminus. Processivity refers to the ability of RNAP to synthesize long RNA transcripts without falling off the DNA template, a property essential for transcribing entire genes or operons. In bacteria such as E. coli, RNAP exhibits exceptionally high processivity, capable of extending RNA chains for tens of kilobases (>10⁴ nucleotides) at rates of 20–80 nucleotides per second. To maintain this efficiency, RNAP employs backtracking as a proofreading mechanism: when a mismatch or damage occurs, the enzyme reverses translocation, extruding the RNA 3' end into the secondary channel, and endonucleolytic cleavage then removes the erroneous segment and restores the correct register. This cleavage activity, intrinsic to the core enzyme and enhanced by factors like GreA/GreB in bacteria or TFIIS in eukaryotes, ensures transcriptional accuracy without aborting the transcript. Fidelity during elongation is achieved through multiple kinetic checkpoints that discriminate against incorrect NTPs. The primary mechanism involves induced fit, where the trigger loop in the active site closes only upon correct base pairing, accelerating catalysis for matched substrates while slowing misincorporation.
Kinetic discrimination further enhances selectivity, with the catalytic efficiency (k_cat/K_m) for correct NTPs exceeding that for incorrect ones by approximately 10⁵- to 10⁶-fold, primarily due to slower binding and trigger-loop closure for mismatches. These mechanisms collectively yield an overall error rate of about 10⁻⁵ to 10⁻⁶ per nucleotide, balancing speed and accuracy. The stability of the TEC is critically maintained by an 8–9 base pair (bp) RNA-DNA hybrid within the enzyme's cleft, which anchors the complex and prevents slippage or dissociation. This hybrid length is enforced by structural elements such as the lid and rudder in the β' subunit (bacteria) or the equivalent regions in eukaryotic RNAP II, which separate the RNA from the template DNA at the upstream edge and prevent the hybrid from extending beyond 9 bp. Disruptions to this hybrid, such as through backtracking, trigger cleavage to realign the RNA 3' end with the active site, underscoring its role in processive elongation. Elongation speeds vary across organisms, reflecting regulatory needs. Bacterial RNAP typically proceeds at 20–80 nucleotides per second, allowing rapid gene expression in fast-growing cells. In eukaryotes, RNAP II elongates at approximately 1–4 kb per minute (∼17–67 nucleotides per second), a range that overlaps the bacterial one, and it interacts frequently with regulatory factors that integrate chromatin context and co-transcriptional processing. Regulatory pausing interrupts elongation to allow coordination with cellular processes, such as mRNA processing or stress responses. In some bacteria, such as Bacillus subtilis, NusG promotes pausing at specific motifs (e.g., TTNTTT) roughly once every 3 kb, stabilizing the TEC and facilitating antitermination. In eukaryotes, the homologous Spt5 (a subunit of DSIF) induces pauses, often promoter-proximal, by interacting with NELF to maintain a low-processivity state until released by P-TEFb phosphorylation. These pauses are resolved to resume efficient elongation, highlighting NusG/Spt5 as conserved modulators of the transcription landscape.
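The rate and fidelity figures in this section can be related by simple arithmetic. The Python sketch below converts kilobases per minute into nucleotides per second, estimates how long a transcript of a given length takes to synthesize, and computes the expected number of misincorporations. The helper names are hypothetical, and the calculation assumes a constant elongation rate with no pausing, which real transcription does not satisfy.

```python
def kb_per_min_to_nt_per_s(kb_per_min):
    """Convert an elongation rate from kb/min to nucleotides/second."""
    return kb_per_min * 1000 / 60


def transcription_time_s(length_nt, rate_nt_per_s):
    """Seconds needed to synthesize length_nt at a constant rate (no pausing)."""
    return length_nt / rate_nt_per_s


def expected_errors(length_nt, error_rate_per_nt):
    """Expected number of misincorporated nucleotides in one transcript."""
    return length_nt * error_rate_per_nt


if __name__ == "__main__":
    # Eukaryotic Pol II range quoted in the text: 1-4 kb per minute.
    for rate in (1, 4):
        print(rate, "kb/min =", round(kb_per_min_to_nt_per_s(rate)), "nt/s")
    # A 10 kb bacterial operon at 50 nt/s, within the 20-80 nt/s range above.
    print(transcription_time_s(10_000, 50), "s")
    # Expected errors in that transcript at an error rate of 1e-5 per nucleotide.
    print(expected_errors(10_000, 1e-5))
```

Under the same constant-rate assumption, a 2.4-million-nucleotide transcript (the upper length mentioned in the introduction) would take about 10 hours at 4 kb per minute and about 40 hours at 1 kb per minute.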

Termination and Release

In bacterial RNA polymerase, termination occurs through two primary mechanisms: Rho-dependent and intrinsic. Rho-dependent termination involves the Rho helicase, an RNA-dependent ATPase, which binds to C-rich rut sites on the nascent RNA and translocates along it using ATP hydrolysis to unwind the RNA-DNA hybrid helix, leading to polymerase dissociation. This process proceeds via three routes: RNA shearing for rapid recycling, RNAP hyper-translocation for complex decomposition, or a stand-by mode in which Rho pre-binds the polymerase. In contrast, intrinsic termination relies on terminator sequences featuring a GC-rich RNA hairpin followed by a U-tract; the hairpin forms in the RNA exit channel, inducing pausing similar to that observed during elongation, while the U-tract weakens the RNA-DNA hybrid, promoting transcript release without additional factors. Eukaryotic RNA polymerase II (Pol II) termination integrates mRNA processing signals, primarily using the torpedo or allosteric pathways triggered by the polyadenylation signal (PAS). In the torpedo model, cleavage and polyadenylation specificity factor (CPSF) recognizes the PAS (e.g., AAUAAA) and its endonuclease subunit CPSF73 cleaves the pre-mRNA 10–30 nucleotides downstream, exposing a 5' RNA end for degradation by the 5'-3' exonuclease Rat1 (XRN2 in humans), which "torpedoes" the paused Pol II by degrading the RNA and disrupting the hybrid. The allosteric pathway complements this by slowing Pol II via PAS-induced conformational changes, dephosphorylation of the elongation factor Spt5, and CTD modifications, facilitating Rat1 recruitment and termination. Archaeal RNA polymerase termination mechanisms resemble those of eukaryotes and feature backtracking, in which the RNA 3' end is displaced into the secondary channel, halting elongation until cleavage restores activity.
TFIIS-like factors, such as TFS1, enhance this by inserting acidic residues (e.g., Asp-Glu) into the active site to stimulate transcript cleavage, promoting fidelity and processivity akin to eukaryotic TFIIS. A paralogue, TFS4, lacks catalytic activity but inhibits RNAP by competing with NTP binding, potentially fine-tuning termination. Following termination, RNA release involves disruption of the RNA-DNA hybrid, often powered by NTP hydrolysis; in bacteria, Rho's ATP-driven helicase activity pulls RNA from the hybrid, enabling dissociation, while in eukaryotes, Rat1 degradation achieves a similar collapse of the hybrid. The released RNAP then diffuses away and is reused; Rho-dependent termination, for instance, recycles complexes stalled at DNA lesions to support repair. This efficiency ensures precise transcript lengths; defects, such as Rho inactivation, cause read-through transcription, leading to aberrant mRNA and genomic instability. Structurally, hairpin formation in the exit channel triggers pausing by inducing conformational diversity in the trigger loop, destabilizing the elongation complex without direct hairpin-polymerase contact, as seen in bacterial intrinsic terminators where GC-rich stems extend to melt rU-dA pairs. This mechanism underscores termination's role in polymerase recycling across domains.
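The sequence signature of a bacterial intrinsic terminator (a GC-rich hairpin followed by a U-tract) lends itself to a simple pattern search. The Python sketch below is a minimal heuristic, not an established terminator-prediction algorithm: it requires a perfectly complementary stem and uses illustrative thresholds (6 bp stem, 3–8 nt loop, at least 6 U in the 8 nt after the stem, stem GC fraction of at least 0.6), all of which are assumptions chosen for this example.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna):
    """Reverse complement of an RNA string (A-U and G-C pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def gc_fraction(s):
    """Fraction of G and C residues in a non-empty string."""
    return (s.count("G") + s.count("C")) / len(s)


def find_terminators(rna, stem_len=6, loop_range=(3, 8),
                     tract_len=8, min_u=6, min_gc=0.6):
    """Return (stem_start, loop_len, tract_start) for each candidate.

    A candidate is a GC-rich stem, a short loop, the exact reverse
    complement of the stem, and then a U-rich tract. All thresholds are
    illustrative assumptions, not values taken from the article.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(rna) - stem_len + 1):
        left = rna[i:i + stem_len]
        if gc_fraction(left) < min_gc:
            continue
        for loop in range(loop_range[0], loop_range[1] + 1):
            r0 = i + stem_len + loop  # start of the 3' arm of the stem
            right = rna[r0:r0 + stem_len]
            tract = rna[r0 + stem_len:r0 + stem_len + tract_len]
            if len(right) < stem_len or len(tract) < tract_len:
                continue
            if right == revcomp(left) and tract.count("U") >= min_u:
                hits.append((i, loop, r0 + stem_len))
    return hits


if __name__ == "__main__":
    demo = "CCC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "A"
    print(find_terminators(demo))
```

Natural terminators often contain mismatches or G-U wobble pairs in the stem, so a practical tool would score hairpin stability thermodynamically rather than demand exact complementarity.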

Variations in Different Organisms

Bacterial RNA Polymerase

Bacterial RNA polymerase (RNAP) is a multisubunit enzyme essential for gene expression in prokaryotes, with the Escherichia coli enzyme serving as the primary model due to its well-characterized simplicity relative to eukaryotic counterparts, which require multiple polymerases and complex initiation factors. The core enzyme consists of five subunits (two α subunits, one β subunit, one β' subunit, and one small ω subunit) with a molecular mass of approximately 400 kDa, forming a crab-claw-like structure that clamps DNA for processivity during elongation; the β' subunit's clamp domain is particularly critical for maintaining the grip on DNA and enhancing transcriptional efficiency. The catalytically active core associates with a σ factor to form the holoenzyme, approximately 450 kDa in mass, which enables specific promoter recognition; in E. coli, the housekeeping σ70 factor is most common, but alternative σ factors allow adaptation to environmental cues. Promoter diversity in bacteria is orchestrated by multiple σ factors, which compete for core binding to redirect RNAP to distinct promoter classes; housekeeping promoters, recognized by σ70, drive constitutive expression of essential genes via conserved -35 (TTGACA) and -10 (TATAAT) elements, while stress promoters use alternative σ factors like σS (RpoS) for general stress or σ32 (RpoH) for heat shock, often featuring extended or variant motifs such as a cytosine at -13 for σS. σ factor activity is further modulated by anti-σ factors, which sequester specific σ subunits under non-stress conditions; for example, RseA binds σE (RpoE) to prevent envelope stress responses until it is proteolytically degraded during periplasmic stress, ensuring precise temporal control of transcription. Unique prokaryotic adaptations include the stringent response, in which the alarmone ppGpp binds at the β'-ω interface of E. coli RNAP, allosterically restraining the enzyme's cleft to inhibit rRNA promoter open complex formation and favor amino acid biosynthesis genes during nutrient limitation, as revealed by a 4.5 Å crystal structure.
Transcription-translation coupling is another hallmark, with ribosomes binding nascent mRNA co-transcriptionally in the cytoplasm and forming RNAP-ribosome contacts, in part via factors like NusG, to prevent premature termination, synchronize elongation rates, and regulate attenuation in operons such as trp. Rifampicin, a key inhibitor, binds a hydrophobic pocket in the β subunit, sterically blocking the path of the elongating RNA beyond 2–3 nucleotides and thereby halting transcript extension without preventing promoter binding or formation of the first phosphodiester bonds. Recent cryo-EM studies have illuminated the conformational dynamics of bacterial RNAP complexes; for instance, structures of the E. coli σ70 holoenzyme at rRNA promoters (resolved at 3.5–4.1 Å) show σ finger displacement and DNA bubble stabilization during open complex formation, while ppGpp/DksA binding induces cleft narrowing to suppress rRNA synthesis under stress. These findings underscore E. coli RNAP's role as a tractable model for the evolutionary conservation of core architecture across domains, with prokaryotic simplicity permitting coupled processes that are absent in compartmentalized eukaryotes.

Eukaryotic RNA Polymerases

Eukaryotes possess three distinct nuclear RNA polymerases, each specialized for transcribing specific classes of RNA within the nucleus, in contrast to the single multifunctional bacterial RNA polymerase that handles all transcription needs. RNA polymerase I (Pol I) is a 14-subunit complex with a molecular mass of approximately 590 kDa, localized to the nucleolus, where it accounts for up to 60% of total cellular transcription by synthesizing the 45S pre-rRNA precursor that is processed into the mature 18S, 5.8S, and 28S ribosomal RNAs essential for ribosome biogenesis. Initiation of Pol I transcription requires the upstream binding factor (UBF) to bend and stabilize the promoter DNA, along with selectivity factor 1 (SL1), a complex containing TATA-binding protein (TBP) and TAFs that recruits Pol I and the initiation factor Rrn3 to the ribosomal DNA (rDNA) promoter. RNA polymerase II (Pol II), comprising 12 subunits and weighing about 500–600 kDa, resides in the nucleoplasm and is responsible for transcribing all protein-coding messenger RNAs (mRNAs) as well as many non-coding RNAs, including long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and some small nuclear RNAs (snRNAs). A hallmark of Pol II is the C-terminal domain (CTD) of its largest subunit (RPB1), consisting of 25–52 tandem heptapeptide repeats with the consensus sequence Y₁S₂P₃T₄S₅P₆S₇, which serves as a regulatory platform through dynamic phosphorylation. Phosphorylation of Ser5 in the CTD repeats, primarily by the kinase CDK7 (part of TFIIH), predominates during initiation and promoter clearance and recruits capping enzymes; in contrast, Ser2 phosphorylation by CDK9 (in P-TEFb) accumulates during elongation to facilitate productive RNA synthesis, recruitment of factors for splicing and polyadenylation, and chromatin modifications. RNA polymerase III (Pol III), the largest nuclear polymerase at ~700 kDa with 17 subunits, operates in the nucleoplasm to transcribe short, untranslated RNAs critical for cellular functions, such as transfer RNAs (tRNAs), 5S rRNA, and U6 small nuclear RNA (snRNA).
Pol III promoters are classified into types 1–3, with type 3 promoters (e.g., for U6 snRNA) featuring a proximal sequence element (PSE) at positions -65 to -48, recognized by the SNAPc complex, and a TATA box at -32 to -25 that recruits TBP within the TFIIIB initiation factor to position Pol III accurately. Beyond the nuclear polymerases, eukaryotic organelles harbor specialized enzymes: the mitochondrial RNA polymerase is a single-subunit enzyme (~140 kDa) resembling bacteriophage T7 RNA polymerase in structure and mechanism, transcribing the compact mitochondrial genome to produce rRNAs, tRNAs, and mRNAs with assistance from transcription factors like TFAM and TFB2M. In contrast, the chloroplast RNA polymerase is a multi-subunit complex (~400 kDa) of bacterial origin, encoded partly by the plastid genome (e.g., the rpoA and rpoB genes, whose products are homologous to bacterial α and β), that transcribes genes for photosynthesis-related proteins, rRNAs, and tRNAs, augmented by nuclear-encoded accessory factors. Eukaryotic nuclear transcription is highly compartmentalized, with Pol I confined to nucleoli for rRNA production and ribosome assembly, while Pol II and Pol III activities occur in nucleoplasmic "transcription factories", clusters of polymerases that are thought to enhance efficiency through chromatin looping. This spatial organization facilitates co-transcriptional RNA processing, particularly for Pol II transcripts: capping occurs shortly after initiation (once ~20–30 nucleotides have been synthesized), splicing during elongation, and polyadenylation near termination, all tethered to the phosphorylated CTD to ensure that mRNA maturation is concurrent with synthesis.
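The repeat structure of the Pol II CTD described in this section can be made concrete with a short Python sketch that locates consensus heptads (YSPTSPS) in a sequence and reports the positions of the Ser2 and Ser5 residues targeted by CDK9 and CDK7. The function names are hypothetical, and the 52-repeat test sequence is an idealized construct: real CTDs, including the human one, contain many repeats that deviate from the consensus, which this exact-match search would skip.

```python
import re

HEPTAD = "YSPTSPS"  # consensus Y1-S2-P3-T4-S5-P6-S7 (from the text)


def consensus_repeats(seq):
    """Return 0-based start indices of non-overlapping consensus heptads."""
    return [m.start() for m in re.finditer(HEPTAD, seq.upper())]


def phospho_sites(seq):
    """Map 'Ser2' and 'Ser5' to 1-based residue positions within seq.

    Only exact consensus heptads are considered; degenerate repeats are
    ignored in this simplified sketch.
    """
    starts = consensus_repeats(seq)
    return {
        "Ser2": [s + 2 for s in starts],  # second residue of each heptad
        "Ser5": [s + 5 for s in starts],  # fifth residue of each heptad
    }


if __name__ == "__main__":
    idealized_ctd = HEPTAD * 52  # upper end of the 25-52 range in the text
    sites = phospho_sites(idealized_ctd)
    print(len(consensus_repeats(idealized_ctd)), "consensus repeats")
    print("first Ser2 / Ser5 positions:", sites["Ser2"][:2], sites["Ser5"][:2])
```

Positions are reported relative to the start of the supplied sequence, so applying the function to a full RPB1 sequence would give coordinates in that protein rather than within the CTD alone.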

Archaeal and Viral RNA Polymerases

Archaeal RNA polymerases are multi-subunit enzymes typically comprising 11 to 13 subunits and a molecular mass of approximately 370 kDa, and of all cellular RNA polymerases they show the closest structural homology to eukaryotic RNA polymerase II. Their catalytic core is built on the double-psi β-barrel fold common to multi-subunit RNA polymerases, while their additional subunits, shared with eukaryotic counterparts, distinguish them from the simpler bacterial enzymes. Transcription initiation relies on two key factors: the TATA-binding protein (TBP), which recognizes promoter elements, and transcription factor B (TFB), a homolog of eukaryotic TFIIB, which recruits the polymerase to form the pre-initiation complex. In extremophilic archaea, such as hyperthermophiles in the genus Thermococcus, these polymerases incorporate adaptations like enhanced ionic bonds and hydrophobic cores to maintain stability at temperatures exceeding 80°C. Viral DNA-dependent RNA polymerases display significant diversity in organization. T7-like single-subunit polymerases, found in bacteriophages such as T7, are highly efficient, achieving transcription rates 5 to 10 times faster than bacterial polymerases while maintaining strong processivity on promoter-specific templates. Conversely, poxviruses like vaccinia virus encode complex multi-subunit DNA-dependent RNA polymerases that operate independently in the host cytoplasm, comprising at least 10 subunits including homologs of cellular core elements. RNA-dependent RNA polymerases (RdRps) power replication in RNA viruses, directly synthesizing complementary strands from RNA templates without involving DNA. In viruses such as influenza and SARS-CoV-2, RdRps generate both positive-sense and negative-sense RNAs; most RdRps lack proofreading, giving error rates of approximately 10⁻⁴ to 10⁻⁵ mutations per nucleotide that drive rapid viral evolution and diversity, although coronaviruses partly compensate with a separate proofreading exonuclease (nsp14). Reverse transcriptases in retroviruses like HIV form a further class, catalyzing the conversion of single-stranded viral RNA into double-stranded DNA by combining polymerase activity with RNase H-mediated degradation of the RNA template.
Evolutionarily, archaeal RNA polymerases represent an intermediate form, bridging the simpler bacterial core (with four distinct subunit types) and the more elaborate eukaryotic systems through subunit compositions and initiation factors shared with Pol II. Viral polymerases are frequently thought to derive from host origins via horizontal gene transfer, adapting cellular machinery for intracellular replication while evading immune detection. A hallmark of RdRps across RNA viruses is the conserved palm domain, which houses catalytic motifs (within the conserved series A to G) essential for nucleotide addition and fidelity. Distinctive strategies include cap-snatching in influenza virus, where the viral polymerase's endonuclease cleaves 5' caps from host mRNAs to prime synthesis of viral transcripts, ensuring their efficient translation by the host machinery.

Regulation and Inhibitors

Transcriptional Regulation Mechanisms

In eukaryotes, transcriptional regulation of RNA polymerase II (Pol II) often involves promoter-proximal pausing, where Pol II initiates transcription but pauses shortly downstream of the promoter, allowing rapid response to developmental or environmental signals. This pausing is mediated by the negative elongation factor (NELF) and DRB sensitivity-inducing factor (DSIF), which stabilize Pol II at the promoter-proximal region. Release from pausing is primarily controlled by the positive transcription elongation factor b (P-TEFb), a complex consisting of cyclin T and CDK9, which phosphorylates the C-terminal domain (CTD) of Pol II at serine 2, as well as NELF and DSIF, promoting productive elongation. This mechanism operates at approximately 60% of mammalian genes, particularly those involved in stress responses and cell differentiation. Enhancers and silencers further modulate Pol II recruitment through multi-protein complexes that bridge distant regulatory elements to promoters. The Mediator complex acts as a co-activator, integrating signals from enhancers by interacting with transcription factors and recruiting Pol II to core promoters via its head, middle, and tail modules. Similarly, chromatin remodeling complexes like SWI/SNF facilitate Pol II access by altering nucleosome positioning at enhancers and promoters, often in response to activator binding; for instance, SWI/SNF is recruited by activation domains to displace nucleosomes and stabilize the pre-initiation complex. Silencers, conversely, recruit repressive factors that compact chromatin, limiting Pol II engagement. Epigenetic modifications on histones provide a heritable layer of regulation influencing Pol II activity and accessibility. Histone acetylation, such as H3K27ac at active enhancers, loosens chromatin structure to enhance Pol II recruitment, while marks like H3K4me3 at promoters correlate with pause release and elongation by stabilizing the pre-initiation complex.
The Pol II CTD is itself linked to these marks through phosphorylation states that recruit histone-modifying enzymes; for example, the CTD phosphorylated on serine 2 and serine 5 recruits the methyltransferase Set2, which deposits histone H3 lysine 36 trimethylation (H3K36me3) during elongation and thereby suppresses cryptic transcription. Repressive marks, such as H3K9me3 and H3K27me3, inhibit Pol II progression by promoting compact chromatin states. In bacteria, transcriptional regulation centers on operon control, where repressors and activators bind near promoters to modulate access of the RNA polymerase holoenzyme. Repressors, like the LacI protein in the lac operon, bind operator sequences that overlap or adjoin the promoter to sterically hinder the polymerase, preventing initiation; this repression is relieved by inducer molecules like allolactose. Activators, such as the cAMP receptor protein (CRP, also called catabolite activator protein), bind upstream sites and directly contact the alpha subunit of RNA polymerase to enhance promoter recognition and open complex formation, as seen in glucose-repressed genes. This binary control allows coordinated expression of gene clusters in response to nutrients or stressors. Feedback loops, including autoregulation, fine-tune RNA polymerase activity through self-regulatory circuits. Transcription factors often autoregulate their own expression via negative feedback, where high levels of the factor repress its promoter to maintain homeostasis; for example, the Gcn4 transcription factor forms such a loop by modulating its own synthesis. Non-coding RNAs contribute to these loops by acting as molecular decoys or guides; long non-coding RNAs (lncRNAs) like Xist in mammals recruit repressive complexes to silence Pol II transcription in cis, while others, such as promoter-associated ncRNAs, interfere with Pol II recruitment to prevent aberrant activation. Organelle-specific regulation occurs in mitochondria, where transcription factor A (TFAM) orchestrates mitochondrial RNA polymerase (POLRMT) activity.
TFAM, an HMG-box protein, bends mitochondrial DNA (mtDNA) and packages it into nucleoids, facilitating POLRMT recruitment to non-consensus promoters and stabilizing the initiation complex with transcription factor B2 (TFB2M). This regulation helps mitochondrial gene expression match cellular energy demands, with TFAM levels correlating with transcription rates.

Inhibitors and Therapeutic Applications

RNA polymerase inhibitors encompass a diverse class of compounds that target the enzyme's catalytic or regulatory functions, offering therapeutic potential in treating bacterial, viral, and eukaryotic proliferative diseases. Many of these agents act during early or ongoing elongation, a common vulnerability across polymerases, by interfering with nucleotide addition or translocation. In bacterial systems, rifamycins such as rifampicin bind to a hydrophobic pocket in the β subunit of RNA polymerase, sterically blocking the elongating RNA chain once it is 2–3 nucleotides long and thereby halting transcription at the earliest stage of elongation. Another bacterial inhibitor, streptolydigin, binds near the active site to prevent trigger loop folding, inhibiting elongation by stabilizing a non-productive conformation of the enzyme. These mechanisms exploit structural differences between bacterial and host polymerases, enabling selective antibacterial activity. For eukaryotic RNA polymerases, α-amanitin, a cyclic peptide toxin derived from Amanita mushrooms, binds tightly to RNA polymerase II near the funnel and bridge helix, blocking translocation of the RNA-DNA hybrid and causing acute liver poisoning upon ingestion. Additionally, inhibitors of cyclin-dependent kinase 9 (CDK9), such as flavopiridol, disrupt phosphorylation of the RNA polymerase II C-terminal domain, impairing release from promoter-proximal pausing and productive elongation, which has been leveraged in anticancer strategies. Viral RNA-dependent RNA polymerases (RdRps) are targeted by nucleotide analogs like remdesivir, which is incorporated into the growing chain by SARS-CoV-2 RdRp; it causes delayed chain termination, allowing three additional nucleotides to be added before synthesis stalls. Inhibitors of RNA polymerase generally operate via competitive mechanisms, such as NTP analogs that mimic substrates but impair chain extension; allosteric modulation, where binding to distant sites alters conformation; or direct translocation blockade, preventing movement of the nucleic acid scaffold.
Therapeutically, rifampin (rifampicin) serves as a cornerstone of tuberculosis treatment, typically administered at 10 mg/kg daily in combination regimens to eradicate Mycobacterium tuberculosis by suppressing bacterial transcription. For cancer, CX-5461 selectively inhibits RNA polymerase I transcription to disrupt rRNA synthesis, activating p53-independent DNA damage responses and showing efficacy in preclinical models of B-lymphoma and high-grade serous ovarian cancer. During the COVID-19 pandemic, remdesivir was used as an antiviral in hospitalized patients, reducing viral replication through RdRp inhibition. Resistance to these inhibitors often arises from mutations in polymerase binding sites, such as substitutions in the rpoB gene (e.g., Ser531Leu) that reduce rifampin affinity in M. tuberculosis, complicating treatment and necessitating combination therapies.

Historical Development and Techniques

Discovery and Key Milestones

The initial observations of RNA synthesis in cell-free systems emerged in the mid-1950s, with the discovery of polynucleotide phosphorylase by Marianne Grunberg-Manago and , which enabled primer-independent polymerization of ribonucleotides but lacked DNA dependency. A major breakthrough came in 1959 when Samuel B. Weiss reported the first DNA-dependent RNA synthesis using rat liver extracts, demonstrating that DNA serves as a template for ribonucleotide incorporation and establishing the enzymatic basis of transcription. Independently, Jerard Hurwitz and Audrey Stevens confirmed this activity in extracts in 1960, further solidifying DNA's templating role in bacterial systems. Purification efforts accelerated in the early 1960s, culminating in 1962 with the isolation of the bacterial RNA polymerase from E. coli by Michael Chamberlin and Paul Berg, which allowed characterization of its DNA-dependent activity and processivity. For eukaryotes, Robert G. Roeder achieved a landmark purification in 1969, identifying three distinct nuclear RNA polymerases (I, II, and III) from sea urchin embryos and mammalian cells, revealing the multiplicity of transcription machinery in higher organisms. Key figures like Samuel Weiss advanced in vitro transcription assays that underpinned these isolations, while Reiji Okazaki's work in the 1960s on discontinuous nucleic acid synthesis provided conceptual insights into replication-transcription linkages, though primarily focused on DNA. The 1970s and 1980s saw subunit composition elucidated through biochemical , with the bacterial resolved into core (α₂ββ'ω) and accessory components. A critical advance was the discovery of the sigma (σ) factor by Robert R. Burgess and Andrew A. Travers, which dissociates after initiation to enable promoter-specific transcription in , as demonstrated by its cyclic reuse and selectivity for T4 phage DNA templates. 
Structural studies progressed in the 1990s with the first crystal structure of a bacterial RNA polymerase σ70 subunit fragment in 1996 by Seth A. Darst and colleagues, revealing DNA-binding domains essential for promoter recognition. This was followed in 2001 by Patrick Cramer's high-resolution structure of yeast RNA polymerase II at 2.8 Å, depicting the multi-subunit architecture and the cleft for nucleotide addition. The 2010s ushered in the cryo-electron microscopy (cryo-EM) era, enabling visualization of dynamic complexes; for instance, in 2017, structures of pre-initiation complexes with transcription factors illuminated promoter opening and early elongation. More recently, the 2020 COVID-19 pandemic spurred rapid cryo-EM determinations of SARS-CoV-2 RNA-dependent RNA polymerase (RdRp) structures, such as the nsp12-nsp7-nsp8 complex at 2.9 Å, facilitating work on inhibitors such as remdesivir. In 2025, cryo-EM structures of transcribing RNAPII complexes isolated directly from human nuclei revealed native conformations and associated factors in cellular contexts.

Purification and Structural Determination Methods

The purification of RNA polymerase has historically relied on multi-step biochemical fractionation to isolate the enzyme from cellular extracts, with early methods focusing on bacterial systems. Classical protocols for the bacterial enzyme involved cell lysis followed by ammonium sulfate precipitation to fractionate proteins between 33% and 50% saturation, which enriched the enzyme approximately 5- to 10-fold, and subsequent ion-exchange chromatography on DEAE-cellulose columns, with elution at around 0.23 M KCl, achieving a 140- to 170-fold purification with 41% recovery from crude extracts. These steps yielded highly active core preparations with specific activities reaching 6,000 units/mg, though overall yields were low at approximately 1 mg per liter of culture because of the enzyme's low abundance (about 1,000 molecules per cell) and the absence of affinity-based techniques. Later refinements incorporated additional steps such as phosphocellulose chromatography and polymin P precipitation, enabling large-scale isolation of up to 250 mg of holoenzyme from 500 g of cells while maintaining high purity and 45% recovery.

Purifying eukaryotic RNA polymerases presents greater challenges because of their nuclear localization, larger multisubunit composition, and association with chromatin, necessitating initial nuclear extraction under high-salt conditions to solubilize the enzymes from isolated nuclei. For RNA polymerase II (Pol II), seminal 1970s methods from rat liver involved hypotonic lysis of nuclei, ammonium sulfate fractionation (40-60% saturation), and DEAE-Sephadex chromatography, separating Pol II (form B, sensitive to α-amanitin) from Pol I and Pol III with 20- to 40-fold purification and recoveries of 10-20%.
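The fold-purification and recovery figures quoted for these protocols follow from simple bookkeeping on each fraction: specific activity is total activity divided by total protein, fold purification is the ratio of a fraction's specific activity to that of the crude extract, and recovery is the share of the starting activity that remains. The sketch below illustrates that arithmetic in Python. The fraction names and all protein and activity values are hypothetical, chosen only so that the final step lands near the 6,000 units/mg, roughly 150-fold, and 41% figures mentioned in the text; they are not data from any published preparation.

```python
from dataclasses import dataclass


@dataclass
class Fraction:
    """One step of a purification scheme (all values are hypothetical)."""
    name: str
    total_protein_mg: float
    total_activity_units: float

    @property
    def specific_activity(self) -> float:
        # Units of enzyme activity per mg of total protein.
        return self.total_activity_units / self.total_protein_mg


def purification_table(fractions):
    """Return (name, specific activity, fold purification, % recovery) per step.

    The first fraction is treated as the crude extract, i.e. the reference.
    """
    crude = fractions[0]
    rows = []
    for f in fractions:
        fold = f.specific_activity / crude.specific_activity
        recovery = 100.0 * f.total_activity_units / crude.total_activity_units
        rows.append((f.name, f.specific_activity, fold, recovery))
    return rows


# Illustrative numbers only (assumed, not measured).
fractions = [
    Fraction("crude extract", 6000.0, 240000.0),
    Fraction("ammonium sulfate cut", 600.0, 192000.0),
    Fraction("DEAE-cellulose eluate", 16.4, 98400.0),
]

table = purification_table(fractions)
for name, spec, fold, recovery in table:
    print(f"{name:24s} {spec:8.0f} units/mg  {fold:6.1f}-fold  {recovery:5.1f}% recovery")
```

Because fold purification and recovery are both computed relative to the crude extract, a step can raise purity sharply while still losing much of the starting activity, which is why the two numbers are normally reported together.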
Modern recombinant approaches address these issues by overexpressing epitope-tagged versions, such as FLAG- or His-tagged Pol II, in suitable host cells, followed by affinity purification exploiting the C-terminal domain (CTD) of the largest subunit (RPB1); anti-CTD monoclonal antibodies such as 8WG16 enable one-step isolation of near-homogeneous holoenzyme from nuclear extracts with >90% purity and yields 10- to 100-fold higher than native methods. Overexpression in yeast using galactose-inducible promoters for tagged RPB1 subunits, combined with tandem affinity purification (TAP) tags on the CTD, further enhances scalability, yielding milligram quantities for structural studies while preserving post-translational modifications.

Advances in recombinant expression have dramatically improved yields across organisms. For bacterial RNA polymerase, plasmid-based overexpression in E. coli using pET vectors with N-terminal His6-tags on the β' subunit allows Ni-NTA affinity chromatography, followed by heparin-Sepharose and size-exclusion steps, achieving 10-20 mg per liter of culture with >95% purity in 2-3 days. Similarly, expression systems for eukaryotic polymerases incorporate integrated tags for tandem purification, boosting yields to 0.5-1 mg/L while minimizing contaminants.

Structural determination of RNA polymerase has evolved from domain-specific techniques to high-resolution imaging of intact complexes. X-ray crystallography first revealed the core architecture of bacterial RNA polymerase at 2.6 Å resolution in 2002, using enzyme crystallized in the presence of rifampicin, but faced limitations for larger eukaryotic forms because of their flexibility and size (>500 kDa). For eukaryotic Pol II, a 2.8 Å crystal structure of the 10-subunit yeast enzyme bound to α-amanitin in 2001 highlighted the clamp domain and CTD scaffold, though crystallization of the full holoenzyme remained challenging owing to dynamic conformations.
Nuclear magnetic resonance (NMR) spectroscopy complements these by resolving individual domains, such as the σ70 region 4 of bacterial holoenzyme or the Pol II CTD heptad repeats, providing atomic-level insights into flexible linker regions at resolutions up to 2 Å but limited to fragments under 50 kDa.

Cryo-electron microscopy (cryo-EM) single-particle analysis has revolutionized structural studies since the 2010s, enabling visualization of dynamic, near-native complexes at 3-4 Å resolution without crystallization artifacts. For bacterial systems, cryo-EM resolved transcribing elongation complexes with accessory factors like NusG at 3.2 Å in 2014, capturing nucleotide addition cycles. In eukaryotes, advances yielded a 3.0 Å structure of the yeast Pol II elongation complex with Elongin and SPT6 in 2023 (building on 2022 precursor maps), revealing ubiquitin-mediated pausing mechanisms and interactions in states previously intractable by crystallography. These resolutions allow de novo model building of side chains and transient conformations, with local refinements reaching 2.5 Å for the catalytic site.

Additional biophysical methods probe interactions and dynamics beyond static structures. Cross-linking mass spectrometry (XL-MS) maps protein-protein interfaces in Pol II complexes, as in 2015 studies identifying >200 contacts in elongation assemblies at near-atomic precision when integrated with cryo-EM. Förster resonance energy transfer (FRET) spectroscopy, often in single-molecule variants, monitors conformational changes, such as opening of the bacterial RNAP clamp (shifts of 10-20 Å) or eukaryotic Pol II pausing, providing kinetic data on timescales from milliseconds to seconds.
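Distance changes of the kind reported by FRET are read out through the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius at which transfer efficiency is 50%. The following Python sketch shows how a 15 Å change in separation would alter the measured efficiency. The Förster radius of 60 Å and the two distances are assumptions picked for illustration; they are not values taken from any particular RNA polymerase study, and real experiments also require corrections that are omitted here.

```python
def fret_efficiency(distance_a: float, forster_radius_a: float) -> float:
    """Transfer efficiency from the Förster relation E = 1 / (1 + (r/R0)^6)."""
    return 1.0 / (1.0 + (distance_a / forster_radius_a) ** 6)


def distance_from_efficiency(efficiency: float, forster_radius_a: float) -> float:
    """Invert the Förster relation: r = R0 * (1/E - 1)^(1/6), for 0 < E < 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_a * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Assumed values for illustration only.
R0 = 60.0                      # hypothetical Förster radius of the dye pair, in Å
closed, opened = 50.0, 65.0    # hypothetical donor-acceptor distances, in Å

e_closed = fret_efficiency(closed, R0)
e_opened = fret_efficiency(opened, R0)
print(f"closed state: r = {closed:.0f} Å, E = {e_closed:.2f}")
print(f"open state:   r = {opened:.0f} Å, E = {e_opened:.2f}")
```

The sixth-power dependence makes the efficiency most sensitive to distances near R0, which is why dye pairs are usually chosen so that the expected separations bracket the Förster radius.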
