Two-hybrid screening
from Wikipedia

Overview of two-hybrid assay, checking for interactions between two proteins, called here Bait and Prey.
A. The Gal4 transcription factor gene produces a two-domain protein (BD and AD) essential for transcription of the reporter gene (LacZ).
B,C. Two fusion proteins are prepared: Gal4BD+Bait and Gal4AD+Prey. Neither is usually sufficient on its own to initiate transcription of the reporter gene.
D. When both fusion proteins are produced and the Bait part of the first fusion protein interacts with the Prey part of the second, transcription of the reporter gene occurs.

Two-hybrid screening (originally known as yeast two-hybrid system or Y2H) is a molecular biology technique used to discover protein–protein interactions (PPIs)[1] and protein–DNA interactions[2][3] by testing for physical interactions (such as binding) between two proteins or a single protein and a DNA molecule, respectively.

The premise behind the test is the activation of downstream reporter gene(s) by the binding of a transcription factor onto an upstream activating sequence (UAS). For two-hybrid screening, the transcription factor is split into two separate fragments, called the DNA-binding domain (DBD or often also abbreviated as BD) and activating domain (AD). The BD is the domain responsible for binding to the UAS and the AD is the domain responsible for the activation of transcription.[1][2] The Y2H is thus a protein-fragment complementation assay.

History

Pioneered by Stanley Fields and Ok-Kyu Song in 1989, the technique was originally designed to detect protein–protein interactions using the Gal4 transcriptional activator of the yeast Saccharomyces cerevisiae. The Gal4 protein activated transcription of a gene involved in galactose utilization, which formed the basis of selection.[4] Since then, the same principle has been adapted to develop many alternative methods, including some that detect protein–DNA interactions or DNA–DNA interactions, as well as methods that use different host organisms such as Escherichia coli or mammalian cells instead of yeast.[3][5]

Basic premise

The key to the two-hybrid screen is that in most eukaryotic transcription factors, the activating and binding domains are modular and can function in proximity to each other without direct binding.[6] This means that even though the transcription factor is split into two fragments, it can still activate transcription when the two fragments are indirectly connected.

The most common screening approach is the yeast two-hybrid assay. In its matrix format, the researcher knows where each prey is located on the medium (agar plates). Millions of potential interactions in several organisms have been screened in the past decade using high-throughput screening systems (often using robots), and thousands of interactions have been detected and catalogued in databases such as BioGRID.[7][8] This system often utilizes a genetically engineered strain of yeast that lacks the biosynthesis of certain nutrients (usually amino acids or nucleic acids). When grown on media that lack these nutrients, the yeast fail to survive. This mutant yeast strain can be made to incorporate foreign DNA in the form of plasmids. In yeast two-hybrid screening, separate bait and prey plasmids are simultaneously introduced into the mutant yeast strain, or a mating strategy is used to get both plasmids into one host cell.[9]

The second high-throughput approach is the library screening approach. In this setup the bait- and prey-harboring cells are mated in a random order. After mating and selecting surviving cells on selective medium, the scientist sequences the isolated plasmids to see which prey (DNA sequence) interacts with the bait used. This approach has a lower rate of reproducibility and tends to yield more false positives than the matrix approach.[9]
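
The practical difference between the two formats is that a matrix screen identifies a prey from its plate position, whereas a library screen needs sequencing. A minimal sketch of the matrix bookkeeping follows; the function names, the well-label scheme and the prey names are hypothetical and chosen only for illustration.

```python
import string


def plate_layout(preys, columns=12):
    """Assign each prey to a well label (A1, A2, ...) in row-major order.

    Assumes at most 26 rows, since rows are labelled with single letters.
    """
    layout = {}
    for i, prey in enumerate(preys):
        row, col = divmod(i, columns)
        layout[f"{string.ascii_uppercase[row]}{col + 1}"] = prey
    return layout


def identify_hits(layout, growing_wells):
    """Look up which preys sit in the wells that grew on selective medium."""
    return sorted(layout[well] for well in growing_wells)


# Hypothetical example: 96 preys arrayed on an 8 x 12 plate, two wells grow.
layout = plate_layout([f"prey{i:02d}" for i in range(1, 97)])
print(identify_hits(layout, ["A1", "B3"]))  # ['prey01', 'prey15']
```

No sequencing step appears in the sketch because the position alone identifies the prey; in a library screen the equivalent lookup table does not exist.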

Plasmids are engineered to produce a protein product in which the DNA-binding domain (BD) fragment is fused onto a protein while another plasmid is engineered to produce a protein product in which the activation domain (AD) fragment is fused onto another protein. The protein fused to the BD may be referred to as the bait protein, and is typically a known protein the investigator is using to identify new binding partners. The protein fused to the AD may be referred to as the prey protein and can be either a single known protein or a library of known or unknown proteins. In this context, a library may consist of a collection of protein-encoding sequences that represent all the proteins expressed in a particular organism or tissue, or may be generated by synthesising random DNA sequences.[3] Regardless of the source, they are subsequently incorporated into the protein-encoding sequence of a plasmid, which is then transfected into the cells chosen for the screening method.[3] This technique, when using a library, assumes that each cell is transfected with no more than a single plasmid and that, therefore, each cell ultimately expresses no more than a single member from the protein library.

If the bait and prey proteins interact (i.e., bind), then the AD and BD of the transcription factor are indirectly connected, bringing the AD into proximity with the transcription start site, and transcription of the reporter gene(s) can occur. If the two proteins do not interact, there is no transcription of the reporter gene. In this way, a successful interaction between the fused proteins is linked to a change in the cell phenotype.[1]
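
The logic of the readout can be sketched as a small Boolean model. This is a simplification under stated assumptions: binding is treated as all-or-nothing, the proteins X, Y and Z are hypothetical, and the names `Fusion` and `reporter_on` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A hybrid protein: a transcription-factor fragment fused to a test protein."""
    domain: str   # "BD" (DNA-binding domain) or "AD" (activation domain)
    partner: str  # name of the bait or prey protein


def reporter_on(fusions, interactions):
    """Return True if some BD fusion and some AD fusion carry interacting partners.

    `interactions` is a set of frozensets naming protein pairs assumed to bind.
    """
    baits = [f.partner for f in fusions if f.domain == "BD"]
    preys = [f.partner for f in fusions if f.domain == "AD"]
    return any(
        frozenset((bait, prey)) in interactions
        for bait in baits
        for prey in preys
    )


# Hypothetical proteins: only X and Y are assumed to bind each other.
known = {frozenset(("X", "Y"))}
both = [Fusion("BD", "X"), Fusion("AD", "Y")]
no_binding = [Fusion("BD", "X"), Fusion("AD", "Z")]
bait_only = [Fusion("BD", "X")]

print(reporter_on(both, known))        # True
print(reporter_on(no_binding, known))  # False
print(reporter_on(bait_only, known))   # False
```

The model deliberately ignores autoactivating baits and other sources of false positives, which real screens must control for.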

The challenge of separating cells that express proteins that happen to interact with their counterpart fusion proteins from those that do not is addressed in the following section.

Fixed domains

In any study, some of the protein domains, those under investigation, will be varied according to the goals of the study whereas other domains, those that are not themselves being investigated, will be kept constant. For example, in a two-hybrid study to select DNA-binding domains, the DNA-binding domain, BD, will be varied while the two interacting proteins, the bait and prey, must be kept constant to maintain a strong binding between the BD and AD. There are a number of domains from which to choose the BD, bait and prey and AD, if these are to remain constant. In protein–protein interaction investigations, the BD may be chosen from any of many strong DNA-binding domains such as Zif268.[2] A frequent choice of bait and prey domains is residues 263–352 of yeast Gal11P with an N342V mutation[2] and residues 58–97 of yeast Gal4,[2] respectively. These domains can be used in both yeast- and bacterial-based selection techniques and are known to bind together strongly.[1][2]

The AD chosen must be able to activate transcription of the reporter gene, using the cell's own transcription machinery. Thus, the variety of ADs available for use in yeast-based techniques may not be suited to use in their bacterial-based analogues. The herpes simplex virus-derived AD, VP16 and yeast Gal4 AD have been used with success in yeast[1] whilst a portion of the α-subunit of E. coli RNA polymerase has been utilised in E. coli-based methods.[2][3]

Whilst powerfully activating domains may allow greater sensitivity towards weaker interactions, conversely, a weaker AD may provide greater stringency.

Construction of expression plasmids

A number of engineered genetic sequences must be incorporated into the host cell to perform two-hybrid analysis or one of its derivative techniques. The considerations and methods used in the construction and delivery of these sequences differ according to the needs of the assay and the organism chosen as the experimental background.

There are two broad categories of hybrid library: random libraries and cDNA-based libraries. A cDNA library consists of the cDNA produced through reverse transcription of mRNA collected from specific cells or types of cell. This library can be ligated into a construct so that it is attached to the BD or AD being used in the assay.[1] A random library uses lengths of DNA of random sequence in place of these cDNA sections. A number of methods exist for the production of these random sequences, including cassette mutagenesis.[2] Regardless of the source of the DNA library, it is ligated into the appropriate place in the relevant plasmid/phagemid using the appropriate restriction endonucleases.[2]

E. coli-specific considerations

By placing the hybrid proteins under the control of IPTG-inducible lac promoters, they are expressed only on media supplemented with IPTG. Further, by including different antibiotic resistance genes in each genetic construct, the growth of non-transformed cells is easily prevented through culture on media containing the corresponding antibiotics. This is particularly important for counter selection methods in which a lack of interaction is needed for cell survival.[2]

The reporter gene may be inserted into the E. coli genome by first inserting it into an episome, a type of plasmid with the ability to incorporate itself into the bacterial cell genome[2] with a copy number of approximately one per cell.[10]

The hybrid expression phagemids can be electroporated into E. coli XL-1 Blue cells, which, after amplification and infection with VCS-M13 helper phage, will yield a stock of library phage. Each of these phage will contain one single-stranded member of the phagemid library.[2]

Recovery of protein information

Once the selection has been performed, the primary structure of the proteins which display the appropriate characteristics must be determined. This is achieved by retrieval of the protein-encoding sequences (as originally inserted) from the cells showing the appropriate phenotype.

E. coli

The phagemid used to transform E. coli cells may be "rescued" from the selected cells by infecting them with VCS-M13 helper phage. The resulting phage particles that are produced contain the single-stranded phagemids and are used to infect XL-1 Blue cells.[2] The double-stranded phagemids are subsequently collected from these XL-1 Blue cells, essentially reversing the process used to produce the original library phage. Finally, the DNA sequences are determined through dideoxy sequencing.[2]

Controlling sensitivity

The Escherichia coli-derived Tet-R repressor can be used in line with a conventional reporter gene and can be controlled by tetracycline or doxycycline (Tet-R inhibitors). Thus the expression of Tet-R is controlled by the standard two-hybrid system but the Tet-R in turn controls (represses) the expression of a previously mentioned reporter such as HIS3, through its Tet-R promoter. Tetracycline or its derivatives can then be used to regulate the sensitivity of a system utilising Tet-R.[1]

Sensitivity may also be controlled by varying the dependency of the cells on their reporter genes. For example, this may be achieved by altering the concentration of histidine in the growth medium for his3-dependent cells and altering the concentration of streptomycin for aadA-dependent cells.[2][3] Selection-gene dependency may also be controlled by applying an inhibitor of the selection gene product at a suitable concentration. 3-Amino-1,2,4-triazole (3-AT), for example, is a competitive inhibitor of the HIS3 gene product and may be used to titrate the minimum level of HIS3 expression required for growth on histidine-deficient media.[2]
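
The effect of a competitive inhibitor such as 3-AT can be illustrated with the textbook rate law for competitive inhibition. This is an idealised sketch that assumes the HIS3 gene product follows simple Michaelis-Menten kinetics. The symbols (enzyme amount E, turnover number k_cat, substrate concentration [S], inhibitor concentration [I], constants K_m and K_i, and a minimum flux v_min needed for growth) are introduced here for illustration and do not come from the cited sources.

```latex
v = \frac{k_{cat}\,E\,[S]}{K_m\left(1 + \frac{[I]}{K_i}\right) + [S]}
\qquad\Longrightarrow\qquad
E_{min} = \frac{v_{min}\left(K_m\left(1 + \frac{[I]}{K_i}\right) + [S]\right)}{k_{cat}\,[S]}
```

Under these assumptions the enzyme level required for growth, E_min, rises linearly with [I], which is one way to see why adding more 3-AT demands stronger reporter activation and so raises stringency.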

Sensitivity may also be modulated by varying the number of operator sequences in the reporter DNA.

Non-fusion proteins

A third, non-fusion protein may be co-expressed with two fusion proteins. Depending on the investigation, the third protein may modify one of the fusion proteins or mediate or interfere with their interaction.[1]

Co-expression of the third protein may be necessary for modification or activation of one or both of the fusion proteins. For example, S. cerevisiae possesses no endogenous tyrosine kinase. If an investigation involves a protein that requires tyrosine phosphorylation, the kinase must be supplied in the form of a tyrosine kinase gene.[1]

The non-fusion protein may mediate the interaction by binding both fusion proteins simultaneously, as in the case of ligand-dependent receptor dimerization.[1]

For a protein with an interacting partner, its functional homology to other proteins may be assessed by supplying the third protein in non-fusion form, which then may or may not compete with the fusion-protein for its binding partner. Binding between the third protein and the other fusion protein will interrupt the formation of the reporter expression activation complex and thus reduce reporter expression, leading to the distinguishing change in phenotype.[1]

Split-ubiquitin yeast two-hybrid

One limitation of classic yeast two-hybrid screens is that they are limited to soluble proteins. It is therefore impossible to use them to study the protein–protein interactions between insoluble integral membrane proteins. The split-ubiquitin system provides a method for overcoming this limitation.[11] In the split-ubiquitin system, two integral membrane proteins to be studied are fused to two different ubiquitin moieties: a C-terminal ubiquitin moiety ("Cub", residues 35–76) and an N-terminal ubiquitin moiety ("Nub", residues 1–34). These fused proteins are called the bait and prey, respectively. In addition to being fused to an integral membrane protein, the Cub moiety is also fused to a transcription factor (TF) that can be cleaved off by ubiquitin specific proteases. Upon bait–prey interaction, Nub and Cub-moieties assemble, reconstituting the split-ubiquitin. The reconstituted split-ubiquitin molecule is recognized by ubiquitin specific proteases, which cleave off the transcription factor, allowing it to induce the transcription of reporter genes.[12]
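
The residue ranges quoted above are 1-based and inclusive, which is easy to get wrong when slicing a sequence in code. A minimal sketch, using residue numbers as stand-ins for a real 76-residue ubiquitin sequence (the function name `split_ubiquitin` is hypothetical):

```python
def split_ubiquitin(residues):
    """Split a 76-residue ubiquitin into Nub (residues 1-34) and Cub (35-76)."""
    if len(residues) != 76:
        raise ValueError("expected 76 residues")
    nub = residues[:34]  # 1-based residues 1..34
    cub = residues[34:]  # 1-based residues 35..76
    return nub, cub


# Residue numbers stand in for amino acids so the boundaries are easy to check.
numbering = list(range(1, 77))
nub, cub = split_ubiquitin(numbering)
print(nub[0], nub[-1], cub[0], cub[-1])  # 1 34 35 76
print(len(nub), len(cub))                # 34 42
```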

Fluorescent two-hybrid assay

Zolghadr and co-workers presented a fluorescent two-hybrid system that uses two hybrid proteins that are fused to different fluorescent proteins as well as LacI, the lac repressor. The structure of the fusion proteins looks like this: FP2-LacI-bait and FP1-prey where the bait and prey proteins interact and bring the fluorescent proteins (FP1 = GFP, FP2=mCherry) in close proximity at the binding site of the LacI protein in the host cell genome.[13] The system can also be used to screen for inhibitors of protein–protein interactions.[14]

Enzymatic two-hybrid systems: KISS

While the original Y2H system used a reconstituted transcription factor, other systems create enzymatic activities to detect PPIs. For instance, the KInase Substrate Sensor ("KISS") is a mammalian two-hybrid approach designed to map intracellular PPIs. Here, a bait protein is fused to a kinase-containing portion of TYK2 and a prey is coupled to a gp130 cytokine receptor fragment. When bait and prey interact, TYK2 phosphorylates STAT3 docking sites on the prey chimera, which ultimately leads to activation of a reporter gene.[15]

Two-hybrid screening by sequencing

A number of strategies have been developed in which the two-hybrid positives are identified by DNA sequencing, usually after selection using a reporter gene. A variant of this approach was recently described as Liquid Y2H-Seq.[16]

One-, three- and one-two-hybrid variants

One-hybrid

The one-hybrid variation of this technique is designed to investigate protein–DNA interactions and uses a single fusion protein in which the AD is linked directly to the binding domain. The binding domain may be constituted by a library and thus can be selected for proteins binding a desired target sequence (which is inserted in the promoter region of a reporter gene). In a positive-selection system, a binding domain that successfully binds the UAS and allows transcription is thus selected.[1]

Note that selection of DNA-binding domains is not necessarily performed using a one-hybrid system, but may also be performed using a two-hybrid system in which the binding domain is varied and the bait and prey proteins are kept constant.[2][3]

One-hybrid screens can be used to investigate the transcriptional activation properties of proteins. This was done in a large screen of yeast proteins that were fused to a DNA-binding domain and the fused protein simply activated transcription of a reporter gene. This revealed 451 transcriptional activators in yeast, including 132 strong activators.[17]

Three-hybrid

Overview of three-hybrid assay.

RNA-protein interactions have been investigated through a three-hybrid variation of the two-hybrid technique. In this case, a hybrid RNA molecule serves to bring together the two protein fusion domains, which are not intended to interact with each other but rather with the intermediary RNA molecule (through their RNA-binding domains).[1] Techniques involving non-fusion proteins that perform a similar function, as described in the 'non-fusion proteins' section above, may also be referred to as three-hybrid methods.

One-two-hybrid

Simultaneous use of the one- and two-hybrid methods (that is, simultaneous protein–protein and protein–DNA interaction) is known as a one-two-hybrid approach and is expected to increase the stringency of the screen.[1]

Host organism

Although theoretically any living cell might be used as the background to a two-hybrid analysis, there are practical considerations that dictate which is chosen. The chosen cell line should be relatively cheap and easy to culture and sufficiently robust to withstand application of the investigative methods and reagents.[1] The latter is especially important for high-throughput studies. Therefore the yeast S. cerevisiae has been the main host organism for two-hybrid studies. However, it is not always the ideal system to study interacting proteins from other organisms.[18] Yeast cells often do not have the same post-translational modifications, have a different codon usage, or lack certain proteins that are important for the correct expression of the proteins. To cope with these problems, several novel two-hybrid systems have been developed. Depending on the system used, agar plates or a specific growth medium is used to grow the cells and allow selection for interaction. The most commonly used method is agar plating, in which cells are plated on selective medium to see if an interaction takes place. Cells that have no interacting proteins should not survive on this selective medium.[7][19]

S. cerevisiae (yeast)

The yeast S. cerevisiae was the model organism used during the two-hybrid technique's inception. It is commonly known as the Y2H system. It has several characteristics that make it a robust organism to host the interaction, including the ability to form tertiary protein structures, neutral internal pH, enhanced ability to form disulfide bonds and reduced-state glutathione among other cytosolic buffer factors, to maintain a hospitable internal environment.[1] The yeast model can be manipulated through non-molecular techniques and its complete genome sequence is known.[1] Yeast systems are tolerant of diverse culture conditions and harsh chemicals that could not be applied to mammalian tissue cultures.[1]

A number of yeast strains have been created specifically for Y2H screens, e.g. Y187[20] and AH109,[21] both produced by Clontech. Yeast strains R2HMet and BK100 have also been used.[22]

Candida albicans

C. albicans is a yeast with a particular feature: it translates the CUG codon into serine rather than leucine. Due to this different codon usage it is difficult to use the model system S. cerevisiae as a Y2H to check for protein-protein interactions using C. albicans genes. To provide a more native environment a C. albicans two-hybrid (C2H) system was developed. With this system protein-protein interactions can be studied in C. albicans itself.[23][24] A recent addition was the creation of a high-throughput system.[25][26][27]
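
The consequence of the CUG reassignment can be shown with a toy translation. The sketch below builds the standard codon table and a variant in which CUG encodes serine; the mRNA sequence is hypothetical, and the names `STANDARD_CODE`, `C_ALBICANS_CODE` and `translate` are chosen for this illustration.

```python
from itertools import product

BASES = "UCAG"
# Amino acids of the standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Same table, except CUG is read as serine (S) instead of leucine (L).
C_ALBICANS_CODE = dict(STANDARD_CODE, CUG="S")


def translate(mrna, code):
    """Translate an mRNA from its first codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = "AUGCUGGAAUAA"  # hypothetical open reading frame: AUG CUG GAA UAA
print(translate(mrna, STANDARD_CODE))    # MLE
print(translate(mrna, C_ALBICANS_CODE))  # MSE
```

A C. albicans gene expressed unmodified in S. cerevisiae would therefore carry leucine wherever the native protein has a CUG-encoded serine, which may alter folding or binding.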

E. coli

Bacterial two-hybrid methods (B2H or BTH) are usually carried out in E. coli and have some advantages over yeast-based systems. For instance, the higher transformation efficiency and faster rate of growth lend E. coli to the use of larger libraries (in excess of 10^8).[2] The absence of requirements for a nuclear localisation signal to be included in the protein sequence and the ability to study proteins that would be toxic to yeast may also be major factors to consider when choosing an experimental background organism.[2]

The methylation activity of certain E. coli DNA methyltransferase proteins may interfere with some DNA-binding protein selections. If this is anticipated, the use of an E. coli strain that is defective for a particular methyltransferase may be an obvious solution.[2] The B2H may not be ideal when studying eukaryotic protein-protein interactions (e.g. human proteins) as proteins may not fold as in eukaryotic cells or may lack other processing.

Mammalian cells

In recent years a mammalian two-hybrid (M2H) system has been designed to study mammalian protein-protein interactions in a cellular environment that closely mimics the native protein environment.[28] Transiently transfected mammalian cells are used in this system to find protein-protein interactions.[29][30] Using a mammalian cell line to study mammalian protein-protein interactions gives the advantage of working in a more native context.[5] Post-translational modifications such as phosphorylation, acylation and glycosylation are similar. The intracellular localization of the proteins is also closer to the native situation than in a yeast two-hybrid system.[31][32] It is also possible with the mammalian two-hybrid system to study signal inputs.[33] Another big advantage is that results can be obtained within 48 hours after transfection.[5]

Arabidopsis thaliana

In 2005 a two-hybrid system in plants was developed. Using protoplasts of A. thaliana, protein-protein interactions can be studied in plants, and thus in their native context. In this system the GAL4 AD and BD are under the control of the strong 35S promoter. Interaction is measured using a GUS reporter. To enable high-throughput screening, the vectors were made Gateway compatible. The system is known as the protoplast two-hybrid (P2H) system.[34]

Aplysia californica

The sea hare A. californica is a model organism in neurobiology, used to study, among other things, the molecular mechanisms of long-term memory. To study interactions important in neurology in a more native environment, a two-hybrid system has been developed in A. californica neurons. A GAL4 AD and BD are used in this system.[35][36]

Bombyx mori

An insect two-hybrid (I2H) system was developed in a silkworm cell line from the larva or caterpillar of the domesticated silk moth, Bombyx mori (BmN4 cells). This system uses the GAL4 BD and the activation domain of mouse NF-κB P65. Both are under the control of the OpIE2 promoter.[37]

Applications

Determination of sequences crucial for interaction

By changing specific amino acids by mutating the corresponding DNA base-pairs in the plasmids used, the importance of those amino acid residues in maintaining the interaction can be determined.[1]

After using a bacterial cell-based method to select DNA-binding proteins, it is necessary to check the specificity of these domains, as there is a limit to the extent to which the bacterial cell genome can act as a sink for domains with an affinity for other sequences (or indeed, a general affinity for DNA).[2]

Drug and poison discovery

Protein–protein signalling interactions are suitable therapeutic targets due to their specificity and pervasiveness. The random drug discovery approach uses compound banks that comprise random chemical structures, and requires a high-throughput method to test these structures against their intended target.[1][19]

The cell chosen for the investigation can be specifically engineered to mirror the molecular aspect that the investigator intends to study and then used to identify new human or animal therapeutics or anti-pest agents.[1][19]

Determination of protein function

By determination of the interaction partners of unknown proteins, the possible functions of these new proteins may be inferred.[1] This can be done using a single known protein against a library of unknown proteins or conversely, by selecting from a library of known proteins using a single protein of unknown function.[1]

Zinc finger protein selection

To select zinc finger proteins (ZFPs) for protein engineering, methods adapted from the two-hybrid screening technique have been used with success.[2][3] A ZFP is itself a DNA-binding protein used in the construction of custom DNA-binding domains that bind to a desired DNA sequence.[38]

By using a selection gene with the desired target sequence included in the UAS, and randomising the relevant amino acid sequences to produce a ZFP library, cells that host a DNA-ZFP interaction with the required characteristics can be selected. Each ZFP typically recognises only 3–4 base pairs, so to prevent recognition of sites outside the UAS, the randomised ZFP is engineered into a 'scaffold' consisting of another two ZFPs of constant sequence. The UAS is thus designed to include the target sequence of the constant scaffold in addition to the sequence for which a ZFP is selected.[2][3]
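
The need for the scaffold can be motivated with a back-of-the-envelope estimate of how often a short site occurs by chance. The sketch below assumes a random genome with equal base frequencies, counts one strand only, takes 3 bp per finger, and uses an approximate E. coli genome size of 4.6 million bp; all of these are simplifying assumptions made here, not figures from the cited sources.

```python
def expected_sites(genome_length, site_length):
    """Expected occurrences of one specific site on one strand of a random sequence."""
    return genome_length / 4 ** site_length


GENOME_BP = 4_600_000  # approximate E. coli genome size (assumption for illustration)

for fingers in (1, 3):
    site_bp = 3 * fingers  # assuming 3 bp recognised per finger
    count = expected_sites(GENOME_BP, site_bp)
    print(f"{fingers} finger(s), {site_bp} bp site: about {round(count)} chance matches")
```

Under these assumptions a single finger's 3 bp site is expected tens of thousands of times in the genome, while a three-finger 9 bp site is expected fewer than twenty times, so the constant scaffold greatly reduces binding outside the intended site.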

A number of other DNA-binding domains may also be investigated using this system.[2]

Strengths

  • Two-hybrid screens are low-tech; they can be carried out in any lab without sophisticated equipment.
  • Two-hybrid screens can provide an important first hint for the identification of interaction partners.
  • The assay is scalable, which makes it possible to screen for interactions among many proteins. Furthermore, it can be automated, and by using robots many proteins can be screened against thousands of potentially interacting proteins in a relatively short time. Two types of large screens are used: the library approach and the matrix approach.
  • Yeast two-hybrid data can be of similar quality to data generated by the alternative approach of coaffinity purification followed by mass spectrometry (AP/MS).[39][9]

Weaknesses

  • The main criticism applied to the yeast two-hybrid screen of protein–protein interactions is the possibility of a high number of false positive (and false negative) identifications. The exact rate of false positive results is not known, but earlier estimates were as high as 70%. This also partly explains the often very small overlap in results between (high-throughput) two-hybrid screens, especially when different experimental systems are used.[9][30]

The reason for this high error rate lies in the characteristics of the screen:

  • Certain assay variants overexpress the fusion proteins which may cause unnatural protein concentrations that lead to unspecific (false) positives.
  • The hybrid proteins are fusion proteins; that is, the fused parts may inhibit certain interactions, especially if an interaction takes place at the N-terminus of a test protein (where the DNA-binding or activation domain is typically attached).
  • An interaction may not happen in yeast, the typical host organism for Y2H. For instance, if a bacterial protein is tested in yeast, it may lack a chaperone for proper folding that is only present in its bacterial host. Moreover, a mammalian protein is sometimes not correctly modified in yeast (e.g., missing phosphorylation), which can also lead to false results.
  • The Y2H takes place in the nucleus. If test proteins are not localized to the nucleus (because they have other localization signals) two interacting proteins may be found to be non-interacting.
  • Some proteins might specifically interact when they are co-expressed in the yeast, although in reality they are never present in the same cell at the same time. However, in most cases it cannot be ruled out that such proteins are indeed expressed in certain cells or under certain circumstances.

Each of these points alone can give rise to false results. Due to the combined effects of all error sources, yeast two-hybrid results have to be interpreted with caution. The probability of generating false positives means that all interactions should be confirmed by a high-confidence assay, for example co-immunoprecipitation of the endogenous proteins, which is difficult for large-scale protein–protein interaction data. Alternatively, Y2H data can be verified using multiple Y2H variants[40] or bioinformatics techniques. The latter test whether interacting proteins are expressed at the same time, share some common features (such as gene ontology annotations or certain network topologies), or have homologous interactions in other species.[41]
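
Comparing results across screens can be sketched as simple set arithmetic on unordered protein pairs. The example below is illustrative only: the protein names and screen contents are made up, and `support_counts` and `jaccard` are hypothetical helper names, not functions from any Y2H analysis package.

```python
from collections import Counter


def normalize(pair):
    """Treat (A, B) and (B, A) as the same interaction."""
    return tuple(sorted(pair))


def support_counts(screens):
    """Count in how many screens each interaction was reported."""
    counts = Counter()
    for screen in screens:
        counts.update({normalize(pair) for pair in screen})
    return counts


def jaccard(screen_a, screen_b):
    """Overlap between two screens: shared pairs divided by all pairs seen."""
    a = {normalize(pair) for pair in screen_a}
    b = {normalize(pair) for pair in screen_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up results from two hypothetical screens.
screen1 = [("A", "B"), ("C", "D"), ("E", "F")]
screen2 = [("B", "A"), ("C", "G")]

print(jaccard(screen1, screen2))                        # 0.25
print(support_counts([screen1, screen2])[("A", "B")])   # 2
```

Pairs reported by more than one independent screen or assay variant are the natural candidates to prioritise for follow-up by a higher-confidence method.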

See also

  • Phage display, an alternative method for detecting protein–protein and protein–DNA interactions
  • Protein array, a chip-based method for detecting protein–protein interactions
  • Synthetic genetic array analysis, a yeast-based method for studying gene interactions

from Grokipedia
Two-hybrid screening, also known as the yeast two-hybrid (Y2H) system, is a molecular biology technique used to detect and characterize protein-protein interactions (PPIs) in vivo within the nucleus of Saccharomyces cerevisiae yeast cells. Developed by Stanley Fields and Ok-Kyu Song in 1989, the method exploits the modular architecture of the yeast transcription factor GAL4, which consists of a DNA-binding domain (DBD) and a transcription activation domain (AD). In the assay, a protein of interest called the "bait" is genetically fused to the GAL4 DBD, while a potential binding partner termed the "prey" is fused to the GAL4 AD; these hybrid constructs are introduced into yeast cells containing reporter genes under the control of GAL4-responsive upstream activation sequences (UAS). Interaction between the bait and prey proteins reconstitutes a functional GAL4 transcription factor, driving expression of reporter genes such as HIS3 (enabling growth on histidine-deficient media) or lacZ (producing β-galactosidase for colorimetric detection). This genetic readout provides a sensitive, selectable indicator of direct binary PPIs, distinguishing the technique from biochemical methods like co-immunoprecipitation.

Since its original description, the Y2H system has evolved through numerous optimizations to enhance specificity and scope, including the use of multiple reporter genes to minimize false positives and the development of library-based screening formats for high-throughput discovery of novel interactors. Key advancements include reverse two-hybrid approaches to detect disrupting mutations or protein dissociation, and variants such as split-ubiquitin or bacterial (B2H) systems to study non-nuclear or prokaryotic interactions that the classic Y2H cannot capture. The technique's impact is evident in genome-wide interactome projects, which have mapped thousands of PPIs in other organisms and over 52,000 in humans (as of 2023), illuminating complex protein networks.
Applications of two-hybrid screening span basic and applied research, including elucidation of signaling pathways, identification of viral-host protein interactions, and efforts to validate therapeutic targets or screen for small-molecule modulators of PPIs. For instance, Y2H has been employed to discover inhibitors of protein interactions relevant to diseases like cancer and to map interactomes in model organisms. Despite these strengths, limitations persist, such as the propensity for false positives from nonspecific activation, false negatives from steric hindrance or lack of post-translational modifications, and bias toward soluble, nuclear-compatible proteins, often necessitating orthogonal validation with techniques like co-affinity purification. Ongoing innovations, including array-based variants, CRISPR-integrated systems, and massively parallel sequencing-based approaches (e.g., MP3-seq as of 2024), continue to address these challenges and extend the method's utility.

Overview

Basic premise

The two-hybrid screening technique, also known as the yeast two-hybrid system, is a molecular biology method designed to detect protein-protein interactions (PPIs) in vivo by leveraging the modular structure of eukaryotic transcription factors. At its core, transcription factors consist of a DNA-binding domain (DBD) that recognizes specific promoter sequences and an activation domain (AD) that recruits transcriptional machinery to initiate gene expression; the two-hybrid system exploits this modularity to report PPIs through reporter gene activation, providing a selectable readout for interactions that would otherwise be difficult to observe directly.

In the standard setup, a "bait" protein of interest is genetically fused to a DBD, such as the N-terminal DNA-binding domain of the Saccharomyces cerevisiae GAL4 transcription factor, which targets upstream activating sequences (UAS) in the yeast genome. Separately, a "prey" protein, often from a cDNA library, is fused to an AD, such as the C-terminal acidic activation domain of GAL4. These hybrid constructs are expressed in yeast cells harboring reporter genes under UAS control; if the bait and prey proteins physically interact, the DBD and AD are brought into close proximity, reconstituting a functional transcription factor that drives expression of the reporter gene. The GAL4 DBD also contains a nuclear localization signal (NLS) that directs the bait hybrid to the nucleus, and AD fusions typically carry an added NLS, so both hybrids reach the compartment where transcription occurs; detection is thus confined to interactions that can take place in the nucleus.

Reporter gene activation provides a quantifiable readout of the interaction. Common reporters include the HIS3 gene, which complements a histidine auxotrophy in the host strain, allowing growth on histidine-deficient media as a selection for positive interactions; another is the lacZ gene encoding β-galactosidase, which produces a colorimetric signal upon substrate cleavage for easier screening of large libraries.
This mechanism enables high-throughput identification of PPIs, as only cells with interacting bait-prey pairs survive selection or exhibit the reporter phenotype, establishing a direct link between molecular association and observable cellular response.
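The reporter logic described above can be written as a small truth table. The following is a minimal sketch only: the `Cell` class and `reporter_on` function are hypothetical names introduced for illustration, and the model deliberately ignores auto-activation and other artifacts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Idealized yeast cell in a two-hybrid assay (illustrative model only)."""
    has_bait_dbd: bool     # DBD-bait hybrid is expressed
    has_prey_ad: bool      # AD-prey hybrid is expressed
    bait_binds_prey: bool  # the two test proteins physically interact


def reporter_on(cell: Cell) -> bool:
    """Reporter is transcribed only if both hybrids are present and interact."""
    return cell.has_bait_dbd and cell.has_prey_ad and cell.bait_binds_prey


if __name__ == "__main__":
    cases = {
        "bait hybrid only": Cell(True, False, True),
        "prey hybrid only": Cell(False, True, True),
        "both, no interaction": Cell(True, True, False),
        "both, interacting": Cell(True, True, True),
    }
    for label, cell in cases.items():
        print(f"{label}: reporter {'ON' if reporter_on(cell) else 'off'}")
```

In this idealized picture only the last case activates the reporter; real screens add controls because a bait fusion can sometimes activate transcription on its own.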

Fixed domains

In two-hybrid screening, the fixed domains serve as modular scaffolds: the DNA-binding domain (DBD) targets specific promoter sequences upstream of reporter genes, while the activation domain (AD) recruits transcriptional machinery to activate reporter expression upon protein-protein interaction. The most commonly used DBD is derived from the yeast transcriptional activator GAL4, consisting of amino acids 1-147, which binds to upstream activation sequences (UAS) without inherent transcriptional activity. Another prevalent DBD is from the bacterial LexA repressor, spanning amino acids 1-202, which recognizes LexA operators and was adapted for eukaryotic systems to provide an orthogonal binding specificity. The primary ADs include the C-terminal region of GAL4 (amino acids 768-881), but for enhanced sensitivity, the strong AD from the herpes simplex virus type 1 VP16 protein (amino acids 413-490) is frequently employed, as it potently recruits coactivators such as TFIID.

Engineering of these fixed domains emphasizes modularity and functionality. Domains are selected or modified to prevent auto-activation, where a DBD or AD fusion alone might spuriously drive reporter expression; for instance, GAL4 DBD variants with deletions in potential cryptic activation motifs are used, and baits are screened to exclude those causing self-activation. Flexible linkers, typically glycine-serine repeats (e.g., (Gly4Ser)3), are inserted between the fixed domain and the protein of interest to minimize steric hindrance and allow independent folding, reducing the risk of the fusion partner masking the DBD's DNA-binding site or the AD's recruitment capability. Nuclear localization signals (NLS), such as that of the SV40 large T antigen (PKKKRKV) or the bipartite NLS from nucleoplasmin, are often appended to both bait (DBD-fusion) and prey (AD-fusion) constructs to ensure nuclear import in yeast, particularly for non-nuclear proteins, thereby facilitating colocalization at reporter promoters.
Potential issues like domain interference arise when the fused protein alters the fixed domain's conformation or affinity, such as a bait sterically blocking LexA operator binding, which can be mitigated by testing alternative fusion orientations (N- or C-terminal) or using minimal domain fragments. Fixed scaffolds promote reproducibility by standardizing these components across experiments, enabling interchangeable bait and prey libraries while maintaining consistent reporter activation thresholds and minimizing system-specific artifacts.

History

Initial development

The two-hybrid screening technique was invented in 1989 by Stanley Fields and Ok-Kyu Song at the State University of New York at Stony Brook. This method emerged as a genetic approach to detect protein-protein interactions in vivo, addressing the limitations of traditional biochemical techniques such as crosslinking and co-immunoprecipitation, which often required purified proteins and struggled with transient or weak interactions. The system was developed using the GAL4 transcriptional activator from the yeast Saccharomyces cerevisiae. GAL4 consists of a DNA-binding domain and an activation domain; Fields and Song created hybrid proteins by fusing the DNA-binding domain to one protein of interest (bait) and the activation domain to another (prey). When the bait and prey interact, the domains are brought into proximity, reconstituting GAL4 activity and driving transcription of a reporter gene under control of GAL4 upstream activation sequences (UAS). Initial proof-of-concept experiments demonstrated the system's efficacy by testing interactions between the proteins SNF1 and SNF4, known to associate in the regulation of metabolism; transcriptional activation occurred only when both hybrid constructs were co-expressed. The technique also detected homodimerization of the GAL4 activation domain with itself, confirming its ability to identify self-interactions. These findings, published in Nature, established the foundational framework for subsequent expansions in protein interaction studies.

Key advancements

One significant advancement in the early 1990s was the introduction of library screening to identify unknown protein interactors, enabling the discovery of novel binding partners beyond predefined candidates. In 1992, Chevray and Nathans demonstrated this approach by constructing a mammalian cDNA library fused to the activation domain and screening for interactors with the leucine zipper domain of Jun, successfully identifying proteins such as alpha- and beta-tropomyosin and an ATF/CREB family member that interact with the Jun leucine zipper. This method expanded the technique's utility for genome-wide interaction mapping, building on the foundational 1989 two-hybrid system by Fields and Song.

In the mid-1990s, the reverse two-hybrid system was developed to detect disruptions in protein interactions, such as those caused by mutations or small molecules that lead to loss of interaction. Vidal et al. (1996) introduced this variant using a counterselectable reporter such as URA3, where interaction activates expression of a gene that becomes toxic under counterselection, allowing positive selection for non-interacting variants; for example, they identified mutations in p53 and MDM2 that abolish binding. This complemented forward screening by facilitating functional studies of interaction interfaces and inhibitor discovery.

By the early 2000s, integration of two-hybrid screening with comprehensive cDNA libraries further scaled interaction detection, while strategies to reduce false positives improved reliability. Key milestones included the first genome-wide two-hybrid interactome maps in yeast, such as Uetz et al. (2000), which identified 692 protein-protein interactions, and Ito et al. (2001), which reported 4,549 interactions, enabling systematic analysis of the yeast interactome. Tong et al. (2002) combined two-hybrid data with phage display results to validate interactions in peptide recognition modules, intersecting datasets to filter artifacts and map high-confidence networks. A further milestone was mating-based library screening in yeast, as introduced by Fromont-Racine et al. (1997) and further optimized in studies such as Soellick and Uhrig (2001), which allowed efficient high-throughput pairing of bait and prey libraries through mating, enabling screens of millions of combinations with reduced manual effort; the approach was applied to projects like the Plasmodium falciparum interactome, yielding thousands of validated pairs.

Methodology

Construction of expression plasmids

In two-hybrid screening, the construction of expression plasmids begins with the design of bait and prey vectors that fuse the protein of interest to fixed domains of a transcription factor, typically the DNA-binding domain (DB) for the bait and the activation domain (AD) for the prey. Commonly used bait vectors, such as pGBKT7, incorporate the GAL4 DB fused to the N-terminus of the bait protein, while prey vectors like pGADT7 use the GAL4 AD for N-terminal fusions to prey proteins; these are high-copy plasmids carrying TRP1 and LEU2 selectable markers, respectively, enabling co-expression in yeast hosts. Alternative systems employ LexA-based vectors, such as pLexA for bait and pB42AD for prey, offering flexibility in promoter strength and reporter compatibility.

Cloning strategies for inserting target genes into these vectors often involve PCR amplification of open reading frames (ORFs) from cDNA or genomic DNA, followed by ligation into multiple cloning sites (MCS) using restriction enzymes to preserve the reading frame and avoid stop codons that could disrupt fusions. More efficient recombinational approaches, such as Gateway cloning, utilize site-specific recombination via lambda phage att sites to transfer PCR-amplified inserts (flanked by attB sequences) into destination vectors like Gateway-adapted pGBKT7 and pGADT7, bypassing restriction enzyme limitations and enabling high-throughput construction. For prey libraries aimed at identifying interaction domains, random fragmentation of cDNA (via partial DNase I digestion or mechanical shearing) generates diverse fragments that are directionally cloned into the prey vector, typically yielding libraries with 5–10 × 10^6 independent clones to ensure comprehensive coverage.

Quality control measures are essential to validate plasmid integrity and functionality. Constructs are verified by Sanger sequencing to confirm insert orientation, reading frame, and absence of mutations in the fusion junctions.
Transformation efficiency into yeast is assessed using lithium acetate methods, targeting at least 10^5–10^6 transformants per microgram of DNA to support library-scale screens, with serial dilutions on selective media quantifying viable colonies. These steps ensure reliable expression of fusion proteins without auto-activation or toxicity issues prior to screening.
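The efficiency figure quoted above is a simple back-calculation from colony counts on a dilution plate. The sketch below shows the arithmetic under stated assumptions: the function names, parameter names, and example numbers are illustrative and do not come from any specific protocol.

```python
def transformants_per_ug(colonies: int, plated_volume_ul: float,
                         total_volume_ul: float, dilution_factor: float,
                         dna_ug: float) -> float:
    """Estimate transformants per microgram of DNA from one counted plate.

    colonies         -- colonies counted on the selective plate
    plated_volume_ul -- volume of the diluted suspension that was plated
    total_volume_ul  -- total volume of the transformation suspension
    dilution_factor  -- e.g. 100 for a 1:100 dilution before plating
    dna_ug           -- micrograms of plasmid DNA used in the transformation
    """
    if min(plated_volume_ul, total_volume_ul, dilution_factor, dna_ug) <= 0:
        raise ValueError("volumes, dilution factor and DNA mass must be positive")
    total_transformants = colonies * dilution_factor * (total_volume_ul / plated_volume_ul)
    return total_transformants / dna_ug


def fold_coverage(total_transformants: float, independent_clones: float) -> float:
    """How many times the library complexity was sampled, on average."""
    return total_transformants / independent_clones


if __name__ == "__main__":
    # Hypothetical plate: 150 colonies from 100 uL of a 1:100 dilution of a 1 mL suspension.
    eff = transformants_per_ug(150, 100, 1000, 100, dna_ug=1.0)
    print(f"{eff:.2e} transformants per ug")
    print(f"{fold_coverage(3e6, 1e6):.1f}x library coverage")
```

Comparing the result against the 10^5 to 10^6 per microgram target indicates whether a transformation is likely to support a library-scale screen.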

Recovery of protein information

Following identification of positive yeast colonies in a two-hybrid screen, the recovery of genetic information encoding the interacting proteins is essential to characterize the prey proteins from the cDNA library. This process begins with plasmid rescue, where total DNA is extracted from the yeast cells using enzymatic lysis, such as with lyticase to disrupt the cell wall, followed by phenol-chloroform extraction and ethanol precipitation to yield plasmid DNA suitable for downstream applications. The extracted DNA, containing both bait and prey plasmids, is then transformed into Escherichia coli for selective amplification; for instance, E. coli strains are plated on media with appropriate antibiotics (e.g., ampicillin for prey plasmids with AMP^R markers) to isolate individual plasmids, enabling high-yield propagation and purification. This step leverages the auxotrophic markers on the plasmids, such as LEU2 for the prey, to ensure retention during yeast growth while facilitating bacterial propagation. Once purified, the prey plasmid inserts are sequenced, traditionally using Sanger sequencing with primers specific to the activation domain vector, such as the T7 promoter primer or a 3' activation domain sequencing primer, to read the cDNA insert fused to the GAL4 activation domain. The resulting sequences are analyzed for open reading frames (ORFs) in the correct frame, excluding out-of-frame or non-coding inserts that may represent false positives. To identify known proteins, the sequences are aligned against nucleotide or protein databases using tools like BLAST (Basic Local Alignment Search Tool), comparing to repositories such as GenBank or EMBL to find homologous matches. For example, high-scoring alignments indicate the prey protein's identity, allowing preliminary validation of the interaction specificity through retransformation into yeast with the original bait. 
Functional annotation of identified interactors involves mapping sequences to comprehensive protein databases like UniProt, which provides curated information on protein function, domains, and subcellular localization to contextualize the interaction. This step aids in validating biological relevance, such as confirming if the prey encodes a known binding partner of the bait. For partial sequences, which often arise from incomplete cDNA libraries, strategies include using rapid amplification of cDNA ends (RACE) to extend the 5' or 3' ends or screening overlapping cDNA clones from the library to reconstruct full-length genes. Novel sequences without strong database matches require de novo assembly, potentially followed by predictive modeling of protein structure or function via tools like homology modeling, and experimental confirmation through co-immunoprecipitation to establish the interaction. These approaches ensure comprehensive recovery even for uncharacterized proteins, though false positives from sequencing artifacts or non-specific interactions necessitate orthogonal validation.
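The in-frame check mentioned above (excluding out-of-frame inserts or inserts with early stop codons) can be sketched in a few lines. This is an illustrative example using the standard genetic code; the function names and the idea of passing the frame offset relative to the activation-domain junction are assumptions made for the sketch, not part of any specific screening pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_in_frame(insert: str, frame: int = 0) -> str:
    """Translate an insert starting at the given offset (0, 1 or 2).

    Unknown codons (e.g. containing N) are reported as 'X'; stops as '*'.
    """
    seq = insert.upper()[frame:]
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def residues_before_stop(insert: str, frame: int = 0) -> int:
    """Number of residues encoded before the first stop codon in this frame."""
    protein = translate_in_frame(insert, frame)
    stop = protein.find("*")
    return len(protein) if stop == -1 else stop


if __name__ == "__main__":
    insert = "ATGGCCTAAGGT"  # hypothetical prey insert read from the AD junction
    for frame in range(3):
        print(frame, translate_in_frame(insert, frame), residues_before_stop(insert, frame))
```

An insert that encodes only a handful of residues before a stop in the fusion frame is a candidate false positive and would normally be set aside before database searches.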

Controlling sensitivity

In two-hybrid screening, sensitivity is primarily controlled through the selection of reporter genes that enable stringent detection of protein-protein interactions while minimizing false positives and negatives. Commonly used reporters include HIS3, which restores histidine biosynthesis and allows growth on histidine-deficient media; ADE2, which complements adenine auxotrophy for enhanced stringency; and lacZ, which produces β-galactosidase for colorimetric or enzymatic readout. Employing multiple reporters in a single yeast strain, such as the EGY48 or PJ69-2A systems, increases specificity by requiring activation of at least two independent genes for positive selection, thereby reducing the likelihood of spurious interactions passing through. For instance, HIS3 provides initial nutritional selection, while lacZ offers a quantitative confirmation, ensuring that only robust interactions are identified.

Counterscreens are essential to address common artifacts like auto-activation, where the bait fusion protein alone activates transcription, or toxicity from overexpressed fusions. Auto-activation tests involve transforming yeast with the bait construct alone and plating on selective media lacking histidine; growth indicates non-specific activation, which can be mitigated by titrating 3-amino-1,2,4-triazole (3-AT), a competitive inhibitor of the HIS3 product, at concentrations from 0.5 mM to 50 mM to suppress basal expression and raise the interaction threshold. Toxicity assays assess whether the bait or prey inhibits growth by comparing transformation efficiency on non-selective versus selective media, often resolved using low-copy plasmids or inducible promoters. Additional growth condition adjustments, such as varying 3-AT levels during library screening, fine-tune sensitivity to balance detection of weak interactions against background noise.

Further optimization involves sensitivity tuning through domain swapping or targeted mutagenesis in the bait or prey fusions to adjust interaction thresholds.
Swapping the fusion orientation (e.g., from N-terminal to C-terminal GAL4) or deleting acidic activation domains in the bait can eliminate auto-activation while leaving the interaction surface of the protein of interest intact, allowing detection of subtler interactions. Mutations, such as those introducing linker flexibility or altering fusion boundaries, refine specificity by modulating the effective concentration of interacting domains. For quantitative assessment of interaction strength, β-galactosidase assays measure lacZ expression levels, where activity (e.g., in Miller units) correlates with interaction affinity: strong interactions yield high activity within 2 hours, while weaker ones require overnight incubation. This provides a scalable metric for ranking candidates beyond binary selection. These approaches collectively enhance the reliability of two-hybrid results by integrating qualitative selection with quantitative validation.
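As an illustration of the Miller-unit readout mentioned above, the sketch below applies the classic Miller formula, 1000 × (OD420 − 1.75 × OD550) / (t × V × OD600), with t in minutes and V in millilitres of culture assayed. The function name, the candidate labels, and the absorbance values are hypothetical; many yeast protocols omit the OD550 scattering correction, which is why it defaults to zero here.

```python
def miller_units(od420: float, od600: float, minutes: float,
                 volume_ml: float, od550: float = 0.0) -> float:
    """Beta-galactosidase activity in Miller units.

    od420     -- absorbance of the o-nitrophenol product
    od600     -- culture density at the time of assay
    minutes   -- reaction time
    volume_ml -- volume of culture used in the reaction
    od550     -- optional light-scattering correction (0 disables it)
    """
    if od600 <= 0 or minutes <= 0 or volume_ml <= 0:
        raise ValueError("od600, minutes and volume_ml must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


if __name__ == "__main__":
    # Hypothetical readings for three bait-prey pairs: (OD420, OD600, minutes, mL).
    readings = {
        "pair A": (0.80, 1.0, 20, 0.1),
        "pair B": (0.15, 1.1, 120, 0.1),
        "empty-vector control": (0.02, 1.0, 120, 0.1),
    }
    ranked = sorted(readings, key=lambda name: miller_units(*readings[name]), reverse=True)
    for name in ranked:
        print(f"{name}: {miller_units(*readings[name]):.1f} Miller units")
```

Ranking candidates against an empty-vector control in this way gives a graded measure of reporter activity rather than a yes/no growth call.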

Variants

Non-fusion systems

Non-fusion systems in two-hybrid screening modify the standard approach to detect protein-protein interactions without requiring the test proteins to be fused to both a DNA-binding domain and an activation domain, thereby facilitating the study of native protein conformations and localizations. These variants leverage endogenous transcriptional machinery or alternative signaling pathways, such as the intrinsic activation domains of certain prey proteins or operator-linked reporter constructs that respond to protein-mediated modulation without additional fusion tags. This allows interactions to be monitored in contexts closer to physiological conditions, particularly for proteins lacking suitable fusion sites or prone to misfolding when tagged.

A key advantage of non-fusion systems lies in their suitability for non-nuclear proteins, which may not interact properly when artificially localized to the nucleus in traditional yeast two-hybrid setups. For instance, the bacterial two-hybrid system employing a DNA-binding repressor enables screening in prokaryotic hosts, bypassing eukaryotic nuclear requirements and accommodating bacterial or prokaryotic-specific proteins. In this mechanism, the interaction between bait and prey promotes the binding of hybrid repressor molecules to dual operators upstream of reporter genes, resulting in transcriptional repression or activation without relying on a separate activation domain fusion; the repressor's intrinsic DNA-binding and dimerization properties drive the signal. This system, developed by Joung et al., has been widely adopted for high-throughput interaction mapping in bacteria, offering advantages like simpler genetics and reduced costs compared to yeast-based methods.

The SOS recruitment system represents a prominent yeast-based variant tailored for cytosolic and peripheral proteins.
Here, the bait protein is anchored to the inner plasma membrane via a myristoylation signal, while the prey is fused only to the catalytic domain of human Sos (a Ras guanine nucleotide exchange factor); interaction recruits Sos to the membrane, activating endogenous Ras signaling, which in turn yields a selectable growth readout. Introduced by Aronheim et al. (1997), this system circumvents nuclear translocation issues, allowing detection of interactions in the cytoplasm, where many signaling proteins function natively. Despite these benefits, non-fusion systems generally suffer from reduced signal strength relative to fusion-based counterparts, as the reliance on indirect recruitment or endogenous components can amplify noise from basal activity or weaken the interaction-dependent response, necessitating optimized reporters or selection conditions for reliable results.

Split-ubiquitin and membrane protein detection

The split-ubiquitin two-hybrid system adapts the conventional yeast two-hybrid approach to detect protein-protein interactions occurring at cellular membranes, particularly for integral membrane and non-soluble proteins that cannot easily translocate to the nucleus. In this variant, the ubiquitin protein is divided into its N-terminal half (Nub) and C-terminal half (Cub). The bait protein, typically a transmembrane protein, is fused to Cub, which is in turn linked to a transcription factor such as LexA-VP16. The prey protein is fused to Nub, often carrying the I13G mutation (NubG), which lowers its affinity for Cub and thereby minimizes spontaneous reassembly and background activation.

When the bait and prey proteins interact in their native membrane environment, the Nub and Cub moieties reassemble to form a pseudo-ubiquitin structure attached to the transcription factor. This reconstituted ubiquitin is recognized by cytosolic deubiquitinating enzymes, which cleave the fusion precisely at the C-terminal boundary of Cub, releasing the transcription factor. The free transcription factor then translocates to the nucleus, where it binds to upstream activating sequences and activates reporter genes such as HIS3, ADE2, or lacZ, allowing selection of interacting partners through yeast growth on nutrient-deficient media or colorimetric assays. This mechanism avoids the need for nuclear import of membrane proteins, enabling detection of interactions in their physiological topology and context.

The foundational split-ubiquitin concept was introduced in 1994 for monitoring cytosolic interactions but was adapted for membrane proteins in 1998 using yeast as the host organism. This adaptation proved particularly valuable for studying transmembrane proteins, such as G-protein coupled receptors and ion channels, where traditional two-hybrid systems fail due to solubility and localization constraints.
For instance, it has facilitated mapping of interactions in yeast plasma membrane complexes and heterologous mammalian membrane proteins expressed in yeast. A prominent variant is the Membrane Yeast Two-Hybrid (MYTH) system, refined in the early 2000s by the Stagljar laboratory to improve detection of topology-dependent interactions. MYTH incorporates the NubG mutation and multiple reporter genes to reduce false positives and support screening of cDNA libraries for interactors, making it suitable for dissecting signaling pathways and protein complexes in native orientations. This system has been widely adopted for its ability to detect both stable and transient interactions, with applications extending to targets involving membrane-bound proteins.

Fluorescent and enzymatic assays

The fluorescent two-hybrid (F2H) assay represents an optical variant of the two-hybrid principle designed for visualization of protein-protein interactions in living cells. In this approach, a bait protein fused to a fluorescent protein and the Lac repressor is tethered to an integrated lac operator array in the genome, creating a distinct fluorescent spot observable by fluorescence microscopy; a fluorescently tagged prey protein co-localizes at this spot only if it interacts with the bait, allowing assessment of interactions across subcellular compartments such as the nucleus, cytoplasm, and mitochondria. Developed in 2008, the F2H assay excels in mammalian cell lines like HEK293T and U2OS, enabling real-time observation of dynamic interactions, including cell cycle-dependent associations and responses to external stimuli, without relying on transcriptional activation. Its microscopy-based readout provides spatial context and simplifies verification of high-throughput screen hits by distinguishing specific co-localization from nonspecific fluorescence. For quantitative analysis, F2H readouts can be measured at the single-cell level, often through fluorescence intensity correlations that correlate with binding affinity.

Enzymatic two-hybrid variants employ kinase-mediated phosphorylation to generate measurable activity as an interaction readout, offering enhanced sensitivity for low-abundance or transient associations. The kinase substrate sensor (KISS) system, for instance, fuses the bait to a kinase such as TYK2 and the prey to a phosphotyrosine-binding domain on a scaffold; interaction recruits the kinase to phosphorylate STAT3 docking sites, activating a reporter via JAK-STAT signaling for luminescent detection. Introduced in 2014, KISS operates effectively in mammalian cells, accommodating post-translational modifications and transmembrane proteins like GPCRs, while supporting real-time monitoring of interaction dynamics modulated by stimuli, though reporter expression kinetics limit resolution to hours.
This enzymatic cascade provides a quantifiable signal proportional to interaction strength, facilitating screening in formats like 96-well plates. These fluorescent and enzymatic assays expand two-hybrid capabilities beyond yeast-based transcriptional outputs, enabling studies in physiologically relevant mammalian systems with direct, non-invasive readouts.

Sequencing-based approaches

Sequencing-based approaches integrate next-generation sequencing (NGS) with two-hybrid systems to enable high-throughput identification of protein-protein or protein-DNA interactions, surpassing the limitations of traditional colony-based readouts by quantifying interactions at scale through barcode enrichment or direct sequencing of interactors. In yeast one-hybrid sequencing (Y1HS), a variant of the two-hybrid system, DNA bait sequences are integrated upstream of a reporter gene in yeast, and a library of transcription factors (TFs) is screened for binding; interacting TFs activate the reporter, and NGS is used to sequence and identify enriched TF open-reading frames (ORFs) from pooled selections, facilitating genome-wide mapping of TF-DNA binding sites without reliance on predefined motifs. This method achieves high precision, with approximately 90% of identified interactions confirmed by independent motif scanning, and has been applied to mouse TFs, revealing novel regulatory interactions such as RFX2 and ONECUT2 binding to the Mcts2-Id1 enhancer.

For protein-protein interactions, deep sequencing in yeast two-hybrid (Y2H) employs barcode-labeled libraries where unique DNA barcodes are fused to bait and prey ORFs; upon mating and selection, NGS quantifies interaction strength by counting co-enriched barcode pairs, enabling quantitative proteomics in pooled formats. Seminal implementations include Barcode Fusion Genetics-Y2H (BFG-Y2H), which screens full matrices of protein pairs in multiplexed pools to generate comprehensive interactomes, and massively parallel PPI measurement by sequencing (MP3-seq), which simplifies high-throughput Y2H for thousands of interactions with minimal hands-on time. These approaches offer scalability for proteome-wide screens, processing millions of potential interactions in parallel, and incorporate error correction through biological replicates and statistical modeling to reduce false positives.
Recent applications include human interactome mapping projects, such as extensions of the HI-III dataset, where pooled Y2H with NGS has identified thousands of binary interactions to refine disease-associated networks.
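The idea of scoring interactions by counting co-enriched barcode pairs can be illustrated with a toy calculation. The sketch below is not the BFG-Y2H or MP3-seq analysis pipeline; it assumes reads have already been reduced to hypothetical (bait barcode, prey barcode) tuples and computes a simple pseudocount-smoothed log2 enrichment between pre-selection and post-selection pools.

```python
import math
from collections import Counter


def pair_enrichment(pre_selection, post_selection, pseudocount=1.0):
    """Log2 enrichment of each barcode pair after selection.

    pre_selection / post_selection -- iterables of (bait_barcode, prey_barcode)
    pseudocount -- added to every pair so unseen pairs do not divide by zero
    """
    pre, post = Counter(pre_selection), Counter(post_selection)
    pairs = set(pre) | set(post)
    if not pairs:
        return {}
    pre_total = sum(pre.values()) + pseudocount * len(pairs)
    post_total = sum(post.values()) + pseudocount * len(pairs)
    return {
        pair: math.log2(((post[pair] + pseudocount) / post_total)
                        / ((pre[pair] + pseudocount) / pre_total))
        for pair in pairs
    }


if __name__ == "__main__":
    # Toy read lists; barcode names are placeholders.
    before = [("bait1", "preyA")] * 50 + [("bait1", "preyB")] * 50
    after = [("bait1", "preyA")] * 90 + [("bait1", "preyB")] * 10
    scores = pair_enrichment(before, after)
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 2))
```

Real pipelines additionally handle sequencing errors in barcodes, replicate variability, and auto-activating baits, which this toy score does not attempt to model.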

One-, three-, and reverse hybrid variants

The one-hybrid system modifies the two-hybrid approach to identify proteins that bind specific DNA sequences, thereby detecting DNA-protein interactions. In this variant, the bait consists of a DNA sequence of interest placed upstream of a reporter gene promoter, while the prey comprises a library of cDNA fragments fused to a transcriptional activation domain, such as that from VP16 or B42. Proteins capable of binding the DNA bait recruit the activation domain to the promoter, driving reporter gene expression and enabling selection of DNA-binding transcription factors or other regulatory proteins. This method was pioneered in the early 1990s, with Wilson et al. (1991) first applying it to isolate the DNA binding site for the transcription factor NGFI-B through genetic selection in yeast. Subsequent advancements, including Li and Herskowitz (1993), used the system to clone ORC6, a component of the origin recognition complex, by screening for proteins binding to ARS1 DNA sequences. The one-hybrid has facilitated mapping of regulatory elements and transcription factor binding sites in eukaryotic genomes, contributing to understanding gene regulatory networks. The three-hybrid system extends the two-hybrid framework to probe ternary interactions, particularly those involving RNA-protein complexes, such as protein-RNA-protein assemblies critical for processes like mRNA stability and splicing. It incorporates three components: a DNA-binding domain hybrid (e.g., LexA), an RNA-binding hybrid protein fused to a transcriptional activation domain (e.g., MS2 coat protein-VP16), and a chimeric RNA molecule containing the RNA bait of interest ligated to binding sites for the RNA-binding domain (e.g., MS2 stem-loops). The RNA bridges the two hybrids, reconstituting the transcription factor and activating reporters like HIS3 or LacZ. Independently developed in 1996, SenGupta et al. 
from the Wickens and Fields labs described a system using bifunctional RNA to detect interactions between iron response element RNA and iron regulatory protein 1. Concurrently, McWhirter and Wang (1996) introduced a GAL4-based tri-hybrid for analyzing RNA-protein interactions in cell cycle regulation. This variant has been instrumental in elucidating RNA-mediated regulatory networks, including those in post-transcriptional control.

The reverse two-hybrid system, sometimes termed the one-two-hybrid, inverts the selection logic of the standard two-hybrid to identify conditions disrupting protein-protein interactions, such as loss-of-function mutations or small-molecule inhibitors. Here, interaction between bait and prey activates a "negative" reporter, typically a counterselectable marker like CYH2 (conferring cycloheximide sensitivity) or URA3 (enabling 5-fluoroorotic acid toxicity), so that only non-interacting pairs survive under selective conditions. This enables mapping of interaction interfaces or screening for disruptors. Vidal et al. (1996) established the reverse two-hybrid using integrated reporters in yeast to select for dissociation of protein-protein or DNA-protein complexes, demonstrating its utility for structure-function analysis. Building on this, Licitra and Liu (1996) refined the selection to isolate mutants defective in specific interactions, such as those in the RAS pathway. These systems have advanced studies of regulatory networks by pinpointing critical residues and interaction modulators in signaling pathways.
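The inversion of selection logic between forward and reverse screens can be made explicit with a small sketch. The function name, the mode strings, and the variant labels below are hypothetical and serve only to illustrate which colonies are expected to survive in each scheme.

```python
def colony_survives(interaction: bool, mode: str) -> bool:
    """Idealized survival rule under selective conditions.

    mode="forward": interaction switches on a gene required for growth.
    mode="reverse": interaction switches on a counterselectable (toxic) marker.
    """
    if mode == "forward":
        return interaction
    if mode == "reverse":
        return not interaction
    raise ValueError("mode must be 'forward' or 'reverse'")


if __name__ == "__main__":
    # Hypothetical bait variants and whether each still binds the prey.
    variants = {"wild type": True, "interface mutant": False, "silent mutant": True}
    for mode in ("forward", "reverse"):
        survivors = [name for name, binds in variants.items()
                     if colony_survives(binds, mode)]
        print(f"{mode} selection keeps: {survivors}")
```

In the reverse scheme the survivors are exactly the variants that lost the interaction, which is why it suits screens for disrupting mutations or inhibitors.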

Host organisms

Yeast systems

The yeast two-hybrid (Y2H) system predominantly employs Saccharomyces cerevisiae as the host organism, leveraging its eukaryotic cellular environment to facilitate protein expression, nuclear localization, and initial post-translational processing akin to higher eukaryotes. This setup, first described by Fields and Song in 1989, reconstructs protein interactions within the yeast nucleus to activate reporter genes under the control of hybrid transcription factors. Common strains include AH109 (MATa, ura3-52, his3-200, ade2-101, trp1-901, leu2-3, 112, LYS2::GAL1UAS-GAL1TATA-HIS3, GAL2UAS-GAL2TATA-ADE2, URA3::MEL1UAS-MEL1TATA-lacZ) and Y190 (MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3, 112, gal4Δ, gal80Δ, URA3::MEL1UAS-MEL1TATA-lacZ, LYS2::GAL1UAS-GAL1TATA-HIS3), both featuring integrated reporter genes responsive to GAL4-based interactions. AH109 integrates three reporters, HIS3, ADE2, and MEL1 (encoding α-galactosidase), under distinct GAL4 upstream activating sequences to minimize false positives through stringent selection. Y190, in contrast, primarily uses HIS3 and lacZ reporters for histidine prototrophy and β-galactosidase activity, respectively, enabling dual readout confirmation of interactions.

Key advantages of S. cerevisiae stem from its well-characterized genetics and eukaryotic machinery, which support proper protein folding and the detection of weak or transient interactions that might evade detection in prokaryotic systems. Auxotrophic markers such as TRP1 (typically on bait plasmids) and LEU2 (on prey plasmids) enable co-selection of transformants on synthetic dropout media lacking tryptophan and leucine, ensuring stable maintenance without antibiotics. Additionally, the mating compatibility of haploid strains, such as a-mating AH109 with α-mating Y187, facilitates high-throughput library screening by generating diploids that combine bait and prey libraries efficiently, often yielding millions of potential interactors in a single cross.
Standard protocols involve high-efficiency transformation via the lithium acetate method, achieving up to 10^5 transformants per microgram of DNA, followed by plating on selective dropout media to isolate co-transformants. Reporter activation is then assessed by replica plating onto media lacking histidine (for HIS3) or adenine (for ADE2), where growth indicates interaction-mediated prototrophy; colorimetric detection uses X-gal overlay for lacZ (blue colonies) or α-gal for MEL1. These steps, often performed at 30°C for 3–5 days, align with yeast shuttle plasmids that ensure compatibility and replication in both yeast and E. coli. Despite these strengths, a notable limitation is the absence of mammalian-specific post-translational modifications, such as certain glycosylations or phosphorylations, which can prevent detection of interactions dependent on such events.
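The idea of requiring several independent reporters before calling an interaction can be sketched in a few lines of Python. This is a minimal illustration, not a standard scoring scheme: the reporter names follow the strains described above, while the function names and the threshold of two reporters are assumptions chosen for the example.

```python
def count_active_reporters(readouts: dict[str, bool]) -> int:
    """Count how many reporters scored positive for one colony."""
    return sum(1 for active in readouts.values() if active)


def call_interaction(readouts: dict[str, bool], min_reporters: int = 2) -> bool:
    """Call an interaction only if enough independent reporters are active.

    `min_reporters` is a hypothetical stringency setting; a higher value
    trades sensitivity for fewer false positives.
    """
    return count_active_reporters(readouts) >= min_reporters


if __name__ == "__main__":
    # Hypothetical colony: grows without histidine and adenine,
    # but shows no α-galactosidase (MEL1) signal.
    colony = {"HIS3": True, "ADE2": True, "MEL1": False}
    print(call_interaction(colony))                    # two of three reporters
    print(call_interaction(colony, min_reporters=3))   # stricter call
```

Raising the threshold mirrors the use of multiple reporters under distinct promoters: a colony that activates only one reporter is more likely to reflect promoter-specific background than a genuine interaction.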

Bacterial systems

Bacterial two-hybrid screening uses Escherichia coli as the primary host organism, leveraging its prokaryotic simplicity for detecting protein-protein interactions. Commonly used strains include cloning hosts carrying the lacZΔM15 mutation, which supports blue-white screening, for plasmid cloning and propagation, and BL21 for protein expression, because its protease-deficient genotype minimizes degradation of fusion proteins. Strains engineered with lacI^q, such as derivatives of XL1-Blue, provide enhanced repression of lac promoter-driven expression to reduce background activation and improve assay specificity.

The core of bacterial two-hybrid systems involves transcriptional activation or repression mediated by protein interactions. In the cI-based system, originally developed by Hu et al., one protein is fused to the DNA-binding domain of the λ cI repressor, and the other to the dimerization domain of cI or of the related 434 repressor; interactions promote cooperative binding to operator sites, altering the expression of reporters like lacZ. Alternatively, the AraC-based system, described by Kornacker et al., fuses proteins to AraC (an arabinose-responsive activator) and to a second DNA-binding protein; interactions induce DNA looping that inhibits AraC-mediated transcription of reporters such as lacZ, allowing detection of interactions through loss of reporter activity. These setups enable quantitative assessment via β-galactosidase activity from lacZ.

Advantages of bacterial systems include faster growth (doubling times of typically 20-30 minutes versus 90-120 minutes in yeast), which facilitates screening of large libraries (>10^8 transformants), and easier genetic manipulation due to high transformation efficiency and cost-effective media. However, challenges arise from prokaryotic expression, such as the formation of inclusion bodies when fusion proteins misfold or aggregate, which can mask true interactions and lead to false negatives.

For recovery of interacting plasmids, protocols often incorporate blue-white screening with X-gal and IPTG; interacting hybrids activate lacZ, producing blue colonies on indicator plates, while non-interactors remain white, allowing direct selection and isolation of positives prior to sequencing or retransformation, as outlined in general protein recovery methods.
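Quantitative β-galactosidase readouts are commonly reported in Miller units. The text above does not say which quantification a given bacterial two-hybrid protocol uses, so the following Python sketch assumes the classic Miller formula; the function name and the example absorbance values are hypothetical.

```python
def miller_units(a420: float, a550: float, a600: float,
                 minutes: float, volume_ml: float) -> float:
    """Compute β-galactosidase activity in Miller units.

    Classic formula: 1000 * (A420 - 1.75 * A550) / (t * v * A600), where
    A420 measures the o-nitrophenol product, A550 corrects for light
    scattering by cell debris, A600 is the culture density, t is the
    reaction time in minutes, and v is the culture volume assayed in mL.
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("minutes, volume_ml and a600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


if __name__ == "__main__":
    # Hypothetical readings for an interacting pair and an empty-vector control.
    interacting = miller_units(a420=0.50, a550=0.0, a600=0.5,
                               minutes=10, volume_ml=0.1)
    control = miller_units(a420=0.05, a550=0.0, a600=0.5,
                           minutes=10, volume_ml=0.1)
    print(f"interacting: {interacting:.0f} MU, control: {control:.0f} MU, "
          f"fold change: {interacting / control:.1f}")
```

Comparing each pair with a non-interacting control in this way gives a fold change, which is easier to compare between experiments than raw absorbance.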

Mammalian and other eukaryotic systems

Two-hybrid screening has been adapted for mammalian cells to enable detection of protein-protein interactions in a cellular environment that more closely mimics human physiology, particularly for studying post-translational modifications, such as phosphorylation and glycosylation, that are often absent in yeast. A common implementation uses human embryonic kidney (HEK293) cells transfected with constructs expressing fusion proteins linked to a DNA-binding domain and an activation domain, driving reporter expression upon interaction. This system offers advantages for analyzing human proteins, as the mammalian host supports native folding and modifications, reducing false negatives from improper processing.

Delivery of two-hybrid constructs in mammalian systems typically relies on transient transfection methods, such as lipofection, or stable integration via lentiviral vectors to achieve high expression levels in diverse cell types. However, challenges include potential cytotoxicity from overexpression of fusion proteins, which can lead to non-specific activation or cell death, necessitating optimized low-dose protocols and viability controls.

Beyond mammalian cells, two-hybrid approaches have been developed for other eukaryotes to address tissue-specific or organismal relevance. In plants, a protoplast two-hybrid (P2H) system in mesophyll protoplasts uses GAL4-based fusions to detect interactions via reporter activation, allowing transient assays in a native plant cellular context without stable transformation. For neural studies, an Aplysia two-hybrid system in cultured neurons has been employed to examine interactions among transcription factors like ApCREB1a, ApCREB2, and ApC/EBP involved in facilitation, providing insights into molluscan neuronal signaling. In insects, two-hybrid screening in cultured silkworm cells, such as those derived from ovarian tissue, facilitates analysis of silk-related or developmental protein interactions using reporter systems adapted for lepidopteran hosts. Additionally, a two-hybrid (C2H) system enables analysis of protein interactions, including those between effectors and host factors, supporting studies of host-pathogen dynamics in eukaryotic models.

Applications

Protein interaction mapping

Two-hybrid screening has been instrumental in conducting genome-wide analyses to map protein-protein interactions (PPIs) on a large scale, particularly in yeast. One seminal study employed a matrix-based approach, testing 192 bait proteins against approximately 6,000 activation domain (AD) fusions derived from the Saccharomyces cerevisiae genome and identifying 281 interactions, while a complementary pooled library screen yielded 692 interactions, for a combined total of 957 putative PPIs involving 1,004 proteins. This effort highlighted the system's capacity for systematic binary interaction discovery, revealing connections that place unclassified proteins into functional contexts, such as DNA repair and cell cycle regulation. Building on this, a more comprehensive screen systematically assayed all pairwise combinations among 6,000 open reading frames, detecting 4,549 interactions among 3,278 proteins and uncovering unexpected patterns, including multi-protein complexes and hub proteins with high connectivity.

These large-scale two-hybrid datasets have facilitated the construction of protein interaction networks, enabling the visualization of interactomes as graphs in which nodes represent proteins and edges denote binary interactions. By integrating two-hybrid results with other experimental data, such as affinity purification-mass spectrometry, researchers have generated comprehensive PPI maps that reveal modular network structures, such as densely connected modules corresponding to cellular processes. For instance, two-hybrid-derived interactions have been incorporated into large interaction databases, one of which combines over 2 billion PPIs across organisms by fusing high-throughput screens with curated knowledge, allowing users to query and analyze interaction confidence scores and functional enrichments. This integration supports network-based predictions, such as identifying essential genes or pathway perturbations, while emphasizing the binary nature of two-hybrid outputs as direct, physical contacts amenable to graph-theoretic analysis.

To ensure reliability, interactions from two-hybrid screens are typically validated using orthogonal biochemical methods, such as co-immunoprecipitation (co-IP), which confirms associations under native conditions, though the primary value lies in the high-throughput generation of candidate binary pairs for further study.

At a finer resolution, two-hybrid screening excels in domain-level mapping, where truncated or mutated constructs pinpoint minimal interaction motifs. A classic example is the delineation of Ras-Raf binding regions, where two-hybrid assays with deletion mutants identified the Ras effector region (residues 32-40) and the Raf Ras-binding domain (RBD, residues 51-131) as sufficient for interaction, establishing a paradigm for motif-specific PPI characterization. Such approaches have since been refined for high-precision mapping, aiding in the design of interaction-targeted interventions.
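The graph view of an interactome described above can be sketched with plain Python dictionaries. This is a minimal illustration under stated assumptions: the protein names are placeholders, the helper names are hypothetical, and the degree cutoff used to call a hub is an arbitrary choice for the example rather than a published criterion.

```python
def build_network(pairs):
    """Build an undirected interaction graph from binary bait/prey pairs.

    Returns a dict mapping each protein to the set of its partners.
    Self-interactions keep the protein in the graph but add no edge.
    """
    network = {}
    for a, b in pairs:
        network.setdefault(a, set())
        network.setdefault(b, set())
        if a != b:
            network[a].add(b)
            network[b].add(a)
    return network


def hubs(network, min_degree):
    """Return proteins with at least `min_degree` distinct partners."""
    return sorted(protein for protein, partners in network.items()
                  if len(partners) >= min_degree)


if __name__ == "__main__":
    # Placeholder pairs standing in for two-hybrid hits.
    pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "E")]
    network = build_network(pairs)
    for protein in sorted(network):
        print(protein, sorted(network[protein]))
    print("hubs:", hubs(network, min_degree=3))
```

Because each two-hybrid hit is a binary pair, it maps directly onto one edge, which is why these datasets lend themselves to degree counts and other graph-theoretic summaries.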

Functional protein analysis

Two-hybrid screening enables the functional dissection of proteins by identifying specific sequences essential for their interactions and activities, such as domains or motifs that mediate binding. Through systematic deletion mapping, researchers generate truncation mutants of the bait or prey proteins and assess their ability to restore reporter gene activation in the two-hybrid system. This approach pinpoints minimal interaction regions by revealing which truncations abolish or retain binding, thereby delineating functional domains like the Src homology 3 (SH3) domain, which was mapped in early studies of signaling proteins to define proline-rich peptide recognition sites critical for protein recruitment.

Mutagenesis screens further refine these insights by introducing targeted substitutions to probe the contributions of individual residues to protein function. Alanine-scanning mutagenesis, in which key positions are replaced with alanine to disrupt potential interactions without altering overall structure, is particularly effective in the two-hybrid context; loss of reporter activity identifies residues forming the interaction interface, such as conserved hotspots in antibody-antigen complexes that dictate binding affinity. This method has been applied to delineate energetic hotspots in protein-protein interfaces, confirming their role in stabilizing complexes and providing a basis for understanding regulatory mechanisms. A notable example involves zinc finger proteins, where two-hybrid variants, including bacterial systems, have been used to determine DNA-binding specificity by screening mutant libraries against target sequences; this revealed how specific residues in the zinc finger alpha-helix confer selectivity for nucleotide triplets, guiding the engineering of custom transcription factors.

Integrating two-hybrid data with computational structure prediction enhances functional interpretation, as interaction motifs identified experimentally can be modeled in three-dimensional contexts to predict conformational changes or allosteric effects upon binding. For instance, combining two-hybrid-derived domain boundaries with homology modeling has illuminated how truncation of regulatory elements alters protein conformation, offering insights into disease-associated mutations that disrupt function.
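Deletion mapping can be reduced to simple interval arithmetic. The sketch below assumes that a single contiguous region is both necessary and sufficient for binding, which will not hold for every protein; the function names and all fragment coordinates are hypothetical.

```python
def minimal_region(fragments):
    """Estimate the minimal interacting region from deletion constructs.

    `fragments` is a list of (start, end, interacts) tuples in residue
    numbering. Assuming one contiguous region is required, it must lie in
    the overlap of every fragment that still interacts.
    Returns (start, end), or None if the positives share no common overlap.
    """
    positives = [(start, end) for start, end, interacts in fragments if interacts]
    if not positives:
        return None
    region_start = max(start for start, _ in positives)
    region_end = min(end for _, end in positives)
    if region_start > region_end:
        return None
    return (region_start, region_end)


def inconsistent_negatives(fragments, region):
    """List non-interacting fragments that fully contain the region.

    Such fragments contradict the single-region assumption, for example
    because of misfolding or poor expression of the construct.
    """
    region_start, region_end = region
    return [(start, end) for start, end, interacts in fragments
            if not interacts and start <= region_start and end >= region_end]


if __name__ == "__main__":
    # Hypothetical truncation series of a 400-residue bait.
    series = [(1, 400, True), (1, 200, True), (80, 400, True),
              (80, 200, True), (201, 400, False)]
    region = minimal_region(series)
    print("minimal region:", region)
    print("conflicting negatives:", inconsistent_negatives(series, region))
```

Checking the negatives against the inferred region is a cheap sanity test: a truncation that contains the whole region yet fails to interact suggests a construct problem rather than a true boundary.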

Drug discovery and selection

The reverse two-hybrid system modifies the standard two-hybrid approach to select for molecules that disrupt protein-protein interactions: cells gain a growth advantage when the toxic reporter gene activated by the undesired interaction is switched off. This method is particularly useful for identifying inhibitors of toxic or pathogenic protein complexes, such as those involved in disease progression, by screening libraries of small molecules or peptides that prevent bait-prey binding. For instance, the system has been engineered to detect dissociation events, enabling genetic selection against specific interactions in yeast strains where interaction triggers lethality via reporters like URA3 or CYH2, so that survivors are enriched for disruptors.

High-throughput adaptations of two-hybrid and reverse two-hybrid systems facilitate screening of large chemical libraries against predefined bait-prey pairs to identify small-molecule modulators of protein interactions relevant to therapeutics. In these assays, cells expressing interacting proteins are exposed to compound libraries, and the interaction-dependent reporter output (e.g., growth inhibition) is used to select hits that restore viability or alter the readout by disrupting the pair. Such screens have been scaled to evaluate thousands of compounds, prioritizing those with specificity for disease-related targets while minimizing off-target effects.

A notable application involved screening for inhibitors of viral protease dimerization, where a modified two-hybrid system was used for genetic selection of peptides from a combinatorial library that inhibit the intracellular dimerization essential for viral maturation. This approach identified disruptors with micromolar potency that blocked protease activity and informed subsequent inhibitor development. More recently, two-hybrid-based screening has been applied to discover inhibitors of SARS-CoV-2 spike protein-ACE2 interactions, supporting therapeutic development against variants (as of 2025).

In zinc finger engineering, iterative two-hybrid selections enhance DNA-binding specificity by evolving C2H2 zinc finger proteins through cycles of mutagenesis and screening against target sequences. Using bacterial two-hybrid systems, libraries of zinc finger variants are expressed in cells carrying the target operator sequences, selecting for fingers that activate reporters only on specific DNA sites while avoiding off-target binding. This approach has produced multi-finger arrays with high affinity and specificity, such as those recognizing 9-bp targets, advancing applications in gene regulation.
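Hit calling in a reverse two-hybrid compound screen can be sketched as a comparison against vehicle-only wells. The example below is a simplified illustration: the function name, the three-fold rescue threshold, the use of a second strain carrying an unrelated interacting pair as a specificity control, and all readings are assumptions made for the example.

```python
def call_hits(readings, dmso_target, dmso_unrelated, rescue_fold=3.0):
    """Select compounds that rescue growth of the target strain only.

    `readings` maps a compound name to (od_target, od_unrelated): culture
    density under counterselection for the strain carrying the target pair
    and for a control strain carrying an unrelated interacting pair.
    A compound that rescues both strains is treated as non-specific, for
    instance because it interferes with the reporter itself.
    """
    hits = []
    for name, (od_target, od_unrelated) in readings.items():
        rescues_target = od_target >= rescue_fold * dmso_target
        rescues_unrelated = od_unrelated >= rescue_fold * dmso_unrelated
        if rescues_target and not rescues_unrelated:
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical OD600 readings; vehicle-only wells barely grow.
    plate = {
        "cmpd_1": (0.60, 0.06),  # rescues the target strain only
        "cmpd_2": (0.06, 0.05),  # no effect
        "cmpd_3": (0.55, 0.50),  # rescues both strains, so non-specific
    }
    print(call_hits(plate, dmso_target=0.05, dmso_unrelated=0.05))
```

The second strain is one way to address the specificity concern raised above; real screens may use different controls and thresholds.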

Advantages and Limitations

Strengths

Two-hybrid screening enables the detection of protein-protein interactions (PPIs) in vivo, within the physiological environment of living cells, allowing interactions to occur under conditions closer to those in the organism. This contrasts with in vitro methods such as pull-down assays, which often require protein purification and may disrupt natural cellular contexts, potentially leading to false negatives or altered interaction dynamics.

The technique supports screening of large cDNA libraries, typically comprising up to 10^7 clones, facilitating the identification of novel interactors on a genome-wide scale without the need for prior knowledge of potential partners. This scalability has enabled systematic mapping of interactomes, such as the detection of thousands of PPIs across whole proteomes through automated selection protocols.

Two-hybrid systems are cost-effective because they rely on genetic selection in microbial hosts, which avoids the expense of biophysical assays that demand specialized instrumentation and purified proteins. These genetic approaches can be implemented in standard laboratories, making them accessible for widespread use in interaction discovery.

The method's versatility stems from its adaptability across diverse host organisms, including yeast, bacteria, and mammalian cells, as well as from variants like one-hybrid or reverse-hybrid systems that extend its utility to non-PPI applications while maintaining the core detection principle.

Weaknesses

One major limitation of two-hybrid screening, particularly in yeast systems, is the high rate of false positives, which can arise from sticky proteins that interact non-specifically with multiple baits or from non-specific activation of reporter genes by misfolded or out-of-frame proteins. In library-based screens, false positive rates have been reported to exceed 50% in up to 35% of studies, necessitating extensive validation with orthogonal methods. These artifacts are exacerbated by high expression levels and by interactions with reporter components, such as LexA, leading to overall accuracy estimates below 10% in some high-throughput applications.

The method also suffers from inherent biases due to the requirement for nuclear translocation of fusion proteins, which precludes detection of interactions involving membrane-bound, secreted, or compartmentalized proteins that do not localize to the nucleus in yeast. Additionally, stringent selection often misses weak or transient interactions, as these may not sufficiently activate transcription under the assay conditions. Such biases result in incomplete coverage, with detection rates for known interactions as low as 23-33% in optimized screens.

Indirect effects further complicate results: an apparent binary interaction may in fact be bridged by a third, endogenous yeast protein, mimicking direct binding while actually representing a ternary complex. This issue arises because the assay environment can promote non-physiological associations not observed in native cellular contexts.

Recent advancements as of 2025 include efforts to integrate computational tools for post-screen filtering to reduce false discovery rates, and new variants such as integrated membrane yeast two-hybrid systems to better detect interactions involving membrane proteins.
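The false-positive and coverage figures discussed above correspond to two simple ratios that can be computed once a screen has been validated. The Python sketch below is illustrative only: the function name and the toy protein pairs are hypothetical, and it assumes that orthogonal validation results and a reference set of known interactions are available.

```python
def screen_quality(hits, validated, reference):
    """Summarize a screen as (precision, sensitivity).

    hits:      pairs reported by the two-hybrid screen
    validated: pairs confirmed by an orthogonal method such as co-IP
    reference: known interactions the screen could have detected
    Pairs are treated as unordered, so (A, B) equals (B, A).
    """
    hit_set = {frozenset(pair) for pair in hits}
    validated_set = {frozenset(pair) for pair in validated}
    reference_set = {frozenset(pair) for pair in reference}
    if not hit_set or not reference_set:
        raise ValueError("hits and reference must not be empty")
    precision = len(hit_set & validated_set) / len(hit_set)
    sensitivity = len(hit_set & reference_set) / len(reference_set)
    return precision, sensitivity


if __name__ == "__main__":
    # Toy data: four reported pairs, two of which validate,
    # and four known interactions, two of which were recovered.
    hits = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
    validated = [("B", "A"), ("C", "D")]
    reference = [("A", "B"), ("X", "Y"), ("C", "D"), ("M", "N")]
    precision, sensitivity = screen_quality(hits, validated, reference)
    print(f"precision: {precision:.2f}, sensitivity: {sensitivity:.2f}")
    print(f"false discovery rate: {1 - precision:.2f}")
```

Precision reflects the false-positive problem, while sensitivity reflects the incomplete coverage caused by localization requirements and stringent selection; the two are usually reported together because tightening selection tends to improve one at the expense of the other.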
