Locus (genetics)
from Wikipedia
Parts of a typical chromosome:

(1) Chromatid
(2) Centromere
(3) Short (p) arm
(4) Long (q) arm

In genetics, a locus (pl.: loci) is a specific, fixed position on a chromosome where a particular gene or genetic marker is located.[1] Each chromosome carries many genes, with each gene occupying a different position or locus; in humans, the total number of protein-coding genes in a complete haploid set of 23 chromosomes is estimated at 19,000–20,000.[2]

Genes may possess multiple variants known as alleles, and an allele may also be said to reside at a particular locus. Diploid and polyploid cells whose chromosomes have the same allele at a given locus are called homozygous with respect to that locus, while those that have different alleles at a given locus are called heterozygous.[3] The ordered list of loci known for a particular genome is called a gene map. Gene mapping is the process of determining the specific locus or loci responsible for producing a particular phenotype or biological trait. Association mapping, also known as "linkage disequilibrium mapping", is a method of mapping quantitative trait loci (QTLs) that takes advantage of historic linkage disequilibrium to link phenotypes (observable characteristics) to genotypes (the genetic constitution of organisms), uncovering genetic associations.

Nomenclature

Cytogenetic banding nomenclature

The shorter arm of a chromosome is termed the p arm or p-arm, while the longer arm is the q arm or q-arm. The chromosomal locus of a typical gene, for example, might be written 3p22.1, where:[citation needed]

  • 3 = chromosome 3
  • p = p-arm
  • 22 = region 2, band 2 (read as "two, two", not "twenty-two")
  • 1 = sub-band 1

Thus the entire locus of the example above would be read as "three P two two point one". The cytogenetic bands are areas of the chromosome either rich in actively-transcribed DNA (euchromatin) or packaged DNA (heterochromatin). They appear differently upon staining (for example, euchromatin appears white and heterochromatin appears black on Giemsa staining). They are counted from the centromere out toward the telomeres.[citation needed]
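
The notation described above can be split mechanically into its parts. The following is a minimal sketch, not a standard tool: the names `CytoLocus`, `parse_locus`, and `describe` are hypothetical, and the pattern only handles the simple form shown here (chromosome, arm, one region digit, one band digit, optional sub-band), not ranges or "pter"/"qter".

```python
import re
from typing import NamedTuple, Optional


class CytoLocus(NamedTuple):
    """Parts of a simple cytogenetic locus such as 3p22.1 (illustrative type)."""
    chromosome: str
    arm: str
    region: int
    band: int
    sub_band: Optional[str]


# chromosome (1-2 digits, X or Y), arm (p/q), region digit, band digit, optional ".sub-band"
_LOCUS_PATTERN = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_locus(text: str) -> CytoLocus:
    """Split a locus string into chromosome, arm, region, band and sub-band."""
    match = _LOCUS_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported locus notation: {text!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return CytoLocus(chromosome, arm, int(region), int(band), sub_band)


def describe(locus: CytoLocus) -> str:
    """Render the parsed parts as a readable phrase."""
    arm_name = "short (p)" if locus.arm == "p" else "long (q)"
    parts = [
        f"chromosome {locus.chromosome}",
        f"{arm_name} arm",
        f"region {locus.region}",
        f"band {locus.band}",
    ]
    if locus.sub_band is not None:
        parts.append(f"sub-band {locus.sub_band}")
    return ", ".join(parts)


print(describe(parse_locus("3p22.1")))
```

Region and band are kept as separate digits because "22" is read as "two, two", not as the number twenty-two.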

Example of cytogenetic bands

Component | Explanation
3 | The chromosome number.
p | The position is on the chromosome's short arm (a common apocryphal explanation is that the p stands for petit in French); q indicates the long arm (chosen as the next letter in the alphabet after p; it is also said that q stands for queue, meaning "tail" in French[4]).
22.1 | The numbers that follow the letter represent the position on the arm: region 2, band 2, sub-band 1. The bands are visible under a microscope when the chromosome is suitably stained. Each band is numbered, beginning with 1 for the band nearest the centromere. Sub-bands and sub-sub-bands are visible at higher resolution.[citation needed]

A range of loci is specified in a similar way. For example, the locus of gene OCA1 may be written "11q1.4-q2.1", meaning it is on the long arm of chromosome 11, somewhere in the range from sub-band 4 of region 1 to sub-band 1 of region 2.[citation needed]

The ends of a chromosome are labeled "pter" and "qter", and so "2qter" refers to the terminus of the long arm of chromosome 2.[citation needed]

from Grokipedia
In genetics, a locus (plural: loci) is a specific, fixed physical site on a chromosome where a particular gene or DNA sequence is located, serving as a precise address within the genome. This position determines the inheritance pattern of the genetic material at that site, enabling researchers to map and study genomic features. Loci are central to understanding genetic variation, as each can harbor multiple alleles: alternative forms of the gene or sequence that differ in their nucleotide composition and may produce varying effects on an organism's phenotype. For instance, alleles at the ABO locus on chromosome 9 determine blood types through codominant or recessive interactions, illustrating how allelic differences at a single locus can influence observable traits.

Beyond individual genes, loci encompass broader DNA segments, including regulatory regions, and are crucial for identifying quantitative trait loci (QTLs), which are chromosomal regions contributing to complex, measurable traits such as disease susceptibility through cumulative allelic effects. The study of loci has revolutionized genetics, facilitating techniques such as linkage analysis to detect co-inherited markers and genome-wide association studies (GWAS) to pinpoint variants linked to traits or disorders. By revealing patterns of selection, polymorphism, and functional diversity, loci provide insights into evolutionary processes, underscoring their foundational role in modern biological research.

Definition and Basics

Definition

In genetics, a locus (plural: loci) is the specific, fixed physical position of a single gene, DNA sequence, or genetic marker on a chromosome, often described as a "genetic street address." The term derives from the Latin word locus, meaning "place" or "location," and was adopted in genetics during the early 20th century to denote precise chromosomal sites as researchers like Thomas Hunt Morgan developed the chromosomal theory of inheritance. This positional concept emerged from studies on inheritance patterns, emphasizing the stable arrangement of genetic elements along chromosomes.

Unlike a gene, which refers to a functional unit of heredity capable of coding for proteins or RNA, a locus is purely a positional designation and does not imply function. Similarly, an allele represents a variant form of a gene or sequence occurring at a particular locus, such as different versions that can influence traits, but the locus itself remains the fixed site regardless of the variant present. These distinctions highlight how loci serve as reference points in genomic mapping and analysis, independent of the function or variation at that position.

A representative example is the ABO blood group locus on the long arm of chromosome 9 (9q34.1-q34.2), where different alleles determine the A, B, AB, or O blood types by encoding enzymes that modify cell surface antigens. This locus illustrates how positional specificity enables tracking of heritable variations across generations.

Key Characteristics

A genetic locus refers to a specific, fixed position on a chromosome where a particular gene or DNA sequence is located. This positional stability is maintained through accurate DNA replication and faithful chromosome segregation during cell division, ensuring that the locus remains consistent across generations and is transmitted intact to offspring, barring rare mutational events or chromosomal rearrangements. Such stability makes loci essential anchors for genetic mapping, as their consistent chromosomal coordinates allow researchers to track inheritance patterns and variations reliably.

At any given locus, genetic variability arises from the presence of multiple alleles (alternative forms of the DNA sequence), which can differ in frequency within a population. A locus is considered polymorphic if at least two alleles occur, with the less common allele present at a frequency of 1% or greater, contributing to overall genetic diversity. Individuals may be homozygous, possessing two identical alleles at the locus, or heterozygous, with two different alleles; for a biallelic locus, heterozygosity is maximized when allele frequencies are equal (each at 0.5), promoting variability in traits and serving as a reservoir for evolutionary adaptation.

Loci exhibit Mendelian inheritance patterns, where each parent contributes one allele to the offspring via gametes, following the law of segregation to produce predictable genotypic ratios in progeny. In diploid organisms like humans, this results in inheriting exactly two alleles per autosomal locus, one from each parent, with random segregation ensuring equal transmission probabilities absent distorting factors. This mechanism underpins the stable yet variable transmission of genetic material across generations.

Functionally, loci can be coding, where the sequence directly encodes proteins (comprising approximately 1.5% of the human genome), or non-coding, encompassing regulatory elements like promoters and enhancers that modulate gene expression without producing proteins. Coding loci influence phenotypes through protein synthesis and structure, while non-coding loci exert effects via transcriptional regulation, chromatin organization, and RNA stability, often linking genetic variants to complex traits and diseases. For instance, variants in non-coding regulatory regions can alter expression levels, as seen in promoter mutations leading to overexpression in pathological conditions.
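
The polymorphism criterion and the heterozygosity claim above can be checked numerically. This is a small sketch under stated assumptions: the function names are hypothetical, the 1% cutoff is taken from the text, and expected heterozygosity is computed with the standard formula 1 minus the sum of squared allele frequencies, assuming random mating.

```python
from typing import Sequence


def is_polymorphic(allele_freqs: Sequence[float], threshold: float = 0.01) -> bool:
    """True if at least two alleles each reach the threshold frequency."""
    reaching = [f for f in allele_freqs if f >= threshold]
    return len(reaching) >= 2


def expected_heterozygosity(allele_freqs: Sequence[float]) -> float:
    """Probability that two randomly drawn alleles differ: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(f * f for f in allele_freqs)


# Scan a biallelic locus: heterozygosity peaks where both alleles are at 0.5.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: expected_heterozygosity([p, 1.0 - p]))
print(best_p, expected_heterozygosity([best_p, 1.0 - best_p]))
```

With more than two alleles the same formula applies, and the maximum is again reached when all alleles are equally frequent.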

Chromosomal and Genomic Context

Position and Location

In genetics, the position of a locus is initially described through chromosomal assignment, which identifies its location on a specific chromosome using a standardized notation that includes the chromosome number followed by the arm (p for short arm or q for long arm) and the cytogenetic band. For instance, the TP53 locus, encoding the tumor suppressor protein p53, is assigned to chromosome 17p13.1. This notation provides a coarse localization based on the visible structure of chromosomes.

Cytogenetic bands, visualized through techniques like G-banding, further refine this positioning by staining chromosomes to reveal alternating light and dark regions corresponding to euchromatin and heterochromatin. G-banding, which uses trypsin treatment followed by Giemsa staining, allows approximation of locus positions within these bands, though its resolution is limited to approximately 5-10 megabases (Mb) per band at standard 550-band resolution. This method remains essential for detecting large-scale chromosomal abnormalities but cannot pinpoint smaller structural variants.

For higher precision, molecular coordinates specify the exact base-pair position of a locus within a reference genome assembly, such as GRCh38, the current human reference assembly released by the Genome Reference Consortium. These coordinates denote the start and end positions on the chromosome; for example, the TP53 locus spans from 7,668,421 to 7,687,490 on chromosome 17 in GRCh38. This level of detail, achieved through sequencing, enables accurate alignment of genetic data across studies and is crucial for identifying variants at the nucleotide level.

The physical distance between loci on a chromosome does not always directly correspond to their genetic distance, which is influenced by recombination and crossing over during meiosis. Recombination frequency measures how often loci are separated by crossovers, with genetic distance expressed in centimorgans (cM), where 1 cM approximates a 1% recombination rate between two loci. On average, 1 cM corresponds to about 1 Mb of DNA, though this varies across the genome due to recombination hotspots and coldspots. This distinction between physical (base-pair) and genetic (recombination-based) maps is fundamental for understanding inheritance patterns.
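
The arithmetic behind these coordinates and distances is short enough to sketch. The helper names below are hypothetical; coordinates are assumed to be 1-based and inclusive, and the 1 cM per Mb figure is only the genome-wide average quoted above, so the conversion is a rough estimate rather than a real genetic map lookup.

```python
def locus_length_bp(start: int, end: int) -> int:
    """Length of a locus given 1-based inclusive start and end coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Genetic distance in cM, taken as percent recombinant offspring.

    This simple proportionality only holds for closely linked loci.
    """
    return 100.0 * recombinants / total_offspring


def rough_cm_from_bp(distance_bp: int, cm_per_mb: float = 1.0) -> float:
    """Very rough physical-to-genetic conversion using an average rate."""
    return distance_bp / 1_000_000 * cm_per_mb


# TP53 coordinates on chromosome 17 (GRCh38), as quoted in the text above.
tp53_length = locus_length_bp(7_668_421, 7_687_490)
print(tp53_length)                   # locus length in base pairs
print(map_distance_cm(12, 100))      # 12 recombinants among 100 offspring
print(rough_cm_from_bp(5_000_000))   # two loci 5 Mb apart
```

Passing a different `cm_per_mb` value is one way to model a recombination hotspot or coldspot.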

Relation to Genes and Alleles

In genetics, a locus refers to a specific, fixed position on a chromosome where a gene or genetic element is located. While a single gene typically occupies a defined locus, a locus can encompass multiple genes, as seen in gene clusters such as the RCCX locus on chromosome 6, which includes several protein-coding genes like STK19 and CYP21A2. Additionally, loci may contain non-genic elements, including regulatory sequences like promoters and enhancers that influence gene expression, or pseudogenes, which are duplicated, non-functional copies of genes that have accumulated disabling mutations but retain sequence similarity to their parental genes.

Alleles are the alternative forms of DNA sequence that can occupy the same locus, arising from variations such as single nucleotide polymorphisms or insertions/deletions. At any given locus, one allele is often designated as the wild-type, representing the most common or ancestral form in a population that typically confers normal function, while mutant alleles introduce changes that may alter or disrupt that function, potentially leading to phenotypic differences. These variants are inherited according to Mendelian principles and count as alleles of one another only when they occupy the same chromosomal position.

In diploid organisms such as humans, each autosomal locus exists in two copies, one on each homologous chromosome, allowing an individual to carry two alleles, which form the genotype at that locus. This diploid configuration enables interactions between alleles, including dominance, where the effect of one allele masks that of the other (complete dominance), or recessiveness, where the trait appears only in homozygotes for that allele; partial or incomplete dominance can also occur, resulting in intermediate phenotypes. Such allelic relationships underpin inheritance patterns.

A haplotype represents a set of alleles at multiple linked loci that are inherited together on the same chromosome due to low recombination rates between them, forming a characteristic combination transmitted as a unit. Haplotypes are particularly useful in population genetics for tracing ancestry and evolutionary history, as they reflect non-random associations of alleles across loci.

Types of Loci

Gene Loci

Gene loci represent specific chromosomal positions occupied by genes, which are segments of DNA transcribed into RNA: either messenger RNA (mRNA) that encodes proteins, or functional non-coding RNAs such as ribosomal RNAs or microRNAs. These loci encompass the entire genomic region necessary for the gene's expression, distinguishing them from broader genomic features by their direct role in producing biologically active transcripts. In eukaryotic organisms, gene loci are fundamental units of heredity, enabling the synthesis of proteins essential for cellular functions and organismal development.

The structure of a typical eukaryotic gene locus includes a promoter region that initiates transcription, coding exons interspersed with non-coding introns that are spliced out during mRNA processing, and distal enhancers that modulate expression levels. This organization allows for precise regulation and diversity in gene products through alternative splicing. In humans, the mean gene locus spans about 67 kb, with the median around 27 kb, accommodating variable numbers of exons and extensive intronic sequences that can exceed the coding regions in length.

Prominent examples illustrate the biological significance of gene loci. The BRCA1 locus, situated on 17q21, encodes a DNA repair protein critical for maintaining genomic stability; mutations here confer susceptibility to breast and ovarian cancers. Similarly, the CFTR locus on 7q31.2 produces a chloride channel vital for epithelial function, and its defects underlie cystic fibrosis, a common autosomal recessive disorder.

From an evolutionary perspective, gene loci are dynamic, often undergoing duplication (either in tandem or via whole-genome events) or transposition to new chromosomal sites, which fosters the emergence of gene families with specialized functions. Such mechanisms have expanded multigene families such as those involved in immunity, driving adaptive diversification across species. For instance, segmental duplications can lead to paralogous genes that diverge in function over time.

Non-Gene Loci

Non-gene loci are specific chromosomal positions that do not encode proteins but play crucial roles in genome regulation and genetic analysis. These loci encompass sequences that facilitate identification of individuals, modulate gene expression, or accumulate neutral mutations for evolutionary studies. Unlike gene loci, which directly contribute to protein synthesis, non-gene loci primarily influence genomic function through structural or regulatory mechanisms.

Among the primary types of non-gene loci are single nucleotide polymorphisms (SNPs), which involve single base substitutions in non-coding regions and represent the most common form of genetic variation. Approximately 90% of SNPs identified in genome-wide association studies occur in non-coding areas, where they can affect regulatory elements without altering protein sequences. Microsatellites, or short tandem repeats (STRs), consist of tandemly repeated DNA motifs of 2-6 base pairs in non-genic intervals, exhibiting high mutation rates due to replication slippage. These repeats are abundant in the genome and contribute significantly to polymorphism without coding potential. Copy number variations (CNVs) at non-genic sites involve duplications or deletions of DNA segments ranging from 50 base pairs to several megabases, affecting about 4.8-9.5% of the genome and influencing non-coding RNAs.

Regulatory non-gene loci include enhancers, silencers, and insulators, which control gene expression over long distances without containing coding sequences. Enhancers are cis-acting elements that boost transcription by looping to interact with promoters, often located in non-coding intergenic regions. Silencers repress gene activity similarly, while insulators block enhancer-promoter contacts or prevent the spread of heterochromatin, thereby delineating functional genomic domains. For instance, the gypsy insulator in Drosophila disrupts enhancer communication across more than 20 enhancers, maintaining domain boundaries. These elements are essential for spatial genome organization and epigenetic regulation.

Prominent examples of non-gene loci include the DYS391 locus, a Y-chromosome STR used in forensic genetics for lineage tracing due to its uniparental inheritance and high variability. This tetranucleotide repeat aids in resolving male-female DNA mixtures in crime scene analysis. Variable number tandem repeat (VNTR) loci, such as those employed in early DNA fingerprinting, are hypervariable non-coding sequences that enable individual identification through repeat-length patterns. Their extreme polymorphism stems from differing repeat copy numbers, making them invaluable for forensic applications without genic disruption.

Many non-gene loci undergo neutral evolution, accumulating mutations that confer no fitness advantage or disadvantage, which serves as a molecular clock for phylogenetic reconstruction. Neutral mutations in non-coding regions, such as SNPs or STR expansions, fix via genetic drift at a relatively constant rate, allowing estimation of divergence times in evolutionary studies. This neutrality contrasts with selected changes in coding regions and enhances resolution in phylogenies.

Nomenclature and Mapping

Naming Conventions

The HUGO Gene Nomenclature Committee (HGNC) establishes standardized guidelines for naming human genetic loci to promote consistency in scientific communication and data sharing. These guidelines apply to protein-coding genes, non-coding RNA genes, and pseudogenes, requiring unique symbols consisting of uppercase Latin letters and Arabic numerals, without punctuation or Greek letters. Symbols are ideally 3 to 6 characters long, although longer symbols are used in some cases, such as for predicted open reading frames. Gene symbols are italicized in text (e.g., TP53 for the tumor protein p53 locus), while the corresponding protein products use non-italicized symbols (e.g., TP53). Each approved symbol is assigned a unique HGNC identifier in the format HGNC: followed by a numerical ID (e.g., HGNC:11998 for TP53), which serves as a stable reference across resources.

For alleles and variants at a locus, nomenclature incorporates superscripts or descriptive notations to indicate specific mutations. In human genetics, common variants are denoted with the gene symbol followed by a superscript describing the change, such as CFTRΔF508 for the delta F508 deletion in the cystic fibrosis transmembrane conductance regulator locus, a class II mutation affecting protein maturation. This notation builds on HGNC symbols but adheres to broader standards from the Human Genome Variation Society (HGVS), which specifies formats for sequence variants relative to a reference sequence. For coding DNA changes, HGVS uses the prefix "c." followed by the position and the change, as in c.1521_1523del for the CFTR ΔF508 deletion, ensuring precise description of alterations at the locus.

Naming conventions for non-human organisms differ by species but often emphasize concise, phenotype-based symbols in lowercase or mixed case, which distinguishes them from human nomenclature. In model organisms like mice, gene symbols are italicized with an initial capital letter followed by lowercase (e.g., Tyr for tyrosinase), while protein names are all uppercase and non-italicized (e.g., TYR); this contrasts with human all-uppercase italic symbols. For Drosophila melanogaster, loci historically derive from mutant phenotypes and use lowercase italic symbols, such as w for the white-eye locus, originally named descriptively as "white" but standardized to w following early genetic studies. These organism-specific rules facilitate comparative genomics while avoiding overlap with human symbols.

Integration with genomic databases ensures locus names are linked and accessible for research. HGNC symbols serve as primary identifiers in resources like Ensembl and NCBI Gene, where they map to detailed annotations, including synonyms and cross-references to other nomenclatures. For instance, the TP53 locus in NCBI Gene (ID: 7157) and Ensembl (ENSG00000141510) includes HGNC-approved names alongside historical aliases, supporting standardized querying and variant reporting. This database harmonization reflects evolving standards, such as shifts from classical descriptive names to symbolic formats, as seen in the w locus transition in Drosophila nomenclature.

Genetic Mapping Techniques

Genetic mapping techniques encompass a range of methods used to determine the position of genetic loci on chromosomes by estimating their relative locations and distances. These approaches have evolved from early statistical analyses of inheritance patterns to high-resolution sequencing-based strategies, enabling precise identification of loci associated with traits or diseases. Classical techniques rely on observing recombination events in families, while physical methods directly visualize or fragment DNA, and modern tools leverage genomic technologies for genome-scale analysis.

Classical genetic mapping began with pedigree analysis, where inheritance patterns in multi-generational families are examined to infer locus positions through the frequency of recombination between markers and target loci. Recombination frequency, expressed as the proportion of recombinant offspring, serves as a measure of genetic distance, with 1% recombination approximating 1 centimorgan (cM). This method, foundational in early human genetics, allowed disease loci to be mapped by tracking co-segregation in affected pedigrees. To quantify linkage evidence, the logarithm of odds (LOD) score is calculated as the base-10 logarithm of the likelihood ratio of the data under linkage (at recombination fraction $\theta$) versus no linkage ($\theta = 0.5$):

$$\text{LOD} = \log_{10} \left( \frac{L(\theta)}{L(0.5)} \right)$$

A LOD score greater than 3 is conventionally interpreted as significant evidence for linkage. This parametric approach assumes a known inheritance model and was pivotal in constructing initial human genetic maps in the 1980s, achieving resolutions on the order of 10 cM (roughly 10 megabases, Mb).

Physical mapping techniques provide direct chromosomal localization independent of recombination. Fluorescence in situ hybridization (FISH) uses fluorescently labeled DNA probes that hybridize to specific sequences on chromosomes or nuclei, allowing visualization under fluorescence microscopy for cytogenetic positioning. Developed in the mid-1980s, FISH enabled mapping of loci to specific chromosomal bands with resolutions of about 1 Mb, facilitating the integration of genetic and physical maps during the Human Genome Project. Radiation hybrid (RH) mapping complements this by exposing cells to radiation to break chromosomes into fragments, then fusing them with recipient cells to create hybrid panels; statistical analysis of marker retention in these panels estimates locus order and distance in centirays (cR), where 1 cR approximates 10-50 kb depending on the panel. RH mapping achieved finer resolutions of 100-500 kb and was instrumental in ordering markers across mammalian genomes in the 1990s.

Modern techniques have dramatically improved resolution through high-throughput genomics. Genome-wide association studies (GWAS) scan hundreds of thousands of single nucleotide polymorphisms (SNPs) using microarray-based genotyping to identify loci associated with traits via statistical tests for allele frequency differences between cases and controls. The first large-scale GWAS, published in 2007, identified multiple loci for seven common diseases, establishing SNPs as effective markers for fine-mapping with resolutions down to 10-100 kb. Whole-genome sequencing (WGS) further refines this by directly reading DNA sequences, pinpointing causal variants at base-pair (bp) resolution without relying on predefined markers. WGS has revolutionized locus identification in rare disease studies, as demonstrated in projects like the Deciphering Developmental Disorders initiative, where it resolved diagnoses for over 1,000 cases by detecting de novo mutations at specific loci.

Mapping resolution has progressed from the megabase scale in the 1980s (via linkage and early FISH) to kilobase and now base-pair precision, driven by sequencing cost reductions and computational integration of multi-omic data. Tools like CRISPR-Cas9 enable functional validation of mapped loci by targeted editing, confirming causality through phenotypic rescue or enhancement in model organisms.
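
The LOD score defined in this section can be illustrated for the simplest case. The sketch below assumes phase-known meioses, where each offspring can be scored directly as recombinant or non-recombinant, so the likelihood is $L(\theta) = \theta^{R}(1-\theta)^{N-R}$; real pedigree analysis uses more general likelihoods. The function names are hypothetical.

```python
import math
from typing import Tuple


def lod_score(recombinants: int, non_recombinants: int, theta: float) -> float:
    """LOD = log10(L(theta) / L(0.5)) for phase-known meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    total = recombinants + non_recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = total * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants: int, non_recombinants: int) -> Tuple[float, float]:
    """Scan theta over 0.01..0.50 and return (best theta, best LOD)."""
    thetas = [i / 100 for i in range(1, 51)]
    best = max(thetas, key=lambda t: lod_score(recombinants, non_recombinants, t))
    return best, lod_score(recombinants, non_recombinants, best)


# 2 recombinants among 20 informative meioses.
theta_hat, lod = max_lod(2, 18)
print(theta_hat, round(lod, 2), lod > 3)
```

The grid scan recovers the intuitive estimate (the observed recombinant fraction) and shows how the conventional threshold of 3 is applied to the maximized score.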

Applications in Research

Disease and Trait Association

In genetics, loci are central to understanding disease and trait associations, particularly through pathogenic variants that disrupt normal gene function. Mendelian diseases often arise from single-locus alterations, such as the HTT locus on chromosome 4p16.3, where expansions of CAG trinucleotide repeats to 36 or more copies lead to Huntington's disease by producing a toxic polyglutamine tract in the huntingtin protein. This autosomal dominant disorder exemplifies how locus-specific repeat instability causes progressive neurodegeneration, with onset typically in mid-adulthood.

For complex traits and disorders, genome-wide association studies (GWAS) have identified numerous polygenic loci contributing to phenotypic variation. Human height, a classic polygenic trait, involves hundreds of loci with small effect sizes; early GWAS pinpointed variants near genes like HMGA2, explaining about 4-6% of height variance, while larger meta-analyses have mapped over 700 loci accounting for up to 40% of heritability. Similarly, schizophrenia risk is polygenic, with the latest large-scale GWAS (as of 2022) identifying 287 independent loci associated with the disorder, implicating pathways in synaptic plasticity and neuronal development. These findings highlight how common variants across multiple loci cumulatively influence disease susceptibility in non-Mendelian contexts.

Pathogenic variants at disease-associated loci can exert effects through loss-of-function (LoF) or gain-of-function (GoF) mechanisms, altering protein activity and downstream pathways. LoF variants, such as nonsense mutations introducing premature stop codons, reduce or eliminate protein output, as seen in recessive disorders like cystic fibrosis, where CFTR locus mutations impair chloride transport. In contrast, GoF variants enhance or confer novel protein functions, exemplified by activating mutations in the GNAS locus causing McCune-Albright syndrome through constitutive G-protein signaling. Dominant-negative effects, where mutant proteins interfere with wild-type counterparts, further complicate locus pathogenicity, underscoring the need to characterize variant impacts at the molecular level.

Clinical applications leverage locus-specific insights for precision medicine, including pharmacogenomics and targeted therapies. The CYP2D6 locus on 22q13.2 exemplifies pharmacogenomic relevance, as copy-number variations and alleles like *4 lead to poor metabolizer phenotypes, affecting the efficacy and toxicity of drugs such as codeine, which requires activation to morphine. In gene editing, CRISPR-Cas9 technologies enable precise correction of pathogenic variants at specific loci; for instance, base editing has been used to shorten CAG repeats in the HTT locus in cellular models of Huntington's disease, reducing toxicity without off-target effects. These approaches hold promise for treating monogenic disorders by restoring locus function, though challenges like delivery efficiency persist.
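
Repeat expansions of the kind described for the HTT locus are, at the sequence level, a matter of counting consecutive copies of a motif. The following sketch is illustrative only: `longest_repeat_run` and `is_expanded` are hypothetical names, the input is a made-up toy sequence rather than real HTT data, and the cutoff is passed in explicitly because clinical thresholds should come from current guidelines, not from this example.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CAG") -> int:
    """Number of copies in the longest uninterrupted run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    longest = 0
    for match in pattern.finditer(sequence.upper()):
        copies = len(match.group(0)) // len(unit)
        longest = max(longest, copies)
    return longest


def is_expanded(repeat_count: int, threshold: int) -> bool:
    """True if the repeat count reaches the caller-supplied cutoff."""
    return repeat_count >= threshold


# Toy sequence: a run of 5 CAGs, an interrupting CAA, then 3 more CAGs.
toy = "TT" + "CAG" * 5 + "CAA" + "CAG" * 3 + "GG"
count = longest_repeat_run(toy)
print(count, is_expanded(count, threshold=36))
```

Only uninterrupted runs are counted, so an interrupting codon such as CAA splits the tract into two shorter runs.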

Population Genetics

In population genetics, a locus serves as the basic unit for analyzing allele frequencies, which describe the proportion of each variant at that position across individuals in a population. For a biallelic locus with alleles having frequencies $p$ and $q$ (where $p + q = 1$), the Hardy-Weinberg equilibrium (HWE) predicts genotype frequencies under assumptions of random mating, infinite population size, no migration, no mutation, and no selection: homozygous for the first allele at $p^2$, heterozygous at $2pq$, and homozygous for the second allele at $q^2$, such that $p^2 + 2pq + q^2 = 1$. This equilibrium, independently derived by Hardy and Weinberg, provides a null model for detecting evolutionary forces; significant deviations at a locus, such as excess homozygotes, signal non-random mating or population substructure, while shifts in allele frequencies over generations indicate selection, genetic drift, or migration.

Genetic diversity at a locus is quantified through metrics like heterozygosity, the probability that two randomly sampled alleles are different (expected as $2pq$ under HWE), which reflects the locus's variability and potential for evolutionary response, and polymorphism, defined as the presence of more than one allele or variant at the site. To assess differentiation between populations, the fixation index $F_{ST}$ measures the proportion of total genetic variance at a locus attributable to between-population differences, ranging from 0 (no differentiation) to 1 (complete isolation); values around 0.15 indicate moderate structure in humans. Developed by Sewall Wright, $F_{ST}$ integrates data across loci to infer population structure, with higher values at specific loci suggesting local adaptation or barriers to migration.

Loci enable practical applications in forensic science and ancestry inference, such as panels of short tandem repeat (STR) loci in the CODIS system, where population-specific allele frequencies allow estimation of biogeographical ancestry from crime scene DNA, though with limitations in admixed individuals. In admixture mapping, ancestry-informative markers (AIMs), which are loci with substantial allele frequency disparities between ancestral groups (e.g., $\Delta > 0.3$), are used to scan genomes for trait-associated segments in recently admixed populations like African Americans, leveraging linkage disequilibrium between ancestry and local variants.

Evolutionarily, loci under positive selection, such as the LCT locus harboring the -13910C>T variant for lactase persistence, exhibit elevated $F_{ST}$ and reduced diversity due to recent sweeps in pastoralist populations. Neutral loci, by contrast, accumulate mutations in a clock-like manner and can be used to trace migration routes, as seen in out-of-Africa expansions via unlinked SNPs.
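
The Hardy-Weinberg expectations and $F_{ST}$ discussed in this section reduce to a few lines of arithmetic for a biallelic locus. This is a minimal sketch with hypothetical function names; the $F_{ST}$ helper uses the heterozygosity-based form $(H_T - H_S)/H_T$ for two subpopulations assumed to be of equal size, which is one of several estimators in use.

```python
from typing import Tuple


def hwe_expected(p: float) -> Tuple[float, float, float]:
    """Expected genotype frequencies (p^2, 2pq, q^2) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def hwe_chi_square(n_aa_major: int, n_het: int, n_aa_minor: int) -> float:
    """Chi-square statistic comparing observed genotype counts with HWE."""
    n = n_aa_major + n_het + n_aa_minor
    p = (2 * n_aa_major + n_het) / (2 * n)  # frequency of the first allele
    observed = (n_aa_major, n_het, n_aa_minor)
    expected = [freq * n for freq in hwe_expected(p)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def fst_two_populations(p1: float, p2: float) -> float:
    """F_ST = (H_T - H_S) / H_T for two equally sized subpopulations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total


print(hwe_expected(0.5))
print(hwe_chi_square(25, 50, 25))     # counts exactly at equilibrium
print(hwe_chi_square(40, 20, 40))     # excess homozygotes
print(fst_two_populations(0.2, 0.8))
```

With one degree of freedom, a chi-square value above about 3.84 corresponds to a deviation from HWE at the 5% level, which is how an excess of homozygotes would be flagged in practice.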
