Population structure (genetics)

Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations. In a randomly mating (or panmictic) population, allele frequencies are expected to be roughly similar between groups. However, mating tends to be non-random to some degree, causing structure to arise. For example, a barrier like a river can separate two groups of the same species and make it difficult for potential mates to cross; if a mutation occurs, over many generations it can spread and become common in one subpopulation while being completely absent in the other.

Genetic variants do not necessarily cause observable changes in organisms, but population structure can make a variant appear correlated with a trait by coincidence: a variant that is common in a population with a high rate of disease may erroneously be thought to cause the disease. For this reason, population structure is a common confounding variable in medical genetics studies, and accounting for and controlling its effect is important in genome-wide association studies (GWAS). By tracing the origins of structure, it is also possible to study the genetic ancestry of groups and individuals.
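
As a rough illustration of this confounding, the sketch below uses made-up numbers for two hypothetical subpopulations that differ in both carrier frequency and disease rate. By construction, the variant has no effect on disease risk within either group; the group sizes, frequencies, and function names are all assumptions chosen for the example.

```python
from fractions import Fraction

# Hypothetical subpopulations: (size, carrier frequency, disease rate).
# Within each group, carriers and non-carriers share the same disease rate,
# so the variant is not causal by construction.
GROUPS = {
    "A": (1000, Fraction(8, 10), Fraction(2, 10)),
    "B": (1000, Fraction(2, 10), Fraction(5, 100)),
}

def expected_table(size, carrier_freq, disease_rate):
    """Expected counts: (carrier cases, carrier controls,
    non-carrier cases, non-carrier controls)."""
    carriers = size * carrier_freq
    non_carriers = size - carriers
    return (
        carriers * disease_rate,
        carriers * (1 - disease_rate),
        non_carriers * disease_rate,
        non_carriers * (1 - disease_rate),
    )

def odds_ratio(table):
    """Odds ratio of disease for carriers versus non-carriers."""
    a, b, c, d = table
    return (a * d) / (b * c)

tables = {name: expected_table(*params) for name, params in GROUPS.items()}
# Pool the two groups cell by cell, as a study ignoring structure would.
pooled = tuple(sum(cells) for cells in zip(*tables.values()))

within_group_or = {name: odds_ratio(t) for name, t in tables.items()}
pooled_or = odds_ratio(pooled)

print("within-group odds ratios:", {k: float(v) for k, v in within_group_or.items()})
print("pooled odds ratio:", round(float(pooled_or), 2))
```

With these numbers the odds ratio is exactly 1 inside each group, yet the pooled table gives an odds ratio above 2, which is the kind of spurious signal that stratification can produce when it is not accounted for.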

The basic cause of population structure in sexually reproducing species is non-random mating between groups: if all individuals within a population mate randomly, then the allele frequencies should be similar between groups. Population structure commonly arises from physical separation by distance or barriers, like mountains and rivers, followed by genetic drift. Other causes include gene flow from migrations, population bottlenecks and expansions, founder effects, evolutionary pressure, random chance, and (in humans) cultural factors. Even in the absence of these factors, individuals tend to stay close to where they were born, which means that alleles will not be distributed at random with respect to the full range of the species.
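
The effect of separation followed by drift can be sketched with a simple Wright-Fisher simulation. This is a minimal illustration, not a model taken from the text: it assumes two fully isolated diploid populations of constant size, no selection, mutation, or migration, and hypothetical names (`wright_fisher`, `east`, `west`) and parameter values.

```python
import random

def wright_fisher(p0, pop_size, generations, rng):
    """Allele-frequency trajectory in one isolated diploid population
    under pure genetic drift (no selection, mutation, or migration)."""
    n_copies = 2 * pop_size  # diploid: two allele copies per individual
    freqs = [p0]
    p = p0
    for _generation in range(generations):
        # Each allele copy in the next generation is drawn independently
        # from the current pool, so the count is binomial(n_copies, p).
        count = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs

rng = random.Random(1)  # arbitrary seed, used only for reproducibility
# Both populations start from the same frequency, then drift independently.
east = wright_fisher(0.5, pop_size=50, generations=200, rng=rng)
west = wright_fisher(0.5, pop_size=50, generations=200, rng=rng)

print(f"final frequencies: east={east[-1]:.2f}, west={west[-1]:.2f}")
```

Because the two trajectories are independent random walks, the frequencies generally diverge over time, and smaller values of `pop_size` make fixation or loss of the allele happen sooner.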

Population structure is a complex phenomenon and no single measure captures it entirely. Understanding a population's structure requires a combination of methods and measures. Many statistical methods rely on simple population models in order to infer historical demographic changes, such as the presence of population bottlenecks, admixture events or population divergence times. Often these methods rely on the assumption of panmixia, or homogeneity in an ancestral population. Misspecification of such models, for instance by not taking into account the existence of structure in an ancestral population, can give rise to heavily biased parameter estimates. Simulation studies show that historical population structure can even leave genetic signals that are easily misinterpreted as historical changes in population size, or as admixture events, even when no such events occurred.

One of the results of population structure is a reduction in heterozygosity. When populations split, alleles have a higher chance of reaching fixation within subpopulations, especially if the subpopulations are small or have been isolated for long periods. This reduction in heterozygosity can be thought of as an extension of inbreeding, with individuals in subpopulations being more likely to share a recent common ancestor. The scale is important: an individual with both parents born in the United Kingdom is not inbred relative to that country's population, but is more inbred than the offspring of two humans selected at random from the entire world. This motivates the derivation of Wright's F-statistics (also called "fixation indices"), which measure inbreeding through observed versus expected heterozygosity. For example, $F_{IS}$ measures the inbreeding coefficient at a single locus for an individual $I$ relative to some subpopulation $S$:

$$F_{IS} = 1 - \frac{H_I}{H_S}$$

Here, $H_I$ is the fraction of individuals in subpopulation $S$ that are heterozygous. Assuming there are two alleles, $A_1$ and $A_2$, that occur at respective frequencies $p_S$ and $q_S$, it is expected that under random mating the subpopulation $S$ will have a heterozygosity rate of $H_S = 2 p_S q_S$. Then:

$$F_{IS} = 1 - \frac{H_I}{2 p_S q_S}$$

Similarly, for the total population $T$, we can define $H_T = 2 p_T q_T$, allowing us to compare it with the expected heterozygosity of subpopulation $S$ and compute the value $F_{ST}$ as:

$$F_{ST} = 1 - \frac{H_S}{H_T} = 1 - \frac{2 p_S q_S}{2 p_T q_T}$$
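
A small worked example may make these definitions concrete. The sketch below assumes a single biallelic locus and uses the standard forms F_IS = 1 - H_I / H_S and F_ST = 1 - H_S / H_T, with expected heterozygosity 2pq. The genotype counts and the total-population allele frequency are invented for illustration, and the function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus under random mating."""
    return 2 * p * (1 - p)

def f_is(observed_het, p_s):
    """F_IS = 1 - H_I / H_S, with H_S = 2 * p_S * q_S."""
    return 1 - observed_het / heterozygosity(p_s)

def f_st(p_s, p_t):
    """F_ST = 1 - H_S / H_T, with H_T = 2 * p_T * q_T."""
    return 1 - heterozygosity(p_s) / heterozygosity(p_t)

# Hypothetical genotype counts in one subpopulation of 100 individuals.
n_AA, n_Aa, n_aa = 50, 20, 30
n = n_AA + n_Aa + n_aa

p_s = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A in the subpopulation
h_i = n_Aa / n                      # observed fraction of heterozygotes
p_t = 0.5                           # assumed allele frequency in the total population

print(f"p_S  = {p_s:.2f}")
print(f"F_IS = {f_is(h_i, p_s):.3f}")
print(f"F_ST = {f_st(p_s, p_t):.3f}")
```

Here the subpopulation has fewer heterozygotes than random mating would predict (0.20 observed against 0.48 expected), so F_IS is well above zero, while the modest difference between the subpopulation and total allele frequencies gives a small F_ST.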

If $F_{ST}$ is 0, then the allele frequencies between populations are identical, suggesting no structure. The theoretical maximum value of 1 is attained when an allele reaches total fixation, but most observed maximum values are far lower. $F_{ST}$ is one of the most common measures of population structure and there are several different formulations depending on the number of populations and the alleles of interest. Although it is sometimes used as a genetic distance between populations, it does not always satisfy the triangle inequality and thus is not a metric. It also depends on within-population diversity, which makes interpretation and comparison difficult.
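
The failure of the triangle inequality can be checked numerically. The sketch below assumes one particular formulation, a pairwise F_ST for two equally sized populations at a biallelic locus, where H_S is the average expected heterozygosity of the two populations and H_T is computed from their mean allele frequency. Other formulations exist, and the allele frequencies and the name `pairwise_fst` are chosen purely for illustration.

```python
from fractions import Fraction

def pairwise_fst(p1, p2):
    """Pairwise F_ST = 1 - H_S / H_T for two equally sized populations
    at a biallelic locus (one of several possible formulations)."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    p_bar = (p1 + p2) / 2                              # pooled allele frequency
    h_t = 2 * p_bar * (1 - p_bar)
    return 1 - h_s / h_t

# Hypothetical allele frequencies in three populations.
p_a, p_b, p_c = Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)

ab = pairwise_fst(p_a, p_b)
bc = pairwise_fst(p_b, p_c)
ac = pairwise_fst(p_a, p_c)

print(f"F_ST(A,B) = {float(ab):.3f}")
print(f"F_ST(B,C) = {float(bc):.3f}")
print(f"F_ST(A,C) = {float(ac):.3f}")
print("triangle inequality holds:", ac <= ab + bc)
```

For these frequencies the direct value between A and C (0.64) exceeds the sum of the two intermediate values (about 0.38), so going "through" population B is shorter than going directly, which a true distance metric would not allow.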
