F-statistics
In population genetics, F-statistics describe the level of heterozygosity in a population and the causes of a (usually) reduced heterozygosity relative to the Hardy–Weinberg expectation. Such reductions can be caused by the Wahlund effect, inbreeding, natural selection, or any combination of these.
The concept of F-statistics was developed during the 1920s by the American geneticist Sewall Wright, who was interested in inbreeding in cattle. However, because complete dominance causes the phenotypes of homozygote dominants and heterozygotes to be the same, it was not until the advent of molecular genetics from the 1960s onwards that heterozygosity in populations could be measured.
Definition and equations
The measure F is defined as one minus the ratio of the observed heterozygosity in a population to the heterozygosity that would be expected under Hardy–Weinberg equilibrium:
- <math> F = 1 - \frac{\operatorname{O}(f(\mathbf{Aa}))} {\operatorname{E}(f(\mathbf{Aa}))}, </math>
where the expected value from Hardy–Weinberg equilibrium is given by
- <math> \operatorname{E}(f(\mathbf{Aa})) = 2\, p\, q, </math>
and where p and q are the allele frequencies of A and a respectively. F is also the probability that, at any locus, the two alleles of a random individual from the population are identical by descent.
For example, consider the data from E.B. Ford (1971) on the scarlet tiger moth:
Genotype | White-spotted (AA) | Intermediate (Aa) | Little spotting (aa) | Total |
---|---|---|---|---|
Number | 1469 | 138 | 5 | 1612 |
From this, the allele frequencies can be calculated, and the expectation of f(Aa) derived:
- <math> p = \frac{2 \times \operatorname{obs}(AA) + \operatorname{obs}(Aa)}{2 \times (\operatorname{obs}(AA) + \operatorname{obs}(Aa) + \operatorname{obs}(aa))} = \frac{1469 \times 2 + 138}{2 \times (1469 + 138 + 5)} = \frac{3076}{3224} = 0.954 </math>
- <math> q = 1 - p = 1 - 0.954 = 0.046 </math>
Using the unrounded allele frequencies, the expected number of heterozygotes is
- <math> 2\, p\, q \times 1612 = 141.2, </math>
so that
- <math> F = 1 - \frac{138}{141.2} = 0.023 </math>
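The arithmetic of the scarlet tiger moth example can be checked with a short script. This is only a sketch: the function name `heterozygosity_summary` and the dictionary keys are illustrative choices rather than anything from Ford's data or a standard library. It reports the allele frequencies, the expected heterozygote count under Hardy–Weinberg equilibrium, the ratio of observed to expected heterozygotes (138 / 141.2), and the complement of that ratio, i.e. the heterozygote deficit.

```python
def heterozygosity_summary(n_AA, n_Aa, n_aa):
    """Summarise one biallelic locus from genotype counts.

    Hypothetical helper for illustration; names are not from the source.
    """
    total = n_AA + n_Aa + n_aa
    # Each AA individual carries two A alleles, each Aa individual one.
    p = (2 * n_AA + n_Aa) / (2 * total)
    q = 1.0 - p
    # Expected number of heterozygotes under Hardy-Weinberg: 2pq * N.
    expected_het = 2 * p * q * total
    obs_over_exp = n_Aa / expected_het
    return {
        "p": p,
        "q": q,
        "expected_het": expected_het,
        "obs_over_exp": obs_over_exp,
        "deficit": 1.0 - obs_over_exp,  # 1 - O/E
    }


# Counts from the table above: AA = 1469, Aa = 138, aa = 5.
summary = heterozygosity_summary(1469, 138, 5)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because the script keeps full floating-point precision for p and q, the expected heterozygote count comes out near 141.2; multiplying the rounded values 0.954 and 0.046 by hand gives a slightly larger figure.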
Partition
Consider a population with two levels of structure: one from the individual (I) to the subpopulation (S), and one from the subpopulation to the total (T). Then the total F, known here as <math>F_{IT}</math>, can be partitioned into <math>F_{IS}</math> (or f) and <math>F_{ST}</math> (or θ):
- <math> 1 - F_{IT} = (1 - F_{IS})\,(1 - F_{ST}). </math>
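<math>F_{ST}</math> can be calculated from the variance of an allele's frequency among subpopulations:
- <math> F_{ST} = \frac{\operatorname{var}(p)}{\bar{p}\,(1 - \bar{p})}, </math>
where var(p) is the variance of the allele frequency p across subpopulations and <math>\bar{p}</math> is its mean frequency in the total population.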
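As a rough illustration of the variance formula above, the sketch below computes FST from a list of subpopulation allele frequencies. Several things here are assumptions rather than part of the text: the function name `fst_from_frequencies` is hypothetical, the frequency in the denominator is taken to be the mean across subpopulations, all subpopulations are weighted equally, the variance is the population variance (dividing by the number of subpopulations), and the example frequencies are made up.

```python
def fst_from_frequencies(freqs):
    """F_ST = var(p) / (p_bar * (1 - p_bar)) for one allele.

    Assumptions (not from the text): equal weighting of subpopulations
    and a population variance (divide by k, not k - 1).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation frequency")
    k = len(freqs)
    p_bar = sum(freqs) / k
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("F_ST is undefined when the allele is fixed or absent overall")
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k
    return var_p / (p_bar * (1.0 - p_bar))


# Hypothetical frequencies of allele A in two subpopulations.
example_freqs = [0.2, 0.8]
print(fst_from_frequencies(example_freqs))
```

Under these assumptions, identical frequencies in every subpopulation give a value of 0, and subpopulations fixed for different alleles give a value of 1.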
This may be further partitioned for population substructure, and it expands according to the rules of binomial expansion, so that for I partitions:
- <math> 1 - F = \prod_{i=0}^{i=I} (1 - F_{i,i+1}) </math>
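The product rule above can be sketched in a few lines: given the F value between each pair of adjacent levels, the overall F is one minus the product of the complements. The helper name `combine_f` and the input values are hypothetical and chosen only for illustration.

```python
from math import prod


def combine_f(level_values):
    """Overall F from F values between adjacent hierarchical levels.

    Implements 1 - F = product of (1 - F_{i,i+1}); with no levels the
    empty product is 1, so the overall F is 0.
    """
    return 1.0 - prod(1.0 - f for f in level_values)


# Two-level case with made-up values: F_IS = 0.1 and F_ST = 0.2.
f_it = combine_f([0.1, 0.2])
print(round(f_it, 3))
```

With two levels this reduces to the partition of FIT into FIS and FST given earlier in this section.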
Effective population size
F is used to define effective population size.