Relationship between genetic diversity within and between species


Here is a quote from Wagner (2008)

A second line of evidence [against neutralism] comes from the relationship between the mean number of polymorphic differences between alleles within a species, $\pi$, and the number of fixed differences between genes in two species, $d$. For neutral mutations, a positive association between $\pi$ and $d$ should exist, because the neutral theory predicts that both quantities are linearly proportional to the rate at which neutral mutations arise. Recent genome-scale data shows instead that this association is in fact negative.

What do "the mean number of polymorphic differences between alleles within a species" and "the number of fixed differences between genes in two species" have to do with each other? Why should there be any relationship at all? And by "two species", are they talking about Eastern Yellowback Whooping Finches versus Western Yellowback Whooping Finches, or any arbitrary two species, like E. Coli versus Muskrats?

I found this related question, but it doesn't say anything about different species.

Metrics of interest

The two metrics you are interested in are

  • $\pi$ - the mean number of differences between two randomly sampled (with replacement) alleles in a population
  • $d$ - the mean number of differences between two randomly sampled (with replacement) alleles coming from two different species

Consider two sequences


There are 2 pairwise differences between these two sequences (positions 3 and 6).
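As a minimal sketch of what "pairwise differences" means, the following Python snippet counts mismatching positions between two aligned sequences. The two sequences are hypothetical (they are not the ones from the original example) and were chosen only so that they differ at positions 3 and 6; the function name `pairwise_differences` is likewise an illustrative assumption.

```python
def pairwise_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical sequences, chosen to differ at positions 3 and 6 (1-based).
seq_1 = "ATGCCTGA"
seq_2 = "ATACCAGA"

print(pairwise_differences(seq_1, seq_2))  # 2
```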

The whole point here is to understand that two individuals in the same population coalesce at a given time in the past just like two individuals coming from two different species. The number of pairwise differences is just equal to the rate at which mutations accumulate multiplied by the coalescence time. Let me develop this idea with a few equations below.

Neutral Expectations

Let's do the math! We will make two important assumptions below.

  1. Every mutation makes a new allele (this is the infinite-alleles model)
  2. All mutations are substitutions (no indels, no gene duplication, etc… )

Intro to Coalescence Theory

Let's first make sure you understand the concept of coalescence. Imagine looking at an evolving population backward in time. A coalescent event is an event in which two lineages become one (looking backward in time). For simplicity, consider an asexual population (the calculations below do not make this assumption). In an asexual population, two siblings coalesce in the previous generation, because in that generation the genes they carry were in a single individual (their parent).

We will see below how to calculate the coalescence time between two randomly drawn individuals from a population, and how to make inferences about genetic diversity from this coalescence time.

Rate of accumulation of neutral mutations

The rate of accumulation of neutral mutations is also called the fixation rate of neutral mutations. Let $fix$ be this fixation rate. Consider a panmictic diploid population of constant size $N$. At a given locus of interest, the mutation rate is $\mu$ per gene copy per generation. The number of mutations occurring every generation in this population is $2N\mu$ (the $2$ comes from the fact that in a diploid population, there are two homologous copies of every gene in each cell). If mutations are neutral, then every gene copy in the population has the same probability of ending up as the ancestor of the whole population far in the future. In other words, looking backward in time from a sufficiently distant future, all copies at the locus descend from a single copy in the present. As a consequence, the probability that a new mutation reaches fixation is $\frac{1}{2N}$. Multiplying the probability of fixation by the rate at which mutations occur in the whole population, we obtain $\frac{1}{2N} \cdot 2N\mu = \mu$. In other words, the rate of fixation of new neutral mutations is simply equal to the rate at which mutations occur.

Key equation:

$$fix = \frac{1}{2N} \cdot 2N\mu = \mu$$
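The fixation probability of $\frac{1}{2N}$ can be checked numerically. The sketch below is not part of the original argument: it assumes a simple Wright-Fisher model (each of the $2N$ gene copies in the next generation picks its parent copy at random), and the function name and parameter values are hypothetical. It starts each replicate with one new neutral mutation and records how often that mutation reaches fixation.

```python
import random


def neutral_fixation_fraction(n_diploid: int, replicates: int, seed: int = 0) -> float:
    """Fraction of replicates in which a single new neutral mutation fixes.

    Assumes a Wright-Fisher population of n_diploid diploid individuals,
    i.e. 2 * n_diploid gene copies resampled binomially every generation.
    """
    rng = random.Random(seed)
    copies_total = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1  # one new mutant copy
        while 0 < count < copies_total:
            freq = count / copies_total
            # Each copy of the next generation is mutant with probability freq.
            count = sum(rng.random() < freq for _ in range(copies_total))
        if count == copies_total:
            fixed += 1
    return fixed / replicates


if __name__ == "__main__":
    n = 10  # hypothetical population size, kept small so the run is fast
    estimate = neutral_fixation_fraction(n, replicates=5000, seed=1)
    print(f"simulated: {estimate:.4f}, expected 1/(2N): {1 / (2 * n):.4f}")
```

With these settings the simulated fraction should land close to $\frac{1}{2N} = 0.05$, up to sampling noise.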

$d$ - number of pairwise differences between randomly sampled (with replacement) alleles coming from different species

Consider two species whose common ancestor lived $T$ generations ago. Each lineage accumulated mutations at rate $\mu$ (as explained above), and therefore the expected (average) total number of pairwise differences is $\bar d = 2\mu T$. The probability of having exactly $d$ pairwise differences follows a Poisson distribution with mean $2\mu T$, that is $P(D=d) = \frac{(2\mu T)^d e^{-2\mu T}}{d!}$.

Key equation: $$\bar d = 2\mu T$$
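To make the Poisson statement concrete, here is a small sketch that evaluates the expected divergence and the probability of observing exactly $d$ differences. The parameter values ($\mu = 10^{-5}$ per locus per generation, $T = 100{,}000$ generations) are purely hypothetical and the function names are illustrative.

```python
import math


def expected_divergence(mu: float, t_generations: float) -> float:
    """Expected number of differences between two species: 2 * mu * T."""
    return 2 * mu * t_generations


def divergence_pmf(d: int, mu: float, t_generations: float) -> float:
    """Poisson probability of exactly d differences, with mean 2 * mu * T."""
    lam = expected_divergence(mu, t_generations)
    return math.exp(-lam) * lam ** d / math.factorial(d)


if __name__ == "__main__":
    mu, t = 1e-5, 100_000  # hypothetical values
    print("expected d:", expected_divergence(mu, t))
    for d in range(5):
        print(d, round(divergence_pmf(d, mu, t), 4))
```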

$\pi$ - number of pairwise differences between randomly sampled (with replacement) alleles coming from the same population

Here the calculation follows the same logic as above: $\pi = 2\mu T$. The issue is that $T$ (the coalescence time) is unknown for the moment, and we need to calculate it.

Let $P(T)$ be the probability that two randomly sampled (with replacement) individuals from a diploid panmictic population coalesce exactly $T$ generations ago. The probability of coalescing in a given generation is simply the probability of drawing the same parental gene copy, that is $\frac{1}{2N}$, and the probability of not coalescing in a given generation is $1-\frac{1}{2N}$. Iterating over the generations we obtain

$$P(T) = \frac{1}{2N} \left( 1-\frac{1}{2N} \right)^T \approx \frac{1}{2N}e^{-\frac{T}{2N}}$$

where $e \approx 2.718$ is Euler's number. The above approximation is accurate for large $N$ (say, larger than 100). If you are interested, this Math.SE post explains the approximation.
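The quality of the exponential approximation can be checked directly. The snippet below is only an illustration: it compares the geometric expression with its exponential approximation for a few hypothetical values of $N$ and $T$ and reports the relative error.

```python
import math


def p_geometric(t: int, n_diploid: int) -> float:
    """Exact expression: (1/2N) * (1 - 1/2N)**T."""
    p = 1.0 / (2 * n_diploid)
    return p * (1.0 - p) ** t


def p_exponential(t: int, n_diploid: int) -> float:
    """Approximation: (1/2N) * exp(-T / 2N)."""
    return math.exp(-t / (2 * n_diploid)) / (2 * n_diploid)


def relative_error(t: int, n_diploid: int) -> float:
    exact = p_geometric(t, n_diploid)
    return abs(p_exponential(t, n_diploid) - exact) / exact


if __name__ == "__main__":
    for n in (10, 100, 1000):  # hypothetical population sizes
        t = n  # compare at T = N generations
        print(n, relative_error(t, n))
```

The relative error shrinks as $N$ grows, which is the sense in which the approximation is "accurate for large $N$".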

The expected value (average) $\bar T$ of this distribution is $\bar T = 2N$ (and the variance is $\mathrm{var}(T) = 4N^2$). As a consequence, the expected number of pairwise differences between these two individuals is $\pi = 2\mu \cdot 2N = 4N\mu$.
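A quick simulation can illustrate that the mean coalescence time is close to $2N$ generations. This is a sketch under the same assumptions as the text (each generation back, the two lineages coalesce with probability $\frac{1}{2N}$); the function names and the values of $N$ and $\mu$ are hypothetical.

```python
import random


def simulate_mean_coalescence_time(n_diploid: int, replicates: int, seed: int = 0) -> float:
    """Average number of generations until two lineages coalesce."""
    rng = random.Random(seed)
    p = 1.0 / (2 * n_diploid)  # per-generation coalescence probability
    total = 0
    for _ in range(replicates):
        t = 1
        while rng.random() >= p:  # no coalescence this generation
            t += 1
        total += t
    return total / replicates


def expected_pi(n_diploid: int, mu: float) -> float:
    """Neutral expectation for pairwise diversity: 4 * N * mu."""
    return 4 * n_diploid * mu


if __name__ == "__main__":
    n, mu = 50, 1e-3  # hypothetical values
    mean_t = simulate_mean_coalescence_time(n, replicates=20000, seed=1)
    print("simulated mean T:", mean_t, "expected:", 2 * n)
    print("pi from simulation:", 2 * mu * mean_t, "expected:", expected_pi(n, mu))
```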

Briefly, these calculations can be extended to the coalescence of $k$ randomly sampled alleles. The expected time until the first coalescence among $k$ lineages is $\bar T = \frac{4N}{k(k-1)}$; summing over successive coalescent events, the total expected time (along all branches) is $4N \sum_{i=1}^{k-1}\frac{1}{i}$, and the expected number of segregating sites is $4N\mu \sum_{i=1}^{k-1}\frac{1}{i}$.

Key equation: $$\pi = 4N\mu$$
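The extension to $k$ sampled alleles can be sketched numerically as follows. The code simply evaluates the formulas given above; it also checks that summing "number of lineages times expected waiting time" over successive coalescent events gives the same total branch length as the harmonic-sum expression. Function names and parameter values are hypothetical.

```python
def expected_time_with_j_lineages(n_diploid: int, j: int) -> float:
    """Expected waiting time while j lineages remain: 4N / (j * (j - 1))."""
    return 4 * n_diploid / (j * (j - 1))


def expected_total_branch_length(n_diploid: int, k: int) -> float:
    """Expected total tree length: 4N * sum_{i=1}^{k-1} 1/i."""
    return 4 * n_diploid * sum(1.0 / i for i in range(1, k))


def expected_segregating_sites(n_diploid: int, mu: float, k: int) -> float:
    """Expected number of segregating sites: mu times the total branch length."""
    return mu * expected_total_branch_length(n_diploid, k)


if __name__ == "__main__":
    n, mu, k = 1000, 1e-4, 10  # hypothetical values
    by_events = sum(j * expected_time_with_j_lineages(n, j) for j in range(2, k + 1))
    print("total branch length:", expected_total_branch_length(n, k), by_events)
    print("expected segregating sites:", expected_segregating_sites(n, mu, k))
```

For $k = 2$ the total branch length reduces to $4N$ (two branches of expected length $2N$ each), which recovers $\pi = 4N\mu$.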


It is clear from the above that $\pi = 4N\mu$ and $\bar d = 2\mu T$ are positively correlated under neutrality, as both are linearly proportional to the mutation rate. As your quote says:

For neutral mutations, a positive association between $\pi$ and $d$ should exist, because the neutral theory predicts that both quantities are linearly proportional to the rate at which neutral mutations arise

The quote goes on to say:

Recent genome-scale data shows instead that this association is in fact negative.

Such a negative relationship between $d$ and $\pi$ cannot be caused by neutral processes. Therefore selection must be involved somehow.

Imagine, for example, that the two species are each selected toward different optima (opposite selection pressures). In such a case, purifying selection within each species reduces the probability that a new mutation reaches fixation to a value lower than $\frac{1}{2N}$, and therefore reduces the rate of fixation $fix$ so that $fix < \mu$. However, such opposite selection pressures among species will cause the species to diverge more than expected by random processes, that is $d > 2\mu T$. Such a selective pattern could explain the observed negative relationship.

More information

This post offers book recommendations on the subject (population genetics).

Difference Between Genetic Diversity and Species Diversity

The variety of life forms in a particular area is referred to as biodiversity. The diversity of our biosphere ranges from the macromolecules of a cell to different biomes. Genetic diversity, species diversity, and ecological diversity are three types of biodiversity. The main difference between genetic diversity and species diversity is that genetic diversity is the variation in DNA among individuals of a particular species, whereas species diversity is the variety of species in a particular region. Ecological diversity is the variety of ecosystems in a particular area. To conserve biodiversity, ecosystems and habitats should be protected.

Key Areas Covered

Key Terms: Biodiversity, Characteristics, Ecosystems, Genetic Diversity, Habitats, Population, Species, Species Diversity

Why does biodiversity matter, and what is the relationship between biodiversity and the number of populations?

Population is recognized as an indirect driver of biodiversity loss, as human demands for resources like food and fuel play a key role in driving biodiversity degradation. Slowing population growth will not only ease off pressure on biodiversity, but will also empower women and their families.

How does overexploitation affect biodiversity? Overexploitation means harvesting species from the wild at rates faster than natural populations can recover. Overfishing and overhunting are both types of overexploitation. Currently, about a third of the world's endangered vertebrates are threatened by overexploitation.

Furthermore, what is the difference between population and biodiversity?

Biodiversity: The range of variation found among microorganisms, plants, fungi, and animals. Also the richness of species of living organisms. Community: Populations of organisms of different species that interact with one another. Population: A group of individuals belonging to one species living in an area.

How is population growth a threat to biodiversity?

The core threats to biodiversity are human population growth and unsustainable resource use. To date, the most significant causes of extinctions are habitat loss, introduction of exotic species, and overharvesting. Habitat loss occurs through deforestation, damming of rivers, and other activities.

Species Diversity and Genetic Diversity: Parallel Processes and Correlated Patterns

Species diversity and genetic diversity may be correlated as a result of processes acting in parallel at the two levels. However, no theories predict the conditions under which different relationships between species diversity and genetic diversity might arise and therefore when one level of diversity may be predicted using the other. I used simulation models to investigate the parallel influence of locality area, immigration rate, and environmental heterogeneity on species diversity and genetic diversity. The most common pattern was moderate to strong positive species‐genetic diversity correlations (SGDCs). Such correlations may be driven by any one of the three locality characteristics examined, but important exceptions and patterns emerged. Genetic diversity and species diversity were more weakly correlated when genetic diversity was measured for rare versus common species. Environmental heterogeneity not only imposes spatially varying selection on populations and communities but also causes changes in species’ population sizes and therefore genetic diversity; these interacting processes can create positive, negative, or unimodal relationships of genetic diversity with species diversity. When species are considered as part of multispecies communities, predictions from single‐species models of genetic diversity apply in some instances (effects of area and immigration) but often not in others (effects of environmental heterogeneity).

We've seen that every cell of an organism carries the DNA including gene sequences and other kinds of DNA. The genome of an organism is the entirety of its genetic material (DNA, or for some viruses, RNA). The genome of a common experimental strain of E. coli was sequenced by 1997 (Blattner FR et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1452-1474). Sequencing of the human genome was completed by 2001, well ahead of the predicted schedule (Venter JC 2001. The sequence of the human genome. Science 291:1304-1351). As we have seen in the re-classification of life from five kingdoms into three domains, nucleic acid sequence comparisons can tell us a great deal about evolution. We now know that evolution depends not only on gene sequences, but also, on a much grander scale, on the structure of genomes. Genome sequencing has confirmed not only genetic variation between species, but also considerable variation between individuals of the same species. Genetic variation within species is in fact the raw material of evolution. It is clear from genomic studies that genomes have been shaped and modeled (or remodeled) in evolution. We'll consider genome remodeling in more detail elsewhere.

It had been known for some time that gene and protein sequencing could reveal evolutionary relationships and even familial relationships. Read about an early demonstration of such relationships based on amino acid sequence comparisons across evolutionary time in Zuckerkandl E and Pauling L. (1965) Molecules as documents of evolutionary theory. J. Theor. Biol. 8:357-366. It is now possible to extract DNA from fossil bones and teeth, allowing comparisons of extant and extinct species. DNA has been extracted from the fossil remains of humans, other hominids, and many animals. DNA sequencing reveals our relationship to each other, to our hominid ancestors and to animals from bugs to frogs to mice to chimps to Neanderthals to… Unfortunately, DNA from organisms much older than 10,000 years is typically so damaged or simply absent, that relationship building beyond that time is impossible. Now in a clever twist, using what we know from gene sequences of species alive today, investigators recently 'constructed' a genetic phylogeny suggesting the sequences of genes of some of our long-gone progenitors, including bacteria (see: Deciphering Genomic Fossils). The comparison of these 'reconstructed' ancestral DNA sequences suggests when photosynthetic organisms diversified and when our oxygenic planet became a reality.


Specimen collection

Samples of A. citrinellus and A. zaliosus were collected from two large ancient lakes and three young crater lakes in Nicaragua in 1987, 2001, and 2003. It has been previously demonstrated that these two species form a monophyletic unit, excluding A. labiatus [4]. As A. labiatus is only found in the great lakes, we have only analyzed the monophyletic unit represented by the other two species. The freshwater fish fauna of the two large lakes are thought to be approximately 500,000 years old and are believed to have separated from an ancient lake that formed ρ mya [14,23]. The three crater lakes investigated show somewhat different patterns. Crater Lake Xiloa is believed to have at one time been a part of the large ancient lake that also included its neighbor, the great Lake Managua, though the time of its separation is uncertain [24]. The other two crater lakes, Lake Masaya and Lake Apoyo are thought to be much younger, with the age of Lake Apoyo less than 23,000 years [14]. Whole fish or fin clips were preserved in ethanol until subsequent genomic DNA extraction [method described in [25]].

Haplotype sequencing and microsatellite genotyping

Relationships among populations

Relationships among the populations from different lakes of the Midas cichlid species complex were examined with both the mitochondrial and nuclear datasets. The degree of interpopulational differentiation was measured using pairwise $F_{ST}$ estimates of both the mitochondrial haplotypes and microsatellite alleles for each pair of lakes. Significance was tested using 10,000 random permutations of genotypes among populations, implemented in ARLEQUIN v. 2.001 [26], after sequential Bonferroni correction [27].

Discrimination between populations using nuclear loci was also assessed using the Bayesian assignment procedures implemented in the software STRUCTURE v. 2.1 [28]. To identify the likely number of populations within A. citrinellus, STRUCTURE was used to assign a probability of assignment of each individual to different genotypic clusters defined by the ten microsatellite loci [29]. We used an admixture model of genetic clustering run for $10^6$ generations after a burn-in of $10^5$ generations. We assumed that there were up to seven clusters (k = 1 to 7; preliminary analyses showed that higher values of k were highly unlikely) and ran three parallel chains to estimate what number of genetic clusters had the highest probability.

Differentiation within populations

We measured several population genetic parameters that can help to distinguish between the various forces, including demographic and selective pressures, that might be influencing genetic divergence in this species complex. First, to determine if genetic change was occurring in any of the populations, we estimated deviations from Hardy-Weinberg equilibrium for each of the microsatellite loci in each population. Then, to test for the presence of selective neutrality, several metrics were estimated. Tajima's [30] D, Fu and Li's [31] F* and D*, and Fu's [32] Fs were calculated for haplotype data using DNASP v. 4.10 [33]. These methods take into account the particular apportioning of genetic variation based on a neutral model of evolution. Given similar demographic conditions, when mutations segregate in a biased manner on individual haplotypes within populations, selection can be inferred – potentially as a mechanism resulting in deviations from Hardy-Weinberg equilibrium. We also tested for selective neutrality using the sampling distribution of mtDNA alleles in a population as implemented in the Ewens-Watterson tests of selective neutrality. For this, we used Slatkin's [34,35] exact test of neutrality as implemented in ARLEQUIN. Finally, historic demographic effects such as population size expansion can be modeled using a pairwise mismatch distribution of haplotype sequences [36]. This procedure determines the probability that the observed mismatch distribution comes from a population having undergone recent population growth (i.e. is unimodal) by comparison with a randomized distribution of the observed data using a parametric bootstrap under a model of sudden demographic expansion [36,37]. Because we expect mtDNA mutation to be negligible during our study period, we combined all mtDNA haplotypes for each lake and estimated the mismatch distribution for each in ARLEQUIN.
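For readers unfamiliar with Tajima's D, the sketch below shows how the statistic is computed from the sample size $n$, the number of segregating sites $S$, and the mean number of pairwise differences. It uses the standard published formula and is not the DNASP implementation used in the study; the function name and the example input values are hypothetical and are not data from this paper.

```python
import math


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size, segregating sites and mean pairwise differences.

    Requires n >= 4 and seg_sites >= 1, because the variance term is zero otherwise.
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("Tajima's D needs n >= 4 sequences and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    numerator = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical input: 10 sequences, 16 segregating sites, mean pairwise difference 3.89.
    print(round(tajimas_d(10, 16, 3.888889), 3))
```

A negative value, as in this hypothetical input, indicates an excess of rare variants relative to the neutral expectation, which can reflect either selection or population expansion.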

Analyses of genetic diversity

For each lake sampled at each time point, standard nucleotide (π) and haplotype (H) diversities [38] were computed for mtDNA haplotypes using ARLEQUIN. These metrics provide an estimate of the mitochondrial genetic diversity present in a population, allowing the observation of changes in genetic diversity through time. Differences in π and H between lakes within years and between years for each lake were tested using one-tailed t-tests and significance was assessed following sequential Bonferroni correction [27].
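As an illustration of the two diversity measures, the following sketch computes haplotype diversity as $H = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$ and nucleotide diversity as the mean number of pairwise differences per site. These are the standard textbook definitions and may differ in detail from what ARLEQUIN computes; the function names and the haplotypes are hypothetical, not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes: list[str]) -> float:
    """H = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(haplotypes: list[str]) -> float:
    """Mean number of differences per site over all pairs of aligned haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    sample = ["AAAA", "AAAT", "AATT", "AAAA"]  # hypothetical aligned haplotypes
    print("H  =", round(haplotype_diversity(sample), 4))
    print("pi =", round(nucleotide_diversity(sample), 4))
```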

Relationship between genetic, chemical, and bacterial diversity in the Atlanto-Mediterranean bath sponge Spongia lamella

Does diversity beget diversity? Diversity includes a diversity of concepts because it is linked to variability in and of life and can be applied to multiple levels. The connections between multiple levels of diversity are poorly understood. Here, we investigated the relationships between genetic, bacterial, and chemical diversity of the endangered Atlanto-Mediterranean sponge Spongia lamella. These levels of diversity are intrinsically related to sponge evolution and could have strong conservation implications. We used microsatellite markers, denaturing gel gradient electrophoresis and quantitative polymerase chain reaction, and high performance liquid chromatography to quantify genetic, bacterial, and chemical diversity of nine sponge populations. We then used correlations to test whether these diversity levels covaried. We found that sponge populations differed significantly in genetic, bacterial, and chemical diversity. We also found a strong geographic pattern of increasing genetic, bacterial, and chemical dissimilarity with increasing geographic distance between populations. However, we failed to detect significant correlations between the three levels of diversity investigated in our study. Our results suggest that diversity fails to beget diversity within a single species and indicates that a diversity of factors regulates a diversity of diversities, which highlights the complex nature of the mechanisms behind diversity.

Possible Resolutions of the Riddle

Why then are neutral diversity levels and allozyme variation contained within such a narrow range? If neutral diversity levels are indicative of the ability of a species to adapt to novel selective pressures, then, as argued in the context of conservation biology, there may be a lower limit beyond which a species cannot maintain the variation necessary to respond to a change in environment and so is rapidly driven to extinction (e.g., [6]). In turn, there may be upper limits imposed by functional or structural constraints; for example, excessive heterozygosity could interrupt chromosome pairing [66] or lead to reproductive incompatibilities between individuals living in distant regions of the species' range (e.g., [67]). Another explanation for the upper limit could be that effective population sizes increase extremely slowly with the census population sizes, for example if species that are more numerous experience more frequent or more extreme population bottlenecks, and so remain further from their mutation-drift equilibrium diversity levels [11],[68].

Alternatively, the narrow range of diversity may be due to the effects of selection at linked sites. That habitat and range are predictive of diversity is consistent with a neutralist scenario in which aquatic species, species with larger ranges, or outcrossers have greater and more stable population sizes and therefore maintain higher neutral diversity, but it may also be consistent with models in which positive selection is ubiquitous. Under certain assumptions, widespread adaptation can constrain the range of neutral diversity across species: when adaptation is limited by the input of new mutations, larger populations experience a greater influx of beneficial mutations and therefore greater effects of variation-reducing selection (“genetic draft”) at linked neutral sites [8]. In other words, under certain assumptions, there is more genetic draft in species that experience less genetic drift, and combined, these two evolutionary forces lead to a smaller range of neutral diversity across species than expected from differences in their census population sizes [8]. Higher diversity might then be observed in species with broader ranges because local adaptation maintains variation, or because global selection (and the associated loss of diversity at linked sites) is hindered by population structure [42]. As summarized above, several lines of evidence are consistent with marked effects of selection on diversity levels. Nonetheless, the genetic draft explanation requires strong, frequent selection that reduces diversity levels by orders of magnitude, when the few available estimates (based on contrasting diversity levels in different genomic backgrounds) suggest a much weaker impact [69]–[72]. Thus, it remains unclear whether plausible selection models can readily explain the narrow range of diversity among species.

Genetic diversity and adaptation:

As the number of alleles increases so does genetic diversity in a population.

Genetic diversity allows natural selection to occur.

Evolution is a change in a population's alleles and genotypes from generation to generation. Therefore it should be considered at the population level. Five factors affect the proportion of homozygotes and heterozygotes:

  • Genetic drift: This is a change in the gene pool that occurs in a small population due to chance. Two situations can lead to genetic drift:
      • Population bottlenecks: This is when a large part of a population is wiped out due to disease, natural disasters or overhunting
      • Founder effect: This is when a new colony is founded by a small number of individuals
  • Gene flow: This is the movement of alleles from one population to another when a member moves into another population. The variety of alleles that this member carries can significantly affect the gene pool of a population, especially if it has good survival and mating skills.
  • Mutations: These are changes to an organism's DNA. The change is transferred to gametes, which immediately changes the gene pool. This is a rare event but the cumulative effect is massive. On their own, mutations play an insignificant role in changing the frequency of alleles in a population.
  • Non-random mating: The number of homozygous individuals increases when preferred organisms mate with each other, causing genotype frequencies to differ significantly from equilibrium values. This means that Hardy-Weinberg equilibrium is not maintained. NB: The Hardy-Weinberg equilibrium equation and definition do not need to be known for AS, only for A2.
  • Natural selection: Populations consist of individuals with different genetic make-ups. For example, colourful and vibrant organisms are more susceptible to predation.

Directional selection happens in bacteria, for example: when bacteria are exposed to antibiotics, resistant bacteria survive, leading to an increase in the frequency of the allele that confers antibiotic resistance.

Stabilising selection happens in an unchanging environment. It occurs, for example, in the natural selection of birth mass in humans. Extremes of the phenotype range are selected against, leading to a reduction in variation.


The traditional farming systems of southeastern Turkey are characterized by the occurrence of sympatric wild progenitor-domesticated forms of chickpea (and likewise cereals and other grain legumes). Therefore, both the authentic crop landraces and the wild populations native to the area are a unique genetic resource. Our results grant support to the notion of domestication within the natural distribution range of the wild progenitor, suggesting that the Neolithic domesticators were fully capable of selecting the desired phenotypes even when facing rare wild-domesticated introgression events.