Submitted to: Conservation Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/22/2018
Publication Date: 8/23/2018
Citation: Reeves, P.A., Richards, C.M. 2018. Biases induced by using geography and environment to guide ex situ conservation. Conservation Genetics. 19:1281-1293. https://doi.org/10.1007/s10592-018-1098-z.
Interpretive Summary: Germplasm collections hold the genetic variation necessary to improve agricultural traits. Collections should be efficient, containing maximum useful genetic variation in a small number of samples. In the past, non-genetic features such as the location or environment where a sample was collected have been used to guide collection expeditions or produce subsets of collections used by breeders. This practice stems from the idea that samples from different locations or environments likely contain more genetic variation than samples from similar sites. In this study we test the validity of this idea using high-density, genomewide data sets of single nucleotide polymorphisms (SNPs), which provide a detailed, direct measure of the genetic diversity contained in a collection. Using data from two crop species (Populus, Sorghum) and the model plant Arabidopsis, we show that diversity at the level of the DNA is unevenly represented when collections are created to maximize geographic diversity (samples from different places) or environmental diversity (samples from different environments). Some regions of the genome end up “well conserved” and exhibit much DNA-level genetic variation, while others end up “poorly conserved”, with less molecular diversity than would be expected if collections were assembled at random. In both crop species, poorly conserved regions were enriched in regulatory genes, which are well-known contributors to trait variation. Overall, diversity at genes responsible for ~10% of major molecular functions and biological processes was poorly conserved. Using geographic or environmental diversity as a surrogate for genetic diversity may result in collections that lack the specific genetic variation necessary to drive change in some agricultural traits.
Data from genomewide assays of SNP variation should be used to produce unbiased general collections, or diverse subsets that target genes implicated in the development of traits valuable to agriculture.
Technical Abstract: Ex situ germplasm collections seek to conserve maximum genetic diversity in a small number of samples. Geographic and environmental information have long been treated as surrogate estimators of genetic diversity, proposed to be useful for increasing allelic diversity of collections. We examine the effect of maximizing geographic and environmental diversity on the retention of distinct haplotype blocks in germplasm subsets, using three species with extensive genomewide genotypic data. We show that maximizing diversity in the surrogate measures produces subsets with uneven representation of haplotypic diversity across the genome. Some regions are well conserved, exhibiting high haplotypic diversity, while others are poorly conserved and contain significantly less haplotypic diversity than would be obtained via random sampling. In two of three species, poorly conserved genomic regions were enriched in regulatory genes which, as a class, contribute to phenotypic variation. Haplotypic diversity at genes responsible for ~10% of major molecular functions and biological processes was poorly conserved. We conclude that geographic and environmental diversity are poor surrogates for allelic diversity, offering little opportunity to enrich collections for haplotypic diversity overall, and ample opportunity to bias the conservation of important functional genetic variation. We propose a bioinformatic bridge between haplotypic diversity and the potential phenotypic diversity residing in collections using the Gene Ontology.
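The comparison described in the abstract, haplotypic diversity in a subset chosen to maximize geographic diversity versus the diversity expected under random sampling, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the greedy farthest-point selection, the function names, and the six-accession toy data are all assumptions made for illustration only.

```python
import math
import random

def greedy_maxmin_subset(coords, k):
    """Greedy farthest-point selection. ASSUMPTION: a simple stand-in for
    'maximize geographic diversity', not the method used in the paper."""
    chosen = [0]  # arbitrary starting accession
    while len(chosen) < k:
        # Add the accession whose nearest already-chosen neighbor is farthest away.
        best = max(
            (i for i in range(len(coords)) if i not in chosen),
            key=lambda i: min(math.dist(coords[i], coords[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen

def distinct_haplotypes(haplotypes, subset):
    """Count distinct haplotypes retained in the subset, per haplotype block."""
    n_blocks = len(haplotypes[0])
    return [len({haplotypes[i][b] for i in subset}) for b in range(n_blocks)]

def random_expectation(haplotypes, k, n_reps=1000, seed=0):
    """Mean per-block haplotype count over random subsets of size k."""
    rng = random.Random(seed)
    totals = [0] * len(haplotypes[0])
    for _ in range(n_reps):
        subset = rng.sample(range(len(haplotypes)), k)
        for b, count in enumerate(distinct_haplotypes(haplotypes, subset)):
            totals[b] += count
    return [t / n_reps for t in totals]

# Hypothetical toy data: (x, y) collection sites and one haplotype label per block.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
haplotypes = [("A", "x"), ("A", "y"), ("A", "z"),
              ("B", "x"), ("B", "x"), ("C", "x")]

geo_subset = greedy_maxmin_subset(coords, 3)
geo_counts = distinct_haplotypes(haplotypes, geo_subset)
rand_counts = random_expectation(haplotypes, 3)

for b, (g, r) in enumerate(zip(geo_counts, rand_counts)):
    status = "well conserved" if g > r else "poorly conserved"
    print(f"block {b}: geographic={g}, random mean={r:.2f} -> {status}")
```

In this toy data set, block 0 varies between distant sites, so the geographically spread subset captures more of its haplotypes than a random subset of the same size would on average. Block 1 varies only among neighboring sites, so the same subset captures fewer than random sampling. This mirrors, in miniature, the uneven representation across the genome that the abstract reports.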