Research Project: Curation and Research to Safeguard and Expand Collections of Plant and Microbial Genetic Resources and Associated Descriptive Information

Location: Agricultural Genetic Resources Preservation Research

Title: Pooled sequencing data for management and use of heterogeneous germplasm: Examples from sugar beet

Author
Reeves, Patrick
Reilley, Ann
Panella, Leonard
Richards, Christopher

Submitted to: Crop Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/17/2025
Publication Date: N/A
Citation: N/A

Interpretive Summary: Gene banks are large collections of living plant material containing genes that can be used to improve crops. The plant material is maintained as "accessions", often bags of seed, that differ from one another. In many crops, the seeds within an accession are genetically identical; these are "homogeneous" accessions. In other crops, like sugar beet, the seeds within an accession differ; these are "heterogeneous" accessions. Crop improvement proceeds more quickly when the genome has been sequenced. A genome sequence of a single individual from a homogeneous accession accurately represents the accession, but a single genome sequence from a heterogeneous accession does not. Sequencing the genomes of many individuals from each heterogeneous accession is too expensive. The cost of characterizing accessions can be reduced by combining individuals into a single pool before sequencing. This approach, called "pooled sequencing", has rarely been used because the data need to be analyzed differently. This study demonstrates proper methods for using pooled sequencing data in common gene bank management activities, such as confirming accession identity, revealing genetic structure, identifying errors, estimating accession origins, and determining genetic trends among breeding programs that deposit accessions into the gene bank. We also show how scientists who use gene banks can benefit from the same data, for example by finding genes with useful new effects without having to grow the plants, and by identifying genes that produce important traits. Generating whole-genome pooled sequence data across entire collections could be an important one-time investment to help move valuable genes from gene banks to crops.

Technical Abstract: We promote whole-genome pooled sequencing data as a persistent, reusable resource to improve management and utilization of heterogeneous germplasm collections. Using 14.9 Tbp of DNA sequence data from 4987 individuals in the sugar beet primary gene pool, we demonstrate appropriate analytical procedures to reveal population structure, assemble optimized subsets (core collections), perform targeted allele mining, and contribute to the gene discovery pipeline. Table, sugar, fodder, and leaf beet types were found to be genetically distinct, with an affinity shown between wild and leaf beets. Differing genetic trajectories were inferred for germplasm releases from four regional USDA-ARS sugar beet breeding programs. Using a panel designed to represent broad sense variation in B. vulgaris ssp. maritima, we show that the wild relative is variable and divergent, and that it remains underexploited despite an established, successful history of wild introgressions. We find that potentially novel Rz2-type rhizomania disease resistance alleles are common in table beets and the wild relative but uncommon in US sugar beet germplasm. Phenotypic characterization data already held in gene banks can be used with pooled sequencing data to accelerate gene discovery. A whole-genome signature-of-selection scan identified BvWIP2 as a candidate gene causing monogerm seed development, a valuable trait in beets, consistent with a recent association study. Mass production of whole-genome pooled sequencing data sets linked to gene bank collections would minimize the need to resequence individuals, in some cases eliminating the wet-lab component of genetic studies and shifting the emphasis of gene discovery to phenotyping and bioinformatics.
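
Illustrative example: Pooled sequencing data are usually summarized as allele frequencies per accession rather than genotypes per individual. The Python sketch below shows that general idea only; it is not the analytical procedure used in this study, which the abstract does not specify. The input layout (alternate-allele and total read counts per site), the function names, the minimum depth threshold, and the distance measure are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical input layout: one (alt_reads, total_reads) pair per variant site,
# taken from the sequenced pool of a single accession.
PoolCounts = List[Tuple[int, int]]


def allele_frequencies(counts: PoolCounts, min_depth: int = 10) -> List[Optional[float]]:
    """Estimate the alternate-allele frequency at each site from pooled read counts.

    Sites with fewer than `min_depth` reads are returned as None because
    the frequency estimate would be too noisy (threshold is an assumption).
    """
    freqs: List[Optional[float]] = []
    for alt, total in counts:
        if total < min_depth:
            freqs.append(None)
        else:
            freqs.append(alt / total)
    return freqs


def mean_frequency_distance(f1: List[Optional[float]], f2: List[Optional[float]]) -> float:
    """Mean absolute allele-frequency difference over sites covered in both pools.

    A deliberately simple stand-in for the genetic distance measures used in practice.
    """
    diffs = [abs(a - b) for a, b in zip(f1, f2) if a is not None and b is not None]
    if not diffs:
        raise ValueError("no sites with sufficient depth in both pools")
    return sum(diffs) / len(diffs)


# Made-up read counts for two accessions at three sites.
pool_a: PoolCounts = [(12, 40), (0, 35), (3, 5)]    # third site is below min_depth
pool_b: PoolCounts = [(30, 40), (0, 30), (20, 25)]

freq_a = allele_frequencies(pool_a)
freq_b = allele_frequencies(pool_b)
distance = mean_frequency_distance(freq_a, freq_b)
print(freq_a, freq_b, distance)
```

Accessions with a small distance share similar allele frequencies. Comparisons of this kind could, in principle, support tasks named in the abstract such as confirming accession identity or revealing population structure, although real analyses would also need to account for pool size and sequencing error.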