Research Project: Small Grains Pangenome Assembly and Annotation

Location: Cereal Crops Research

Project Number: 3060-21000-038-37-S
Project Type: Non-Assistance Cooperative Agreement

Start Date: Aug 1, 2020
End Date: Jul 31, 2025

The objective is to generate wheat and oat pangenomes that consist of de novo whole-genome assemblies and annotations. The Wheat Pangenome consists of three lines that represent the foundation of Fusarium Head Blight (FHB) resistance in the US breeding community. The Oat Pangenome consists of 20 hexaploid oat accessions that represent global diversity. These resources will be useful for characterizing core gene sets, identifying novel gene sequences, and accumulating comparative sequence information, which will be of particular use for disease resistance, agronomic, and quality trait mapping. The project is divided into two Sub-objectives: 1. Generate genome assemblies of three wheat lines (Sumai3, Glenn, Rollag). 2. Generate genome annotations of three wheat lines and 20 oat lines.

Whole-genome sequencing technology and analysis have progressed to the point that complete, chromosome-level genome assemblies of complex crops can be generated rapidly. Wheat currently has a single reference genome, with ten new genomes close to completion. The purpose of this Wheat Pangenome Project (WPP) is to focus on FHB resistance sources for the US breeding community. Oat currently does not have a publicly available reference genome, but several have been completed in the private sector. The Oat Pangenome Project (OPP) involves multiple labs worldwide and will generate publicly accessible whole-genome sequence assemblies for approximately 20 diverse hexaploid oat accessions.

For Sub-objective 1, the approach for both assemblies will be equivalent and consist of high-molecular-weight DNA extraction, PacBio HiFi sequencing of 20-25 kb fragments to obtain approximately 25X read coverage, assembly of the reads into contigs with Canu2, and scaffolding of the contigs into pseudomolecules with Hi-C sequencing. The OPP genome sequencing efforts will be performed by collaborators outside of this agreement, and the data will be shared. Comparative analysis among the new genomes and those currently available will provide an improved foundation for trait mapping work, molecular marker development, and evolutionary studies.

In addition to sequence, annotations of the genomes are especially important to identify the organization of the transcriptome. For Sub-objective 2, mRNA will be extracted from 2-6 botanical and developmental tissue types for each of the 3 wheat and 20 oat lines and pooled for PacBio IsoSeq sequencing library construction. Size selection of the libraries above 2 kb will provide the most complete transcripts, and unsized libraries will capture small isoforms of the RNA for a more complete picture of the transcriptome. The ARS PI will extract RNA from the wheat lines, and another collaborator will extract oat RNA and create sequencing libraries.
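The sequencing target of approximately 25X read coverage with 20-25 kb fragments translates into a total yield and read count through simple arithmetic: required bases = coverage x genome size, and required reads = required bases / mean read length. The short Python sketch below illustrates that calculation. The genome size is not given in this project description; the value of about 16 Gb used here is only an assumed, commonly cited approximation for hexaploid wheat, and the function and variable names are hypothetical.

```python
import math


def required_yield_bp(genome_size_bp: int, coverage: float) -> int:
    """Total sequenced bases needed to reach the target mean coverage."""
    return math.ceil(genome_size_bp * coverage)


def required_reads(genome_size_bp: int, coverage: float, mean_read_len_bp: int) -> int:
    """Approximate number of reads needed at a given mean read length."""
    return math.ceil(genome_size_bp * coverage / mean_read_len_bp)


# ASSUMPTION: ~16 Gb is an approximate hexaploid wheat genome size,
# not a figure stated in the project description.
ASSUMED_GENOME_SIZE_BP = 16_000_000_000
TARGET_COVERAGE = 25  # "approximately 25X" from the approach

if __name__ == "__main__":
    total_bp = required_yield_bp(ASSUMED_GENOME_SIZE_BP, TARGET_COVERAGE)
    print(f"Total yield needed: {total_bp / 1e9:.0f} Gb")
    # Fragment lengths of 20-25 kb are taken from the approach.
    for read_len in (20_000, 25_000):
        n_reads = required_reads(ASSUMED_GENOME_SIZE_BP, TARGET_COVERAGE, read_len)
        print(f"Reads needed at {read_len // 1000} kb: {n_reads / 1e6:.0f} million")
```

Substituting a different genome size estimate (for example, one for hexaploid oat) changes only the assumed constant; the relationship between coverage, yield, and read count stays the same.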