Location: Plant, Soil and Nutrition Research
Project Number: 8062-21000-043-005-S
Project Type: Non-Assistance Cooperative Agreement
Start Date: Aug 1, 2020
End Date: Jul 31, 2022
The proposed work is a joint effort to develop genomic and bioinformatics approaches that accelerate the breeding of maize, sorghum, range grasses, and specialty crops. The goal is to leverage knowledge developed in crops and model systems to create genomic selection models anchored on shared mechanisms found across angiosperms. Machine learning, statistical, and genomic approaches will be used to create these models. The objectives are to:
1. Develop transferable DNA-level models that predict gene expression and protein activity based on model systems and major crops.
2. Sequence and compare the genomes of angiosperms to estimate the evolutionary fitness consequences of novel mutations.
3. Contribute to global efforts to understand how maize and other grasses interact with their environments (abiotic stresses: heat, cold, and nitrogen) and how they contribute to building soil carbon and nitrogen. Changing the growth pattern of warm-season grasses is key to future productivity in high heat, to the sustainability of soils, and to reducing nitrogen inputs.
4. Deploy tools and information systems that help geneticists and breeders accelerate the breeding of plants.
Each objective will be approached as follows:
1. RNA expression and protein profiles will be developed for maize and other grasses across a wide range of abiotic conditions and tissues. These profiles will be integrated and compared with existing plant datasets to train machine learning models that accurately predict RNA expression and protein levels.
2. Maize and related grasses will be sequenced and assembled using the latest whole-genome DNA sequencing technologies and assembly approaches. These genomes will be compared to all other available genomes, and machine learning models that account for duplications and genome quality will be developed to estimate the fitness consequence of every base in the genome.
3. Understanding abiotic interactions will involve three strategies:
   a. Thirty public sector research groups across the country are part of the U.S. Genomes to Field Genotype by Environment experiment to understand how maize genetics interacts with its environment. This project will use genomics to sequence and profile the germplasm in this national experiment, and will participate in high-intensity phenotypic evaluation with robotic measurement and analysis to understand how daily weather conditions interact with genetics.
   b. Frost tolerance genes will be cloned in Tripsacum dactyloides, a species in the sister genus of maize, by mapping in bulk segregant populations, followed by expression profiling and collaboration to create similar alleles in maize.
   c. Perennial grasses naturally recycle nitrogen and enhance soil carbon. Using association genetics of soils with annual and perennial grasses, the Cooperator will evaluate whether there is substantial genetic variation that contributes to nitrogen recycling and soil carbon.
4. The TASSEL software project is the leading international software platform for associating plant diversity with function. This project will continue to evolve TASSEL to work with software workbench platforms (e.g., RStudio and Jupyter Notebooks), while enhancing its capability to apply DNA-level machine learning in applied breeding contexts. The Practical Haplotype Graph (PHG) is a first-of-its-kind haplotype graph that can be applied to both genomics and breeding, and it will be enhanced to scale with datasets that are likely to grow 10- to 100-fold over the next several years.