Plant Genetics Research, Columbia, Missouri (Midwest Area), Research Project #434365

Research Project: Soybean Seed Quality Improvement through Translational Genomics

2021 Annual Report

Objectives
Objective 1: Develop and make available new approaches to evaluate gene functions in gene networks and verify these tools by examining previously identified gene networks in soybean.
Objective 2: Discover, characterize, and make available genes for industry-relevant protein and oil traits from new and existing genetic populations created through various methods, such as fast neutrons, conventional crossing, reverse genetics (TILLING), or mining exotic diversity contained in the USDA National Plant Germplasm System.

Approach
We will apply a genome-wide reverse engineering approach to reconstruct a gene regulatory network in soybean using in-house generated and publicly available transcriptome sequencing data. An eQTL mapping analysis will be conducted with seed transcriptome sequencing and genome sequencing data from wild and cultivated soybean genotypes to identify trans-acting eQTLs and reveal the relationships between candidate regulatory genes/alleles and their associated genes. The reconstructed gene regulatory network, the regulatory relationships generated from the eQTL analysis, and the co-expression gene network that we previously modeled will be compared to evaluate each regulatory relationship (edge) and generate a consensus soybean seed gene regulatory network. A set of CRISPR/Cas9 genome editing vectors for a regulatory gene (hub) will be constructed to alter its regulatory function in “transgenic” soybean and thereby validate its regulatory functions in the network. In addition, a set of big-data analysis methodologies and data mining strategies will be developed to integrate the large amount of publicly available and in-house generated QTL mapping data, transcriptome and genome sequencing data, the soybean seed gene regulatory networks predicted above, and seed storage reserve-related metabolic pathways to identify putative genes/alleles that cause the variation in oil and/or protein content in soybean. We will sequence transcriptomes of soybean seeds containing different alleles of a putative gene to determine the transcriptome response to the allelic variation, both to validate the gene's regulatory function and to gain insight into its underlying mode of action in regulating oil and/or protein production in seeds.
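The edge-by-edge comparison of the three networks described above can be illustrated with a small voting sketch. This is only an illustration under stated assumptions, not the project's actual method: the gene names, the edge lists, the `consensus_edges` helper, and the `min_support` threshold are all hypothetical, and edges are treated as directed regulator-to-target pairs even though a co-expression network is normally undirected.

```python
from collections import Counter


def consensus_edges(networks, min_support=2):
    """Return {edge: support} for edges found in at least `min_support` networks.

    `networks` is a list of edge lists; each edge is a (regulator, target) tuple.
    The threshold is an assumption made for this sketch.
    """
    counts = Counter()
    for edges in networks:
        # Count each edge at most once per network, even if it is listed twice.
        counts.update(set(edges))
    return {edge: n for edge, n in counts.items() if n >= min_support}


# Hypothetical placeholder edges for the three evidence sources.
reverse_engineered = [("TF1", "GeneA"), ("TF1", "GeneB"), ("TF2", "GeneC")]
eqtl_derived = [("TF1", "GeneA"), ("TF2", "GeneC"), ("TF3", "GeneD")]
coexpression = [("TF1", "GeneA"), ("TF1", "GeneB"), ("TF3", "GeneE")]

consensus = consensus_edges([reverse_engineered, eqtl_derived, coexpression])

# Rank regulators by the number of consensus targets; the top one is a
# candidate "hub" in the sense used in the text.
out_degree = Counter(regulator for regulator, _target in consensus)
hub, hub_targets = out_degree.most_common(1)[0]

print(consensus)
print(hub, hub_targets)
```

In this toy input, only edges backed by two or more sources survive, and the regulator with the most surviving targets is the one that would be a natural candidate for validation by genome editing.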

Progress Report
With advances in next-generation sequencing and other new high-throughput technologies, the soybean research community has generated an unprecedented amount of biological data over the past decades, including genome sequencing, gene expression, and a variety of phenotypic data for agriculturally important traits. ARS scientists in Saint Louis, Missouri, have been developing a multi-disciplinary, big-data technology platform to better translate this large volume of biological data into information useful for soybean product development, for increasing US soybean research efficiency and competitiveness, and for improving soybean seed quality. The team consolidated, quality controlled, and analyzed a total of 5,228 whole genome sequences, 3,930 soybean transcriptome sequences, and 0.77 million phenotypic data points for 122 soybean traits that were previously generated by researchers worldwide. Several big-data analysis strategies and algorithms were developed and have been used to identify 1,321 quantitative trait loci (QTLs) for 56 soybean traits. For soybean seed improvement, 88 QTLs for 18 seed quality traits, including protein, oil, amino acids, and fatty acids, were characterized in detail. The researchers are applying gene editing and other gene engineering technologies to modify a selection of QTL genes related to protein and oil content and amino acid profile in order to confirm their gene functions and associated networks. In addition, the scientists discovered the genes and causative DNA variants underlying two high-effect protein and oil QTLs. A goal of this work is to provide insight into their underlying molecular mode of action and their roles in soybean domestication and improvement. The team is continuing to improve the big-data technology platform, characterize and engineer the discovered genes and networks, and test new strategies for improving soybean seed protein and other seed quality and yield traits.

Accomplishments
1. Discovery of 88 genetic variants and their relationships for soybean seed quality improvement through an integrative big-data technology platform. Soybean is one of the two most important crops in United States agriculture. Demand for soybean is mainly driven by its highly valuable protein and oil content for human consumption and animal feed. In addition, the composition of amino acids and fatty acids in soybean seeds directly affects their nutritional value. Variation in soybean quality traits is often determined by many QTLs in soybean populations, and discovering their underlying genes is critical for soybean improvement. ARS researchers in Saint Louis, Missouri, have developed a “big-data” platform that integrates data science, multi-disciplinary approaches, and the large amount of genomic, gene expression, and phenotypic data available, and have applied the platform to identify 88 QTLs for 18 soybean quality traits. The researchers identified candidate genes underlying a subset of the QTLs and illustrated how they interact with each other in determining those soybean seed quality traits. The knowledge gained from the study provides a solid foundation for researchers to design new strategies to develop soybean cultivars with superior seed quality through molecular breeding and biotechnology.

Review Publications
Zhang, H., Goettel, W., Song, Q., Jiang, H., Hu, Z., Wang, M.L., An, Y. 2020. Selection of GmSWEET39 for oil and protein improvement in soybean. PLoS Genetics. 16(11):e1009114. https://doi.org/10.1371/journal.pgen.1009114.
Zhu, H., Chen, P., Zhong, S., Dardick, C.D., Callahan, A.M., An, Y., Van Nocker, S., Yang, Y., Zhong, G., Abbott, A., Liu, Z. 2020. Thermal-responsive genetic and epigenetic regulation of DAM cluster controlling dormancy and chilling requirement in peach floral buds. Horticulture Research. 7:Article 114. https://doi.org/10.1038/s41438-020-0336-y.
Zhang, H., Hu, Z., Yang, Y., Liu, X., Lv, H., Song, B., An, Y., Li, Z., Zhang, D. 2021. Transcriptome profiling reveals the spatial-temporal dynamics of gene expression essential for soybean seed development. BMC Genomics. 22:Article 453. https://doi.org/10.1186/s12864-021-07783-z.
