
Research Project: Soybean Seed Quality Improvement through Translational Genomics

Location: Plant Genetics Research

2020 Annual Report

Objective 1: Develop and make available new approaches to evaluate gene functions in gene networks and verify these tools by examining previously identified gene networks in soybean.
Objective 2: Discover, characterize, and make available genes for industry-relevant protein and oil traits from new and existing genetic populations created through various methods, such as fast neutrons, conventional crossing, reverse genetics (TILLING), or mining exotic diversity contained in the USDA National Plant Germplasm System.

We will apply a genome-wide reverse-engineering approach to reconstruct a gene regulatory network in soybean using in-house and publicly available transcriptome sequencing data. An eQTL mapping analysis will be conducted with seed transcriptome sequencing and genome sequencing data from wild and cultivated soybean genotypes to identify trans-acting eQTLs and reveal the relationships between candidate regulatory genes/alleles and their associated genes. The reconstructed gene regulatory network, the regulatory relationships generated from the eQTL analysis, and the co-expression gene network that we previously modeled will be compared to evaluate each regulatory relationship (edge) and generate a consensus soybean seed gene regulatory network. A set of CRISPR/Cas9 genome editing vectors for a regulatory gene (hub) will be constructed to alter its regulatory function in “transgenic” soybean and thereby validate its role in the network. In addition, a set of big-data analysis methodologies and data mining strategies will be developed to integrate the large amount of publicly available and in-house QTL mapping data, transcriptome and genome sequencing data, the soybean seed gene regulatory networks predicted above, and metabolic pathways related to seed storage reserves, in order to identify putative genes/alleles that cause variation in oil and/or protein content in soybean. We will sequence transcriptomes of soybean seeds containing different alleles of a putative gene to determine the transcriptome response to the allelic variation, validate the gene's regulatory function, and gain insight into its mode of action in regulating oil and/or protein production in seeds.
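The comparison of edges across the three network sources can be illustrated with a simple voting scheme. The sketch below is only one possible reading of "consensus": it assumes each network is reduced to a set of directed (regulator, target) pairs and keeps an edge when at least two sources support it. The function name `consensus_edges`, the `min_support` threshold, and all gene names are hypothetical and are not taken from the project.

```python
from collections import Counter


def consensus_edges(networks, min_support=2):
    """Return edges found in at least `min_support` of the input networks.

    Each network is an iterable of (regulator, target) tuples. This is an
    illustrative voting rule, not the project's actual method.
    """
    votes = Counter()
    for edges in networks:
        # Count each edge at most once per network.
        for edge in set(edges):
            votes[edge] += 1
    return {edge for edge, count in votes.items() if count >= min_support}


# Hypothetical edge sets standing in for the three evidence sources.
reverse_engineered = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene3")}
eqtl_derived = {("TF_A", "gene1"), ("TF_B", "gene3"), ("TF_C", "gene4")}
coexpression = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_C", "gene5")}

consensus = consensus_edges(
    [reverse_engineered, eqtl_derived, coexpression], min_support=2
)
print(sorted(consensus))
```

Co-expression edges are undirected in practice, so a real comparison would need a rule for matching them against directed regulatory edges; the sketch sidesteps this by treating every edge as a directed pair.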

Progress Report
To discover genes and gene networks controlling soybean seed protein and oil quality, ARS scientists in Saint Louis, Missouri, analyzed an additional 750 genome sequences and integrated them with 1,500 previously analyzed genome sequences to produce a collection of 2,250 whole-genome sequences. They identified several putative genes underlying protein and oil quantitative trait loci (QTLs) using a big-data-driven technology platform, characterized a major QTL gene in detail, and submitted a manuscript reporting the result. They analyzed transcriptome sequencing data from over 3,000 soybean samples to quantify and normalize transcript accumulation for all soybean genes (56,000 genes), and generated a co-expression network underlying soybean development. In addition, they discovered several genetic regions (QTLs) and gene variants associated with other important soybean seed quality traits, including methionine, oleic acid, and trypsin inhibitors.
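As a rough illustration of how a co-expression network can be derived from normalized expression values, the sketch below links two genes when the absolute Pearson correlation of their expression profiles across samples meets a threshold. The report does not state which correlation measure, threshold, or normalization was used, so the 0.9 cutoff, the function names, and the toy expression values are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # A constant profile carries no co-expression signal.
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.9):
    """Return undirected gene pairs whose |Pearson r| is >= threshold.

    `expression` maps a gene name to its normalized values across samples.
    The threshold is an assumed illustrative value.
    """
    edges = set()
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene_1], expression[gene_2])
        if abs(r) >= threshold:
            edges.add((gene_1, gene_2))
    return edges


# Hypothetical normalized expression values for three genes in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
edges = coexpression_edges(expression)
print(edges)
```

With roughly 56,000 genes, an all-pairs loop like this one would be far too slow; a matrix-based correlation over the whole expression table would be the practical choice at that scale.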

1. Discovery of a key gene for soybean protein, oil and yield improvement. Soybean is one of the two most important crops in U.S. agriculture. Soybean was domesticated and further improved to provide highly valuable oil and protein for human consumption and animal feed. Soybean is also the most effective crop for producing plant-based protein. It is critical to discover the genes controlling protein and oil and to reveal the molecular basis of how protein and oil have been selected together with other major soybean traits during soybean improvement. Having developed and applied a big-data technology platform, ARS researchers in Saint Louis, Missouri, identified the gene underlying a major protein and oil quantitative trait locus (QTL) on Chromosome 15, which researchers have tried to clone since it was discovered in 1992. The team demonstrated that the gene encodes a sugar transporter and is specifically expressed in the soybean seed coat, which plays a key role in moving nutrients produced by photosynthesis to seed storage tissues for the synthesis of seed oil and protein. The gene is associated with seed protein, oil and weight (a seed yield component), three of the most important traits in soybean agriculture. A small DNA deletion in the gene caused higher oil content, larger seeds and lower protein content. The mutated gene has been used extensively in U.S. soybean improvement. The gene mutation is a major cause of low protein in U.S. soybean cultivars, which poses a significant threat to U.S. soybean competitiveness in the world. Having analyzed whole-genome sequences of 631 soybean lines, the research team identified a set of soybean germplasm that can serve as breeding material for improving soybean protein content.
Comprehensive knowledge of the molecular basis underlying this major multifunctional QTL and its association with soybean improvement would be highly valuable to soybean researchers designing new strategies for soybean seed quality and yield improvement through breeding and biotechnological approaches.

Review Publications
Han, Q., Bartels, A., Cheng, X., Meyer, A., An, Y., Hsieh, T., Xiao, W. 2019. Epigenetics regulates reproductive development in plants. Plants. 8(12):564.
Zhang, D., Zhang, H., Hu, Z., Chu, S., Yu, K., Lv, L., Yang, Y., Zhang, X., Chen, X., Kan, G., Tang, Y., An, Y., Yu, D. 2019. Artificial selection on GmOLEO1 contributes to the increase in seed oil during soybean domestication. PLoS Genetics. 15(7):e1008267.