Research Project: Characterization of Genetic Diversity in Soybean and Common Bean, and Its Application toward Improving Crop Traits and Sustainable Production

Location: Soybean Genomics & Improvement Laboratory

2023 Annual Report

Objective 1: Discover QTL and genes controlling biotic and abiotic stress tolerance and agronomic and quality traits in soybean and common bean, and develop new DNA markers that define haplotype variation across new and previously identified genomic regions. [NP301, C1, PS1A; C3, PS3B] The aim of Objective 1 is to develop community resources for efficient identification of genes/QTL affecting a range of traits and, in collaboration with breeders, to facilitate marker-assisted selection of alleles in soybean and common bean. These resources include highly polymorphic markers, core germplasm collections, and genotypic datasets of new exotic elite germplasm introduced to the USDA Soybean Germplasm Collection.

Objective 2: Evaluate diverse soybean populations developed from hybridization with wild soybean to discover unique QTL controlling seed protein and oil content, develop molecular markers, and make these available to breeders for improving soybean quality. [NP301, C1, PS1A; C3, PS3B] Because many wild soybean accessions may carry alleles controlling high protein and oil content that differ from those in cultivated soybean, we will explore wild soybean for the improvement of U.S. soybean seed protein and oil content, using the markers developed under Objective 1 and genomic tools previously developed in our laboratory.

Objective 3: Characterize genetic diversity of the Soybean Rhizobium Germplasm Collection using whole genome sequencing, evaluate nitrogen fixation efficiency of the core strains, and use the information to identify rhizobium genes associated with host-specific nodulation and nitrogen fixation in specific soybean genotype/rhizobium symbioses. [NP301, C1, PS1A; C3, PS3B] Genetic diversity of the rhizobia will be evaluated using genomic information, and their influence on nitrogen fixation efficiency in soybean will be analyzed. The research will identify efficient strains and genes for enhanced nitrogen fixation in soybean, leading to better use of the diversity of rhizobium strains and soybean ancestors to improve biological nitrogen fixation in commercial soybean cultivars.

Objective 1: Solexa short genomic DNA sequences from 16 diverse genotypes of different common bean market classes will be aligned to the common bean whole genome sequence (WGS) for SSR marker discovery. After filtering, primer sets will be designed to amplify the SSRs. A subset of 100 primer pairs will be randomly selected and tested for polymorphism using genomic DNA from the 16 diverse common bean genotypes. A total of 12 pairs of diverse genotypes from different market classes of the Andean Diverse Panel of common bean will be sequenced. Called SNPs will be filtered on a number of criteria for the BeadChip assay. SNPs that are polymorphic within multiple market classes will be added to the Illumina Infinium BARCBean6K_3 BeadChip pool or used as KASP markers to fine-map genes/QTL in targeted genomic regions. Based on the SNP data of the >18,000 cultivated soybean accessions assayed with the SoySNP50K BeadChip, core sets of soybean accessions will be created for each soybean maturity group. The software Core Hunter 3 will be used to select core collections with high allelic richness.

Objective 2: A nested association mapping panel consisting of 150-300 F6 lines from each of 10 crosses of NC-Raleigh x wild soybean from the wild soybean core collection will be developed. The parents and the RILs will be grown in the field at two locations in two years. DNA isolated from the RILs and parents will be genotyped with Illumina BARCSoySNP6K BeadChips. Protein content and oil content of the parents and lines will be measured using a DA 7250 NIR Analyzer. The dataset will be used to identify QTL, genes, and haplotypes controlling high seed protein and oil content in wild soybean that can be used for improving cultivated soybean, and to predict the accuracy of genomic selection.

Objective 3: Genomic DNA of 760 soybean Bradyrhizobium strains will be isolated and sequenced using a NextSeq500 sequencer. The resulting sequence will be aligned to the WGS of the B. japonicum strain USDA110 for variant discovery. Redundant or highly similar strains, defined as those with 99.9% similarity among the soybean rhizobia, will be identified and grouped into clusters. One accession from each cluster will be evaluated for nitrogen fixation efficiency using 8 ancestral cultivars that contribute more than 70% of the genetic diversity to the Southern and Northern American elite cultivars. Plants will be measured for chlorophyll content and biomass with or without inoculation of the strains, and scored for vegetative growth relative to the growth of plants inoculated with USDA110, a recommended soybean strain. The test in the eight ancestors will be carried out in a greenhouse with replications.
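
The redundancy step in Objective 3 (grouping strains at 99.9% similarity and keeping one accession per cluster) can be illustrated with a minimal Python sketch. The report does not say how similarity is computed or how clusters are formed, so everything here is an assumption made for illustration: similarity is taken as the fraction of matching calls across equal-length call strings, clusters are formed by single linkage, and the representative is simply the first strain name in sorted order. The function names and the strain names in the usage note are hypothetical.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length call strings agree."""
    if len(a) != len(b) or not a:
        raise ValueError("call strings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_strains(calls: dict[str, str], threshold: float = 0.999) -> list[list[str]]:
    """Single-linkage clustering (an assumption, not the report's method):
    two strains share a cluster if a chain of pairs with identity >= threshold
    connects them."""
    parent = {name: name for name in calls}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(calls, 2):
        if identity(calls[a], calls[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for name in calls:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


def representatives(clusters: list[list[str]]) -> list[str]:
    """Pick one strain per cluster (here: the first name in sorted order)."""
    return [members[0] for members in clusters]
```

For example, with three hypothetical strains of 2,000 calls each, where `strain_2` differs from `strain_1` at one position and `strain_3` at ten, `cluster_strains` groups the first two together and leaves the third on its own, so two representatives would go forward to greenhouse testing.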

Progress Report
This is the final report for project 8042-21000-289-000D, which terminated on May 13, 2023. A new NP301 OSQR-approved project, entitled “Characterization and Utilization of Genetic Diversity in Soybean and Common Bean and Management and Utilization of the National Rhizobium Genetic Resource Collection”, is being established. Extensive results were realized over the 5 years of the project. Under Objective 1, we developed two low-cost genomic assays for soybean: BARCSoySNP3K, containing 3K SNPs, and SoySNP1K, containing 1K SNPs. Markers in the assays were selected based on haplotype block and polymorphism analyses among 18,000 soybean genomes. These SNPs have been shown to be of high genotyping quality based on the analysis of >30,000 southern and northern soybean samples. The 3K assay is commercialized by Illumina, Inc., and the 1K assay is commercialized by Agriplex Genomics, Cleveland, Ohio. We also developed a common bean assay, BARCBean12K, containing 12,000 SNPs based on whole-genome sequence analysis of 52 different Andean and 13 Central American bean germplasm accessions from different market classes. The assay includes 6,000 highly polymorphic SNPs selected from millions of SNPs identified in the 52 Andean bean accessions, as well as 6,000 SNPs from the BARCBean6K_3 assay being used by common bean breeders and geneticists. The 12K assay is commercialized by Illumina, Inc., and is used by bean researchers at USDA-ARS, universities, and seed companies to discover genes that control different traits, to assist in breeding selection, and to differentiate bean varieties. Using these tools as well as the SoySNP50K, BARCSoySNP6K, and BARCBean6K_3 assays, we analyzed hundreds of soybean and common bean populations created by 49 collaborators in the United States and other countries. 
The analyses mapped QTL/genes controlling numerous soybean traits, including resistance to sudden death syndrome, resistance to cyst nematode and to seed and seedling rot, aluminum tolerance, drought tolerance, low seed coat deficiency, seed composition, seed size, and symbiotic compatibility. They also led to the development of markers tagging low Kunitz trypsin inhibitor and to genomic selection of seed composition. This work was done in collaboration with researchers at Iowa State University, Michigan State University, the University of Nebraska, Virginia Tech, the University of Georgia, the University of Tennessee, the University of Missouri at St. Louis, and other institutions. In common bean, the analyses led to the identification of QTL/genes controlling Fusarium oxysporum resistance, anthracnose resistance, and pod and seed size; the development of markers associated with resistance to bean golden yellow mosaic virus, anthracnose, and angular leaf spot and with post-processing color retention in black bean; and the development of lines with increased cysteine and methionine concentration. This work was done in collaboration with researchers at USDA-ARS, Prosser, Washington; the University of Michigan; Agriculture and Agri-Food Canada; SERIDA in Spain; the University of Nova de Lisboa in Portugal; and universities in Brazil. Based on the dataset of >18,000 cultivated soybean accessions genotyped with 42,509 SNPs in the SoySNP50K assay, we selected core sets of soybean accessions from different maturity groups (MGs) that capture >95% of the SNP diversity among the accessions in those MGs. Some of the core sets have been provided to soybean researchers working in different soybean-growing regions: a core set of 500 accessions from MG II, III, and IV was provided to Purdue University; a set of 500 from MG IV and V to Virginia Tech; 300 from MG IV, V, VI, and VII to USDA-ARS, Raleigh, North Carolina; 500 from MG V, VI, VII, VIII, and IX to the University of Georgia; 200 from MG 000-X to USDA-ARS, Jackson, Tennessee; and 400 from MG 000-X to the University of Missouri. In addition, based on the dataset of 1,168 G. soja accessions genotyped with 42,509 SNPs, we selected a set of 400 wild soybeans from MG III and IV for USDA-ARS, Stoneville, Mississippi, and a set of wild soybeans from MG 000-X for the University of Missouri and USDA-ARS, Raleigh, North Carolina. These core collections are critical for researchers to efficiently evaluate and use the large number of accessions in the collection for the discovery of novel genes/QTL controlling important traits and of germplasm harboring desired traits. In collaboration with scientists at the USDA Soybean Germplasm Collection, we genotyped 1,056 soybean germplasm accessions from Korea, Vietnam, and other countries with the SoySNP50K BeadChips. We compared the genetic relationship of these accessions with the 562 elite cultivars in the USDA-ARS Soybean Germplasm Collection. Subpopulations that are not represented in current elite cultivars will be provided as a pool of untapped genetic variability that can be exploited for genetic advances in abiotic and biotic stress resistance, seed quality traits, and productivity. The resulting genotypic dataset was shared with collaborators and is publicly available.

Under Objective 2, a population of 10 families was constructed by crossing 10 wild soybeans with a common cultivated soybean parent, NC-Raleigh (PI 641156), and advancing the progeny by the single-seed descent method. More than 1,100 RILs from the ten crosses were grown in North Carolina and Beltsville for two years. 
Protein content and oil content of more than 6,000 plots were obtained in collaboration with researchers at the University of Georgia. All lines and parents were assayed with BARCSoySNP6K. Through single-family and multi-family association analysis, 57, 71, and 59 genetic loci associated with seed protein content, oil content, and protein plus oil content, respectively, were identified. This study provided new insight into the genetic characteristics of cultivated x wild soybean hybrid populations and detected a number of novel wild soybean genomic regions that could be introduced into cultivated soybean to improve these traits. In collaboration with scientists at USDA-ARS in St. Louis, Missouri, we analyzed genomic regions that control protein content in the G. max x G. soja family and 631 soybean accessions, and identified the sucrose transport gene on chromosome 15 and the domestication gene on chromosome 20 that regulate seed protein content and oil content. These are the first reports of the genes responsible for the two most important loci controlling protein and oil content in soybean. The results were published in PLOS Genetics and Nature Communications, respectively.

Under Objective 3, about 600 soybean Bradyrhizobium strains were grown, and DNA was isolated from cultured cells. Sequencing of all the isolates has been completed, and the whole genome sequence assemblies of 100 representative accessions are available on the JGI website (JGI Genome Portal). We evaluated the nitrogen fixation efficiency of 104 different rhizobia accessions using 8 ancestral cultivars that contributed more than 70% of the genetic diversity to the Southern and Northern American elite cultivars. Plants were measured for chlorophyll content, root nodule number, and vegetative growth. The strains were inoculated in the greenhouse in triplicate. 
We observed a large amount of trait variation among the rhizobia accessions and identified eight accessions that may be as efficient as, or more efficient than, the recommended soybean strain USDA 110 at nitrogen fixation. These strains have potential for commercial use. From the current project, a total of 69 papers have been published in peer-reviewed journals.
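
The core-set result reported under Objective 1 (subsets of accessions that capture >95% of the SNP diversity) can be made concrete with a small Python sketch. It is not the Core Hunter 3 algorithm named in the approach, and the report does not define its diversity measure. As stated assumptions, diversity here means allele coverage (the share of SNP/allele combinations in the full collection that also occur in the subset), each genotype call is a single allele as in an inbred line, `None` marks a missing call, and selection is a plain greedy heuristic. All names are hypothetical.

```python
def allele_pairs(genotype):
    """Set of (snp_index, allele) pairs carried by one accession; None = missing."""
    return {(i, allele) for i, allele in enumerate(genotype) if allele is not None}


def greedy_core(genotypes, target=0.95, max_size=None):
    """Greedy stand-in for core selection (an assumption, not Core Hunter 3).

    Repeatedly adds the accession that contributes the most SNP/allele pairs
    not yet covered, until the target coverage or the size limit is reached.
    Returns (selected accession names, allele coverage of the selection).
    """
    pairs = {name: allele_pairs(g) for name, g in genotypes.items()}
    all_pairs = set().union(*pairs.values()) if pairs else set()
    if not all_pairs:
        return [], 0.0

    limit = len(pairs) if max_size is None else min(max_size, len(pairs))
    core, covered = [], set()
    while len(core) < limit and len(covered) / len(all_pairs) < target:
        candidates = [name for name in sorted(pairs) if name not in core]
        # Ties are broken by name order because max() keeps the first maximum.
        best = max(candidates, key=lambda name: len(pairs[name] - covered))
        if not pairs[best] - covered:
            break  # nothing left to gain
        core.append(best)
        covered |= pairs[best]
    return core, len(covered) / len(all_pairs)
```

With four hypothetical accessions scored at four SNPs, two of which carry opposite alleles at every SNP, the greedy pass selects those two and reaches full allele coverage. A real analysis would also weigh allele frequencies and genetic distances, which this sketch ignores.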

1. Discovery of the gene controlling frogeye leaf spot resistance in soybean. Frogeye leaf spot is a foliar disease of soybean caused by a fungus. In severe cases, foliar lesions can coalesce and cover more than 30% of the leaf area, at which point the leaves often wither and abscise. The disease, one of the five most yield-reducing diseases in the southern United States, has recently become an increasing problem in the midwestern and northern United States because of warmer winters and the cultivation of susceptible varieties. Limited sources of resistance have been identified and used in modern soybean breeding. Researchers at the University of Georgia, Athens, Georgia, and USDA-ARS, Beltsville, Maryland, have discovered a new source of resistance in soybean. Through an analysis of 329 different soybean accessions, they identified a gene, Glyma.11g230400, that contributed the most pronounced disease resistance, and they developed a molecular marker to track it. The study will facilitate combining new sources of resistance with known available sources to enhance soybean resistance to frogeye leaf spot and accelerate disease-resistance breeding.

2. Methods for predicting optimal cross combinations to accelerate soybean genetic improvement. Parental selection and crossing to combine traits from desirable parents are the initial, essential steps in the breeding pipeline. Most often these crossing decisions are made based on pedigrees, parental genotypes, or specific desirable traits. Traditional population development can take five years or more, so poor cross combinations drain resources for years before they are evaluated. Researchers at the University of Georgia, Athens, Georgia, and USDA-ARS, Beltsville, Maryland, developed genetic marker-based selection models and evaluated their predictive accuracy for soybean seed yield. The study identified the best predictive models under different scenarios. The predictive models can be used by soybean and other crop breeding programs at the earliest stage of the breeding cycle and will increase genetic gain and reduce the number of breeding cycles.

3. A novel protein quantitative trait locus (QTL) from a Glycine soja accession. Soybean is valuable for human and animal nutrition; however, it suffers from very low genetic diversity compared to many crop species, in large part because of its self-pollinating floral biology. This lack of genetic diversity has been made worse by events that occurred during domestication, the introduction of a limited number of soybeans to the United States, and soybean breeding selection. To ensure long-term genetic gain potential for seed yield and nutritional potential, researchers at the University of Missouri, Columbia, Missouri, and USDA-ARS in Beltsville, Maryland, developed wild soybean-derived populations and performed genetic analysis. They identified several novel genomic regions correlated with changes in seed protein and oil content. One genomic region increased seed protein levels (+0.7%) but had no significant impact on seed oil content. This finding is in stark contrast to many other sources of elevated seed protein, which carry significant costs in seed oil and/or seed yield. They validated the finding by further population development, which allowed them to narrow the region so that candidate genes could be identified. They also developed new molecular selection tools to streamline breeding for this new protein gene. The results will allow soybean germplasm/cultivars to be more protein-rich, and the wealth of new genetic information may allow researchers to clone the gene behind this trait in the near future.

4. Discovery of a new mechanism in soybean conferring resistance to cyst nematode. Soybean cyst nematode (SCN) is a soilborne pest and the number one biotic cause of reduced seed yield in soybean. More than $1.5 billion in losses occur annually in the United States because of this pathogen. Identifying and deploying soybean genetic resistance genes is the most efficacious and economical method of limiting the economic cost of SCN infestation of fields. In this study, researchers at the University of Missouri; USDA-ARS, Columbia, Missouri; and USDA-ARS, Beltsville, Maryland, evaluated three populations specifically created to identify and test different genetic resistance genes, using several different SCN virulence races. They found that the SCN resistance loci rhg1-a and Rhg2 in PI 90763 impart strong resistance against multiple SCN populations through an epistatic interaction. They were also able to fine-map the Rhg2 gene to a very small genomic interval (~169 kilobase pairs, containing <22 genes) and identified a strong candidate gene for Rhg2. Their results can be used to accelerate and improve ongoing breeding efforts to diversify SCN resistance in modern resistant soybean cultivars.

Review Publications
Diers, B., Specht, J., Graef, G., Song, Q., Rainey, K.M., Ramasubramanian, V., Liu, X., Myers, C., Stupar, R., An, Y., Beavis, W. 2023. Genetic architecture of protein and oil content in soybean seed and meal. The Plant Genome. 16(1). Article e20308.
Wang, H., Campbell, B., Happ, M., McConaughy, S., Lorenz, A., Amundsen, K., Song, Q., Pantalone, V., Hyten, D. 2022. Development of molecular inversion probes for soybean progeny genomic selection genotyping. The Plant Genome. Article e20270.
McDonald, S., Buck, J., Song, Q., Li, Z. 2023. Genome-wide association study reveals novel loci and candidate gene for resistance to frogeye leaf spot (Cercospora sojina) in soybean. Molecular Genetics and Genomics.
Hu, L., Wang, X., Zhang, J., Florz-Palacios, L., Song, Q., Jiang, G. 2023. Genome-wide detection of quantitative trait loci and prediction of candidate genes for seed sugar composition in early mature soybean. International Journal of Molecular Sciences. 24:3167.
McConaughy, S., Amundsen, K., Quigley, C.V., Pantalone, V., Hyten, D. 2023. Recombination hotspots in soybean (Glycine max (L.) Merr.). G3, Genes/Genomes/Genetics.
Miller, M., Song, Q., Fallen, B.D., Li, Z. 2023. Genomic prediction of optimal cross combinations to accelerate genetic improvement of soybean (Glycine max). Crop Science. 14. Article e1171135.
Ma, G., Song, Q., Li, X., Qi, L. 2022. Genetic insight into disease resistance gene clusters by using sequencing-based fine mapping in sunflower (Helianthus annuus L.). International Journal of Molecular Sciences. 23(17):9516.
Yang, Y., La, T., Gillman, J.D., Lyu, Z., Joshi, T., Usovsky, M., Song, Q., Scaboo, A. 2022. Linkage analysis and residual heterozygotes derived near isogenic lines reveals a novel protein quantitative trait loci from a Glycine soja accession. Frontiers in Plant Science. 13. Article 938100.
Singer, W.M., Shea, Z., Yu, D., Huang, H., Mian, R.M., Rosso, M.L., Song, Q., Zhang, B. 2022. Genome-wide association study and genomic selection for proteinogenic methionine in soybean seeds. Frontiers in Plant Science.
Basnet, P., Meinhardt, C.G., Usovsky, M., Gillman, J.D., Joshi, T., Song, Q., Diers, B., Mitchum, M.G., Scaboo, A. 2022. Epistatic interaction between Rhg1-a and Rhg2 in PI 90763 confers resistance to virulent soybean cyst nematode populations. Theoretical and Applied Genetics. 135:2025-2039.
Ji, W., Yang, T., Song, Q., Ma, M. 2022. Isoflavone composition of germinated soybeans after freeze-thaw. Food Research International. 16. Article 100493.