Research Project: Exploiting Genetic Diversity through Genomics, Plant Physiology, and Plant Breeding to Increase Competitiveness of U.S. Soybeans in Global Markets

Location: Soybean and Nitrogen Fixation Research

Title: Genome-wide association study of seed protein, oil, and amino acid contents in soybean from maturity groups I to IV

Author
LEE, SUNGWOO - North Carolina State University
VAN, KYUJUNG - The Ohio State University
SUNG, MIKSUNG - North Carolina State University
MCHALE, LEAH - The Ohio State University
Nelson, Randall
La Mantia, Jonathan
Mian, Rouf

Submitted to: Journal of Theoretical and Applied Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/5/2019
Publication Date: 2/26/2019
Citation: Lee, S., Van, K., Sung, M., McHale, L., Nelson, R.L., La Mantia, J.M., Mian, R.M. 2019. Genome-wide association study of seed protein, oil, and amino acid contents in soybean from maturity groups I to IV. Journal of Theoretical and Applied Genetics. 132(6):1639–1659. https://doi.org/10.1007/s00122-019-03304-5
DOI: https://doi.org/10.1007/s00122-019-03304-5

Interpretive Summary: Soybean [Glycine max (L.) Merr.] meal is the major source of protein in poultry and animal feed worldwide. Soybean seed protein content is negatively correlated with both seed yield and oil content, and the protein content of U.S. soybean has been declining as seed yield has increased. Identifying the loci and genes that control protein and oil is therefore important for addressing the negative correlations among protein, oil, and seed yield. We conducted a genome-wide association study (GWAS) using phenotypic data collected from five environments for 621 soybean accessions in maturity groups I to IV and 34,014 single nucleotide polymorphism (SNP) markers to identify loci for protein, oil, and several essential amino acids. We identified three and five genomic regions significantly associated with seed protein and oil contents, respectively. In addition, one, three, one, and four genomic regions were identified for the key essential amino acids cysteine, methionine, lysine, and threonine, respectively. These amino acids are important in poultry and animal nutrition. As in previous studies, loci on chromosomes (Chrs) 15 and 20 were associated with seed protein and oil contents. The most significant loci for both traits were on Chr 20, and the loci on Chrs 15 and 20 each showed a negative relationship between the two traits. Applying a multi-trait mixed model identified common effect loci on Chr 5, which increased oil with no effect on protein, and on Chr 10, which increased protein with little effect on oil. These two loci will be useful in minimizing the negative correlation between protein and oil. The distribution of SNP haplotypes carrying positive alleles for protein and oil across geographic regions of the world will also be useful in breeding for these traits.

Technical Abstract: Soybean [Glycine max (L.) Merr.] protein and oil are used worldwide in feed, food, and industrial raw materials. Thus, increasing both seed protein and oil content is important. Yet protein content is negatively correlated with oil content and with seed yield. We conducted a genome-wide association study (GWAS) using phenotypic data collected from five environments for 621 accessions in maturity groups I to IV and 34,014 single nucleotide polymorphism (SNP) markers to identify QTL for protein, oil, and several essential amino acids. We identified three and five genomic regions significantly associated with seed protein and oil contents, respectively. In addition, one, three, one, and four genomic regions were identified for the amino acids cysteine, methionine, lysine, and threonine, respectively. As in previous studies, QTL on chromosomes (Chrs) 15 and 20 were associated with seed protein and oil contents. The most significant QTL for both protein and oil were on Chr 20, and the QTL on both chromosomes showed a negative relationship between the two traits. Applying a multi-trait mixed model identified common effect loci on Chr 5, which increased oil with no effect on protein, and on Chr 10, which increased protein with little effect on oil. The frequencies of the positive-effect haplotypes in linkage disequilibrium with the significantly associated SNPs at the loci on Chrs 5, 10, 15, and 20 varied across maturity groups and geographic regions. Their distributions provide guidance on which alleles have the potential to contribute to soybean improvement for specific markets and growing regions.