
United States Department of Agriculture

Agricultural Research Service


Location: Corn Insects and Crop Genetics Research

Title: Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.)

Schmutz, Jeremy
Cannon, Steven
Schlueter, Jessica
Ma, Jianxin
Hyten, David
Song, Qijian
Mitros, Therese
Nelson, William
May, Gregory
Gill, Navdeep
Peto, Myron
Shu, Shengqiang
Goodstein, David
Thelen, Jay
Cheng, Jianlin
Sakurai, Tetsuya
Umezawa, Taishi
Shinozaki, Kazuo
Du, Jianchang
Bhattacharyya, Madan
Sandhu, Devinder
Grant, David
Joshi, Trupti
Libault, Marc
Zhang, Xuecheng
Nguyen, Henry
Valliyodan, Babu
Xu, Dong
Futrell-Griggs, Montona
Abernathy, Brian
Hellsten, Uffe
Barry, Kerrie
Grimwood, Jane
Yu, Yeisoo
Wing, Rod
Cregan, Perry
Stacey, Gary
Specht, James
Rokhsar, Dan
Shoemaker, Randy
Jackson, Scott

Submitted to: Nature
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/12/2009
Publication Date: 1/14/2010
Citation: Schmutz, J., Cannon, S.B., Schlueter, J., Ma, J., Hyten, D.L., Song, Q., Mitros, T., Nelson, W., May, G.D., Gill, N., Peto, M.F., Shu, S., Goodstein, D., Thelen, J.J., Cheng, J., Sakurai, T., Umezawa, T., Shinozaki, K., Du, J., Bhattacharyya, M., Sandhu, D., Grant, D.M., Joshi, T., Libault, M., Zhang, X., Nguyen, H., Valliyodan, B., Xu, D., Futrell-Griggs, M., Abernathy, B., Hellsten, U., Barry, K., Grimwood, J., Yu, Y., Wing, R.A., Cregan, P.B., Stacey, G., Specht, J., Rokhsar, D., Shoemaker, R.C., Jackson, S. 2010. Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.). Nature. 463:178-183.

Interpretive Summary: Since the beginning of agriculture, plant breeders have had to predict and shape hundreds of plant characteristics without a basic understanding of the molecular causes of those traits. Traditionally, breeders have crossed two good plants and looked for better ones among the progeny, hoping not to lose other beneficial traits in the process. In this paper the authors report the complete genome sequence of soybean: essentially every one of the approximately one billion DNA letters that make up the soybean genetic code. The authors also report predictions of the location and sequence of every gene in the genome. The genes are the instructions each cell uses to build the whole plant and to respond to the environment. These results give plant breeders and researchers the basic blueprint for the soybean genome. This will eventually make it possible to understand any soybean trait and to improve many characteristics. A great deal of additional work is required to make use of the genome blueprint, but in the year since the sequence was released to the public (in late 2008), it has already been used to identify genes responsible for digestibility in soybean and common bean, for phytate production in soybean seeds (which currently results in environmentally damaging phosphate runoff from swine and poultry waste), and for plant resistance to the devastating soybean disease Asian Soybean Rust. The genome sequence is also an important resource for understanding the extent of plant diversity. The ability to speed breeding efforts while maintaining diversity benefits both growers and consumers through new varieties that are higher-yielding, more nutritious, and more resistant to stress and disease.

Technical Abstract: We report the genome sequence for soybean (Glycine max var. Williams 82), one of the most important crop plants worldwide because of its ability to produce both protein and oil. Soybean is a recently domesticated legume that plays a vital role in crop rotation because it fixes atmospheric nitrogen via symbioses with soil-borne microorganisms. The 1,115 Mbp genome was sequenced by a whole-genome shotgun approach and integrated with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predicted 46,430 protein-coding genes in soybean, 70% more genes than in the model plant Arabidopsis and similar in total number to the 45,555 genes in the genome of the tree Populus trichocarpa, which, like soybean, is a paleopolyploid. Of the predicted genes, 21.6% are in repeat-poor, highly recombinogenic portions of the chromosomes. The soybean lineage has undergone two large-scale duplication events (polyploidies), which have left nearly 75% of the genes present in multiple copies. This is evident in many gene families. Soybean has nearly twice as many genes involved in acyl lipid metabolism as Arabidopsis thaliana (1,127 vs. 614) and more than twice as many transcription factors (5,671 vs. 2,315). The two duplication events occurred ~59 and ~13 Mya, producing massive genetic redundancy, followed by gene diversification and loss, as well as numerous chromosome rearrangements. The release of an accurate soybean genomic sequence will allow rapid identification of the underlying genetic basis of many soybean traits and will speed the creation of improved new soybean varieties.
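The gene-count comparisons in the abstract can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the published analysis; the function names `fold_ratio` and `implied_baseline` are hypothetical, and the Arabidopsis total it derives is only what the "70% more" statement implies, not a figure reported on this page.

```python
# Counts quoted in the Technical Abstract.
SOYBEAN_GENES = 46_430
POPLAR_GENES = 45_555
ACYL_LIPID = (1_127, 614)                # (soybean, Arabidopsis)
TRANSCRIPTION_FACTORS = (5_671, 2_315)   # (soybean, Arabidopsis)


def fold_ratio(soybean: int, other: int) -> float:
    """Return how many times larger the soybean count is than the other count."""
    return soybean / other


def implied_baseline(count: int, percent_more: float) -> float:
    """Return the baseline implied by 'count is percent_more % more than baseline'."""
    return count / (1 + percent_more / 100)


if __name__ == "__main__":
    print(f"acyl lipid genes: {fold_ratio(*ACYL_LIPID):.2f}x")
    print(f"transcription factors: {fold_ratio(*TRANSCRIPTION_FACTORS):.2f}x")
    print(f"soybean vs. poplar: {fold_ratio(SOYBEAN_GENES, POPLAR_GENES):.2f}x")
    # Derived value only: the abstract does not state the Arabidopsis total.
    print(f"implied Arabidopsis genes: {implied_baseline(SOYBEAN_GENES, 70):.0f}")
```

Under these inputs the acyl lipid ratio comes out at about 1.8 and the transcription factor ratio at about 2.4, so the second comparison is somewhat more than a doubling.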

Last Modified: 10/19/2017