Location: Soybean Genomics & Improvement Laboratory
Title: Long-read sequencing reveals novel structural variation markers for key agronomic and quality traits of soybeans
Author
WANG, ZHIBO - Virginia Polytechnic Institution & State University
BELAY, KASSAYE - Virginia Polytechnic Institution & State University
PATERSON, JOE - Virginia Polytechnic Institution & State University
BEWICK, PATRICK - Virginia Polytechnic Institution & State University
SONGER, WILLIAM - Virginia Polytechnic Institution & State University
Song, Qijian
ZHANG, BO - Virginia Polytechnic Institution & State University
LI, SONG - Virginia Polytechnic Institution & State University
Submitted to: Frontiers in Plant Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/17/2025
Publication Date: 4/8/2025
Citation: Wang, Z., Belay, K., Paterson, J., Bewick, P., Songer, W., Song, Q., Zhang, B., Li, S. 2025. Long-read sequencing reveals novel structural variation markers for key agronomic and quality traits of soybeans. Frontiers in Plant Science. 16. Article e1557748. https://doi.org/10.3389/fpls.2025.1557748.
DOI: https://doi.org/10.3389/fpls.2025.1557748

Interpretive Summary: Decades of research have shown that structural variation (SV), including deletions, insertions, duplications and chromosomal rearrangements, is an important element in plant evolution, affecting traits such as branch structure, flowering time, seed size and stress resistance. Third-generation long-read sequencing technology is revolutionizing plant genomics, providing unprecedented opportunities to identify SVs that short-read sequencing cannot reliably capture. Although a number of soybeans have been sequenced, significant gaps remain in the available soybean whole genome sequences. Most of the resequenced genotypes came from Chinese breeding programs and are not available in the United States, and efforts have focused primarily on soybean genotypes grown for animal feed rather than on those intended for human consumption as natto, edamame, bean sprouts, tofu and soy milk. Furthermore, almost all previously reported SVs were identified using short DNA sequence reads, which may be less reliable for identifying SVs. We resequenced 29 soybean varieties used for food consumption using nanopore long-read sequencing technology, identified SVs in these varieties from the long-read sequences, experimentally verified the association of SVs with soybean production and food quality traits, and deposited the sequences and SVs in the public domain. This study not only adds valuable resources for marker development, but also aids in understanding the underlying mechanisms controlling soybean traits and in conducting other basic and applied genetic research.

Technical Abstract: In plant genomic research, long-read sequencing has been widely used to detect structural variations that are not captured by short-read sequencing. In this letter, we describe an analysis of whole genome re-sequencing of 29 soybean varieties using nanopore long-read sequencing. The compiled germplasm reflects diverse applications, including livestock feeding, soy milk and tofu production, as well as consumption of natto, sprouts, and vegetable soybeans (edamame). We identified 365,497 structural variations in these newly re-sequenced genomes and found that the newly identified structural variations are associated with important agronomic traits. These traits include seed weight, flowering time, plant height, oleic acid content, methionine content, and trypsin inhibitor content, all of which significantly impact soybean production and quality. Experimental validation supports the roles of the predicted candidate genes and structural variants in these biological processes. Our research provides a new source for rapid marker discovery in crop genomes using structural variation and whole genome sequencing.