Location: Sustainable Perennial Crops Laboratory
Title: Seed quality drives grain yield in Ethiopian and Senegalese sorghum: Insights from machine learning
Authors:
Ahn, Ezekiel
Prom, Louis
Jang, Jae Hee
Baek, Insuck
Tukuli, Adama - ORISE Fellow
Lim, Seunghyun - ORISE Fellow
Hong, Seok - Ulsan National Institute of Science and Technology (UNIST)
Lee, Yoonjung - University of Minnesota Crookston
Kim, Moon
Meinhardt, Lyndel
Park, Sunchung
Magill, Clint - Texas A&M University
Submitted to: PLOS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/15/2025
Publication Date: 8/14/2025
Citation: Ahn, E.J., Prom, L.K., Jang, J., Baek, I., Tukuli, A.R., Lim, S., Hong, S.M., Lee, Y., Kim, M.S., Meinhardt, L.W., Park, S., Magill, C. 2025. Seed quality drives grain yield in Ethiopian and Senegalese sorghum: Insights from machine learning. PLOS ONE.
DOI: https://doi.org/10.1371/journal.pone.0329366

Interpretive Summary: Sorghum is a crucial cereal crop, especially in Africa and Asia, providing food and fodder. This study explored the genetic diversity of sorghum varieties from Ethiopia and Senegal, aiming to group these varieties based on their traits and predict their grain yield using machine learning techniques. The study found that machine learning can effectively categorize sorghum varieties based on their features, which could help breeders select suitable varieties for crossing to create improved hybrids. Additionally, the study successfully predicted grain yield using these techniques, identifying seed weight and germination rate as the most important factors for determining yield potential. This information can help breeders develop improved sorghum varieties with better yields, contributing to increased food security and agricultural sustainability in the region.

Technical Abstract: This study evaluated the application of machine learning for clustering and yield prediction in a collection of 179 sorghum accessions from Ethiopia and Senegal. Various machine learning models were employed, including Bagging classifier, DBSCAN, Gaussian Mixture Model, K-Nearest Neighbors, Random Forest, and Support Vector Machines, to analyze phenotypic data comprising nine key traits: grain yield, seed weight, flowering time, germination rate, panicle height and length, and resistance to anthracnose, grain mold, and rust.
Clustering analysis revealed distinct groupings within the accessions, highlighting the presence of inherent sub-types within this germplasm, which could be valuable for breeding programs. The Boosted Tree model exhibited exceptional accuracy in predicting grain yield, emphasizing the importance of seed weight, germination rate, and flowering time as key determinants of yield potential. This research underscores the potential of machine learning in sorghum germplasm characterization and yield optimization for enhanced breeding strategies.
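
As a rough illustration of how a boosted tree model can rank traits by their contribution to a yield prediction, the following is a minimal sketch and not the authors' pipeline. It assumes a toy gradient-boosting regressor built from one-split trees (stumps), written with the Python standard library only. The data are synthetic, the trait names are reused purely as labels, `noise_trait` is a hypothetical uninformative column, and the coefficients that generate `y` are arbitrary assumptions rather than values from the study.

```python
import random


def fit_stump(X, residuals):
    """Return (gain, feature, threshold, left_mean, right_mean) for the best single split."""
    base_mean = sum(residuals) / len(residuals)
    base_sse = sum((r - base_mean) ** 2 for r in residuals)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds sit midway between consecutive distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or base_sse - sse > best[0]:
                best = (base_sse - sse, j, thr, lm, rm)
    return best


def fit_boosted_stumps(X, y, n_rounds=25, learning_rate=0.3):
    """Gradient boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    gains = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, learning_rate * lm, learning_rate * rm))
        gains[j] += gain  # squared-error reduction of the unshrunk split
        pred = [p + learning_rate * (lm if row[j] <= thr else rm)
                for p, row in zip(pred, X)]
    total = sum(gains)
    importances = [g / total for g in gains]  # normalized to sum to 1
    return base, stumps, importances


def predict(model, row):
    base, stumps, _ = model
    return base + sum(lv if row[j] <= thr else rv for j, thr, lv, rv in stumps)


# Synthetic data only: the coefficients below are arbitrary assumptions.
rng = random.Random(0)
names = ["seed_weight", "germination_rate", "noise_trait"]
X = [[rng.random() for _ in names] for _ in range(80)]
y = [3.0 * sw + 1.5 * gr + rng.gauss(0, 0.1) for sw, gr, _ in X]

model = fit_boosted_stumps(X, y)
importances = dict(zip(names, model[2]))

mean_y = sum(y) / len(y)
baseline_mse = sum((yi - mean_y) ** 2 for yi in y) / len(y)
model_mse = sum((yi - predict(model, row)) ** 2 for yi, row in zip(y, X)) / len(y)

print(importances)
print(baseline_mse, model_mse)
```

The importance score here is the share of total squared-error reduction attributed to splits on each trait, which is one common way to read feature importance from tree ensembles. A real analysis would use held-out data and an established library implementation rather than training-set error on a toy model.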
