Research Project: Genetic Improvement of Sorghum Traits that Advance Agricultural Productivity and Climate Resilience

Location: Plant Stress and Germplasm Development Research

Title: Sorghum-grain-count: A large image dataset and benchmark for sorghum grain count estimation

Author
BAKSHI, ALIVA - Kansas State University
PRAMANIK, SWARAH - Kansas State University
CARAGEA, DOINA - Kansas State University
SOMAYANDA, IMPA - Texas Tech University
KUMAR, RITESH - Cornell University
BONGARI, MAYANK - Texas Tech University
Bean, Scott
Emendack, Yves
Hayes, Chad
JAGADISH, KRISHNA - Texas Tech University

Submitted to: Smart Agricultural Technology
Publication Type: Proceedings
Publication Acceptance Date: 7/19/2025
Publication Date: 7/25/2025
Citation: Bakshi, A., Pramanik, S., Caragea, D., Somayanda, I., Kumar, R., Bongari, M., Bean, S.R., Emendack, Y., Hayes, C.M., Jagadish, K. 2025. Sorghum-grain-count: A large image dataset and benchmark for sorghum grain count estimation. Smart Agricultural Technology. 12. https://doi.org/10.1016/j.atech.2025.101218.
DOI: https://doi.org/10.1016/j.atech.2025.101218

Interpretive Summary: Sorghum holds immense significance for food, feed, and energy production, serving as a vital grain crop and a source of biofuel. Unraveling the relationship between sorghum genotypes and their phenotypic expression could greatly affect both the crop's utilization and energy systems. Conventional phenotyping methods, which rely on manual procedures, are time consuming, produce unreliable results, and have not been effective at quantifying phenotypic traits in large-scale field trials that may include hundreds of genotypes. Through collaborative work, scientists from ARS, Texas Tech University, Cornell University, and Kansas State University have developed a computer vision and deep-learning approach for image-based high-throughput phenotyping. The approach captures traits and genetic variation from a large number of sorghum genotypes and can be used to train models for yield prediction. This benchmark dataset has the potential to advance research on deep learning for yield estimation and ultimately enable hybrid breeding in sorghum and other crops.

Technical Abstract: The crop science community has made significant progress in increasing global food production through advances in genetics and management. In particular, trait-based ideotype breeding has been shown to enhance and sustain the yield potential of various crops under current and future climates. To be successful, trait-based ideotype breeding requires extensive phenotyping of plants in large-scale field trials that may include hundreds of genotypes. Computer vision and deep learning approaches have been used for image-based high-throughput phenotyping. However, existing approaches generally require significant amounts of labeled training data and model fine-tuning, both of which are expensive and time consuming. Modern large foundation models trained in a self-supervised manner on massive general-purpose datasets do not perform well on specific plant phenotyping tasks in zero-shot or few-shot scenarios, most probably because their training data do not include images relevant to plant phenotyping. Large sets of images that capture traits and genetic variation from a large number of genotypes in various crops are needed to train crop-specific models that can be easily adapted and transferred to other crops. Towards this goal, we introduce a large Sorghum-Seed-Yield benchmark dataset consisting of 5,520 panicle images and 13,800 threshed seed images from 345 genotypes of sorghum, a crop that holds immense significance for both food and energy production. Using a relatively small number of images in which the seeds are manually annotated with bounding box and/or point annotations, we also train baseline models for small object detection and counting, as well as density-estimation models for yield prediction. Our benchmark dataset and baseline models have the potential to advance research on deep learning for yield estimation and ultimately enable hybrid breeding in sorghum and other crops.
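
The abstract mentions point annotations and density-estimation models for counting. As a rough illustration of how the two generally relate (not the authors' implementation), the sketch below turns point annotations into a density map whose pixel values sum to the number of annotated seeds, which is the kind of target a density-estimation model is usually trained to predict. The function names, the kernel width `sigma`, the window `radius`, and the example coordinates are all assumptions made for this example; points are assumed to be integer (row, col) positions inside the image.

```python
import math


def density_map(points, height, width, sigma=2.0, radius=6):
    """Build a density map from (row, col) point annotations.

    Each point contributes a Gaussian bump that is normalized to sum to 1
    over the pixels falling inside the image, so the whole map sums to the
    number of annotated points. Assumes every point lies inside the image.
    """
    grid = [[0.0] * width for _ in range(height)]
    for r0, c0 in points:
        weights = {}
        # Evaluate the Gaussian only in a small window clipped to the image.
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(width, c0 + radius + 1)):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        for (r, c), w in weights.items():
            grid[r][c] += w / total
    return grid


def estimated_count(grid):
    """Recover the object count by summing all pixels of the density map."""
    return sum(sum(row) for row in grid)


# Hypothetical annotations: four seeds, two of them at image corners.
points = [(5, 5), (10, 20), (0, 0), (31, 47)]
dmap = density_map(points, height=32, width=48)
print(estimated_count(dmap))
```

Normalizing each bump over its in-bounds pixels keeps the total equal to the annotation count (up to floating-point rounding) even for seeds at the image edge; a trained model would predict such a map from the image, and its sum would serve as the count estimate.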