
Research Project: Enhancing Cropping System and Grassland Sustainability in the Texas Gulf Coast Region by Managing Systems for Productivity and Resilience

Location: Grassland Soil and Water Research Laboratory

Title: Phenotyping cotton leaf chlorophyll via in situ hyperspectral reflectance sensing, spectral vegetation indices, and machine learning

Authors: Thorp, Kelly; Thompson, Alison; Herritt, Matthew

Submitted to: Frontiers in Plant Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/4/2024
Publication Date: 11/21/2024
Citation: Thorp, K.R., Thompson, A.L., Herritt, M.T. 2024. Phenotyping cotton leaf chlorophyll via in situ hyperspectral reflectance sensing, spectral vegetation indices, and machine learning. Frontiers in Plant Science. https://doi.org/10.3389/fpls.2024.1495593.
DOI: https://doi.org/10.3389/fpls.2024.1495593

Interpretive Summary: A pressing objective in plant science is to understand the connection between a plant's observable characteristics (phenotype) and its genetic makeup (genotype). Great advances in DNA sequencing have unlocked the genetic code for many important commodity crops. However, understanding how genes control complex traits, such as drought tolerance, time to flowering, and harvestable yield, remains challenging. Information technologies, including sensing and computing tools, are now being used to rapidly characterize the growth responses of genetically diverse plant populations in the field and relate these responses to individual genes. In this study, light reflectance from cotton leaves was measured using spectral sensing equipment before the leaves were sampled for chlorophyll content. Advanced computational methods were then used to relate the reflectance information to leaf chlorophyll content. The study identified successful data analysis strategies and provided recommendations for using sensor-based leaf reflectance data in plant breeding programs and for plant genetic analyses. In the coming decades, field-based plant phenomics will likely revolutionize our understanding of how plant genetics interacts with the environment to produce food, feed, fiber, and fuel resources. The results of this study will be used internationally by plant scientists in both the public and private sectors to advance current field-based plant phenomics efforts.

Technical Abstract: Cotton (Gossypium hirsutum L.) leaf chlorophyll (Chl) has been targeted as a phenotype for breeding selection to improve cotton tolerance to environmental stress. However, high-throughput phenotyping methods based on hyperspectral reflectance sensing are needed to rapidly screen cultivars for Chl in the field. The objectives of this study were to deploy a cart-based field spectroradiometer to measure cotton leaf reflectance in two field experiments over four growing seasons (2019–2022) at Maricopa, Arizona, and to evaluate 148 spectral vegetation indices and 14 machine learning methods for estimating leaf chlorophyll from the reflectance data. Leaf tissue was sampled concurrently with reflectance measurements, and analytical processing in the laboratory provided data to compute leaf Chl a, Chl b, and Chl a+b as both area-basis (µg cm⁻²) and mass-basis (mg g⁻¹) measurements. The 148 spectral vegetation indices were evaluated both individually and collectively as a set of input features for machine learning models. The leaf reflectance data, in addition to several other data transformations involving spectral derivatives and log-inverse reflectance, were also evaluated as input features. Data sets for model training and testing were based on two strategies: 1) training based on 2019–2020 data and testing based on 2021–2022 data and 2) using a random split of data from all four years. Machine learning models trained with 2019–2020 data performed poorly in tests with the 2021–2022 data (e.g., RMSE=23.7% and r²=0.46 for area-basis Chl a+b), indicating difficulty transferring models from one experiment to another. Model performance was more satisfactory when training and testing data sets were based on a random split of all data (e.g., RMSE=10.5% and r²=0.88 for area-basis Chl a+b), but performance outside the conditions of the present study cannot be guaranteed.
The performance of spectral vegetation indices was intermediate between these two cases (e.g., RMSE=16.2% and r²=0.69 for area-basis Chl a+b), and the indices provided more consistent error metrics among the evaluated data sets than the machine learning models did. Generally, spectral indices and machine learning models both estimated Chl a and Chl a+b with less error than Chl b. Ensemble machine learning methods that combined estimates from several base estimators (e.g., random forest, gradient boosting, and AdaBoost regressors) and a multi-layer perceptron neural network method were among the top-performing models (p < 0.05). Also, input features based on spectral derivatives or spectral indices often performed better than inputting reflectance data directly. Assessments of feature importance demonstrated that spectral reflectance data and spectral vegetation indices involving red edge radiation were the most important inputs to random forest models for estimation of cotton leaf Chl. Because subjective modeling decisions can impact machine learning performance, spectral vegetation indices should not be overlooked as a practical plant trait estimation tool for high-throughput phenotyping, whereas machine learning offers great opportunity for data mining to develop newer and more robust indices.
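
The data transformations, splitting strategies, and error metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. All function names are hypothetical, and several details are assumptions because the abstract does not state them: RMSE is normalized by the mean observed value, r² is computed as the squared Pearson correlation, the random split holds out 25% of the records, and the two-band normalized difference stands in for just one common form among the 148 indices.

```python
import math
import random


def log_inverse(reflectance):
    """Log-inverse reflectance, log10(1/R), per band (R must be > 0)."""
    return [math.log10(1.0 / r) for r in reflectance]


def first_derivative(wavelengths, reflectance):
    """Finite-difference first derivative dR/d(wavelength) between adjacent bands."""
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def normalized_difference(r_a, r_b):
    """Generic two-band normalized difference index, (Ra - Rb) / (Ra + Rb)."""
    return (r_a - r_b) / (r_a + r_b)


def rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (normalization is assumed)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return 100.0 * math.sqrt(mse) / (sum(observed) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values (assumed)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def split_by_year(records, train_years, test_years):
    """Strategy 1: train and test on disjoint sets of years.

    Each record is assumed to be a dict with a "year" key (hypothetical layout).
    """
    train = [r for r in records if r["year"] in train_years]
    test = [r for r in records if r["year"] in test_years]
    return train, test


def split_random(records, test_fraction=0.25, seed=0):
    """Strategy 2: random split pooled over all years (test fraction is assumed)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Under these assumptions, calling split_by_year(records, {2019, 2020}, {2021, 2022}) mirrors the first strategy, in which the test years are never seen during training, while split_random mixes all four years into both sets. That mixing is one plausible reason the random split reports lower error, which is consistent with the abstract's caution about performance outside the conditions of the study.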