WANG, G. - Bridgestone Americas Tire Operations
BADARUDDIN, M. - University Of Arizona
Submitted to: Computers and Electronics in Agriculture
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/23/2017
Publication Date: 3/3/2017
Citation: Thorp, K.R., Wang, G., Bronson, K.F., Badaruddin, M., Mon, J. 2017. Hyperspectral data mining to identify relevant canopy spectral features for estimating durum wheat growth, nitrogen status, and yield. Computers and Electronics in Agriculture. 136:1-12.
Interpretive Summary: Modern spectral sensors can measure crop canopy reflectance in hundreds or thousands of narrow spectral wavebands. Such data are potentially useful for guiding crop management decisions, with the goal of conserving natural resources while optimizing crop productivity. A primary challenge is how to mine the spectral data for relevant relationships with agricultural crop characteristics. In this study, several computational strategies were investigated to compare how well several spectral data sets estimate durum wheat canopy characteristics. In particular, a genetic algorithm was developed and tested to extract features from spectral data sets that were highly relevant for estimating characteristics of durum wheat canopies. Using this procedure, improved models to estimate durum wheat leaf area index, biomass, and plant nitrogen content were developed. In addition, the procedure permitted improved estimation of durum wheat yield and grain nitrogen concentration from in-season canopy spectral data. A unique advantage of the procedure is that it permitted identification of the spectral regions that were most useful for estimating certain crop characteristics. The results are useful to researchers and industries focused on the development of proximal sensing tools for detection of agricultural crop characteristics.
Technical Abstract: Modern hyperspectral sensors permit reflectance measurements of crop canopies in hundreds of narrow spectral wavebands. While these sensors describe plant canopy reflectance in greater detail than multispectral sensors, they also suffer from issues with data redundancy and spectral autocorrelation. Data mining techniques that extract relevant spectral features from narrow-band reflectance and spectral derivative data will aid the development of novel sensors for plant trait estimation. The objectives of this research were to 1) compare the ability of broad-band reflectance, narrow-band reflectance, and spectral derivatives to estimate durum wheat traits in the field and 2) develop a genetic algorithm to identify the most relevant spectral features for durum wheat trait estimation. Experiments at Maricopa, Arizona during the winters of 2010-2011 and 2011-2012 tested six durum wheat cultivars with six split-applied nitrogen (N) fertilization rates. Destructive biomass samples were collected four times in each growing season and were used to measure leaf area index, canopy dry weight, and plant N content. Canopy spectral reflectance data in 701 narrow wavebands from 350 nm to 1050 nm were collected weekly over each treatment plot using a field spectroradiometer. First- and second-order spectral derivatives were calculated using Savitzky-Golay filtering. The narrow-band data were also used to estimate reflectance in broad wavebands, as typically collected by two commercial multispectral instruments having either four or eight channels. Partial least squares regression (PLSR) was used to compare the ability of each spectral data set to estimate each durum wheat canopy trait. A genetic algorithm was developed to mine narrow-band canopy reflectance and spectral derivative data for relevant spectral features to estimate durum wheat traits.
The genetic algorithm subsampled the narrow-band reflectance data and first- and second-order derivative spectra, identified up to 25 spectral features, each computed as the mean of spectral data over a range of wavelengths, assessed via PLSR how well those features estimated plant traits, and iterated the process to identify the optimum set of spectral features. Using PLSR, multispectral data in 4 broad bands estimated leaf area index, canopy dry weight, and plant N content with root mean squared error of cross validation (RMSECV) of 40.5%, 43.4%, and 44.1%, respectively, while hyperspectral data in 701 narrow bands reduced RMSECV to 34.2%, 24.8%, and 28.4%, respectively. Using a genetic algorithm to identify fewer than 25 relevant spectral features further reduced RMSECV to 26.1%, 20.3%, and 21.2%, respectively. Durum wheat traits were best estimated when using a genetic algorithm to mine hyperspectral reflectance and spectral derivative data for identification of the most relevant spectral features for trait estimation.