Reeves III, James
Submitted to: Near Infrared Spectroscopy Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: July 20, 2004
Publication Date: September 10, 2004
Citation: Delwiche, S.R., Reeves III, J.B. 2004. The effect of spectral pretreatments on the PLS modeling of agricultural products. Near Infrared Spectroscopy Journal. 12:177-182.
Interpretive Summary: The analytical technique known as near-infrared (NIR) spectroscopy has been widely used by the agricultural and food industries for the past three decades. Its popularity stems from its speed (e.g., hundreds of samples per laboratory per day), accuracy, and ease of analysis. Because NIR is considered a secondary analytical procedure, the success of an NIR method ultimately depends on its calibration to a primary procedure. One common calibration method is partial least squares regression. Essentially, a regression equation is developed that relates the component of interest (the analyte) to the spectra, after the spectra have first been mathematically compressed from several hundred data values to a number between 2 and 15. Even before this compression step, other mathematical transformations, referred to as spectral pretreatments, are typically performed to enhance model accuracy. This study reports our systematic attempt to explore the relative strengths of the popularly used pretreatments. These pretreatments include ones that correct for differences in particle size within the sample, ones that remove noise from the spectra, and ones that enhance spectral absorption bands through derivatization. Having developed a batch program that runs in a commercial statistics software package (i.e., SAS), we now have the ability to try out hundreds of spectral pretreatments, and have selected ten representative ones for this study. Statistical tests that examine the variance of the residuals (NIR-predicted minus reference) and the consistency of the rank order of the ten pretreatments for two spectral sets (a cross-validation and a test set) were applied to two completely independent sets of data, one consisting of ground wheat (with two analytes) for human food, the other consisting of forages (also with two analytes) for animal feed. Our findings indicate that differences in modeling error arise with the different pretreatments; however, many of these differences are not statistically significant. Scientists, specifically those who develop multivariate statistical analytical methods, and analytical laboratories are the intended beneficiaries of this research.
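To make the idea of a spectral pretreatment concrete, the following Python sketch applies two widely used transformations to synthetic spectra: standard normal variate (SNV) scaling as a scatter correction, and a Savitzky-Golay first derivative as a combined smoothing and derivatization step. This is an illustration only, not the SAS batch program described above; the function names, the 2 nm sampling interval, the window settings, and the synthetic absorption band are all assumptions chosen for the example.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def sg_derivative(spectra, window=11, polyorder=2, deriv=1):
    """Savitzky-Golay smoothing and differentiation along the wavelength axis."""
    return savgol_filter(spectra, window_length=window, polyorder=polyorder,
                         deriv=deriv, axis=1)


# Synthetic example (hypothetical data): one Gaussian absorption band observed
# with a different multiplicative and additive scatter effect in each sample.
rng = np.random.default_rng(0)
wavelengths = np.arange(1100, 2500, 2)            # nm; 2 nm spacing is assumed
band = np.exp(-0.5 * ((wavelengths - 1940) / 40.0) ** 2)
scale = rng.uniform(0.8, 1.2, size=(5, 1))        # multiplicative scatter
offset = rng.uniform(-0.1, 0.1, size=(5, 1))      # additive baseline shift
raw = scale * band + offset

corrected = snv(raw)            # scatter-corrected spectra
first_deriv = sg_derivative(raw)  # smoothed first-derivative spectra
```

In this idealized case SNV removes the per-sample scale and offset entirely, while the derivative removes only the additive offset. Real spectra are less tidy, which is one reason the choice among pretreatments is an empirical question.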
Technical Abstract: Spectral pretreatment, such as scatter correction, smoothing, and derivatization, is considered, almost by folklore, an integral component of the development of near-infrared (NIR) partial least squares (PLS) regression equations. This study was undertaken to examine the importance of pretreatments. Diffuse reflectance NIR (1100-2500 nm) spectra of ground wheat and forages were separately analyzed. For ground wheat, the effect of spectral pretreatment on the PLS equations for protein content and sodium dodecyl sulfate (SDS) sedimentation volume (a protein quality index) was examined. For forages, similar examinations were performed on crude protein content and lignin content. Results indicate that while pretreatment is indeed important, statistical significance, as determined by an F-test of correlated variances, is often not established. Protein content calibrations tended to be enhanced by scatter correction, as opposed to smoothing or derivatization, whereas the SDS sedimentation volume and lignin content calibrations favored the convolution functions (smoothing and derivatization). It is recommended that the selection of the best pretreatment for an analyte be based on a combination of statistical testing and the modeler's judgment.
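Because two pretreatments are evaluated on the same samples, their residuals are correlated, and an ordinary two-sample F-test does not apply. One standard test for equal variances in paired data is the Pitman-Morgan test, sketched below in Python. Whether the study used exactly this formulation is an assumption; the function name and the synthetic residuals are hypothetical and serve only to illustrate the idea.

```python
import numpy as np
from scipy import stats


def pitman_morgan(resid_a, resid_b):
    """Test equality of variances of two correlated (paired) residual sets.

    The covariance of (a + b) and (a - b) equals var(a) - var(b), so the
    variances are equal exactly when the sum and difference are uncorrelated.
    Returns the t statistic (n - 2 degrees of freedom) and two-sided p-value.
    """
    a = np.asarray(resid_a, dtype=float)
    b = np.asarray(resid_b, dtype=float)
    n = a.size
    r = np.corrcoef(a + b, a - b)[0, 1]
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(abs(t), df=n - 2)
    return t, p


# Hypothetical residuals (NIR-predicted minus reference) from two pretreatments
# applied to the same 200 samples; model B is given a clearly larger error.
rng = np.random.default_rng(1)
shared = rng.normal(size=200)                      # error common to both models
resid_a = shared + 0.1 * rng.normal(size=200)
resid_b = 3.0 * shared + 0.1 * rng.normal(size=200)

t_stat, p_value = pitman_morgan(resid_a, resid_b)
```

A negative statistic here means the first residual set has the smaller variance. A large p-value would mean the apparent difference in modeling error between the two pretreatments cannot be distinguished from chance.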