Westerhaus, Mark - PENN STATE UNIVERSITY
Barton II, Franklin
Submitted to: Eastern Analytical Symposium
Publication Type: Abstract Only
Publication Acceptance Date: July 8, 1996
Publication Date: N/A
Technical Abstract: Near infrared spectroscopy relies on the collection of an appropriate set of samples for calibration. The purpose of this study was to evaluate the selection of calibration samples from scores derived from principal component analysis (PCA) and partial least squares (PLS1) regression. A diverse set of cereal products and grains (N=90), together with the same samples stored under different relative humidity environments (N=143), was used to test these approaches, giving 233 samples in total. Total dietary fiber (TDF) ranged from 0.66 to 52.14%. Calibration samples were selected by: 1) PCA regression of spectra using 2 to 20 PCA factors with N fixed at 75; and 2) PLS regression of spectra and TDF using 2 to 20 factors with N fixed at 75. Calibrations for TDF were developed from the selected samples using PLS regression. Calibrations were validated using two independent sets of cereal samples with the same range in TDF (1.16-43.66%) but different residual moisture contents (set one 3.45-12.98%, N=29; set two 0.52-15.58%, N=29). Model performance was reported as the standard error of performance (SEP). Samples were selected by designating one sample to represent all the samples in its neighborhood. The size of a neighborhood is set by the parameter h (the neighborhood h), which depends on the number of factors. The neighborhood h required to select 75 samples (out of 233) increased linearly with increasing number of PCA and PLS1 factors. With a neighborhood h that resulted in the selection of 75 samples, the optimal number of PCA or PLS1 factors (based on the SEP of the validation sets) ranged from 14 to 18. Model accuracy (SEP) did not differ between calibration samples selected from PCA scores and those selected from PLS1 scores. However, the number of factors used to define the neighborhood and select the calibration samples had a substantial impact on model accuracy.
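The neighborhood-based selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric, the tie-breaking rule, or how h was tuned, so everything below is an assumption: Euclidean distance on the score vectors (standardizing the scores first would give a Mahalanobis-type distance), a greedy rule that keeps the sample with the most neighbors, and a bisection search for the h that yields a target number of samples. The function names `select_by_neighborhood` and `find_h_for_target` are hypothetical and are not taken from the study or from any software package.

```python
import math


def distance(a, b):
    """Euclidean distance between two score vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_by_neighborhood(scores, h):
    """Greedy selection: one representative per neighborhood of size h.

    Repeatedly keeps the remaining sample with the most remaining neighbors
    within distance h, then drops those neighbors. Returns kept indices.
    """
    n = len(scores)
    neighbors = [
        {j for j in range(n) if j != i and distance(scores[i], scores[j]) <= h}
        for i in range(n)
    ]
    remaining = set(range(n))
    selected = []
    while remaining:
        # Ties go to the lowest index so the result is deterministic.
        best = max(sorted(remaining), key=lambda i: len(neighbors[i] & remaining))
        selected.append(best)
        remaining -= neighbors[best] | {best}
    return selected


def find_h_for_target(scores, target, iterations=40):
    """Bisect on h until the selection keeps about `target` samples.

    The count is not guaranteed to be monotone in h for a greedy rule,
    so the closest selection seen during the search is returned.
    """
    lo, hi = 0.0, max(distance(a, b) for a in scores for b in scores)
    best_h, best = lo, select_by_neighborhood(scores, lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        chosen = select_by_neighborhood(scores, mid)
        if abs(len(chosen) - target) < abs(len(best) - target):
            best_h, best = mid, chosen
        if len(chosen) > target:
            lo = mid  # too many samples kept: enlarge the neighborhoods
        elif len(chosen) < target:
            hi = mid  # too few samples kept: shrink the neighborhoods
        else:
            break
    return best_h, best
```

In this sketch, `scores` would be the PCA or PLS1 scores of the 233 spectra, truncated to the chosen number of factors, and `find_h_for_target(scores, 75)` would return an h and the indices of roughly 75 calibration samples. Because distances between samples generally grow as more factors are added, a larger h is needed to keep the count fixed, which is consistent with the trend reported in the abstract.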