Reeves III, James
Submitted to: Near Infrared Spectroscopy Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: December 17, 2007
Publication Date: January 31, 2008
Citation: Reeves III, J.B., Delwiche, S.R. 2008. SAS partial least squares (PLS) for discriminant analysis. Near Infrared Spectroscopy Journal. 16:31-38.
Interpretive Summary: Spectroscopy uses light to determine the composition of materials or to classify them. Using the information in spectra almost always requires mathematical processing. One commonly used method is partial least squares (PLS). This work reports on a program written in the statistical package SAS that implements PLS for classifying or discriminating materials based on their spectral properties. It builds on previous efforts that implemented data pre-treatments for spectral analysis, including scatter correction, derivatives, mean centering, and variance scaling. These pre-treatments remove interferences caused by differences in the physical properties of samples, such as particle size, and accentuate differences in the spectra. The program was tested on forage and grain samples, and it can test multiple spectral pre-treatments in one step and summarize all results.
Technical Abstract: The objective of this work was to implement discriminant analysis of spectral data using SAS partial least squares (PLS) regression. This was done in combination with previous efforts that implemented data pre-treatments for spectral analysis, including scatter correction, derivatives, mean centering, and variance scaling. SAS implements PLS as type 2, in which a solution for multiple analytes (Y-variables) is determined simultaneously, but it cannot work with non-numeric analyte values. For discriminant analysis, each sample belonging to one of Z classes is therefore coded with Z analytes: the analyte for the class to which the sample belongs is coded as 1 and all others as 0. Thus, for four classes, every sample is coded with one of four analyte combinations (1,0,0,0; 0,1,0,0; 0,0,1,0; or 0,0,0,1). This paper discusses a SAS program designed to perform classification/discrimination using SAS PLS, principal components, or reduced rank regression, together with previously written SAS macros for pre-treatment of spectral data. Examples are presented using two datasets: (A) forages and by-products and (B) grains. The program allows multiple spectral pre-treatments to be tested in one step, with a summary of all results.
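The class coding described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than SAS and is not the authors' program. The class names, the sample labels, and the predicted values are hypothetical, and the decision rule in `decode_predictions` (assign each sample to the class with the largest predicted value) is an assumption, since the abstract does not state how predictions are turned back into classes.

```python
def encode_classes(labels, classes):
    """Code each sample with Z analytes: 1 for its own class, 0 for the rest."""
    index = {c: i for i, c in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        rows.append(row)
    return rows


def decode_predictions(predicted, classes):
    """Assumed rule: pick the class whose predicted analyte value is largest."""
    return [
        classes[max(range(len(classes)), key=lambda j: row[j])]
        for row in predicted
    ]


# Hypothetical class names and sample labels, for illustration only.
classes = ["A", "B", "C", "D"]
labels = ["B", "D", "A"]

# Numeric Y-variables that a multi-response regression can work with.
Y = encode_classes(labels, classes)
print(Y)

# Made-up regression outputs (not results from the paper), one row per sample.
predicted = [
    [0.10, 0.80, 0.05, 0.05],
    [0.20, 0.10, 0.30, 0.40],
    [0.60, 0.30, 0.10, 0.00],
]
print(decode_predictions(predicted, classes))
```

With four classes, each row of `Y` is one of the four combinations listed in the abstract, so non-numeric class labels never reach the regression step.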