Submitted to: Foods
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/15/2022
Publication Date: 11/18/2022
Citation: Daba, S.D., Honigs, D., McGee, R.J., Kiszonas, A. 2022. Prediction of protein concentration in pea (Pisum sativum L.) using near-infrared spectroscopy (NIRS) systems. Foods. 11(22). Article 3701. https://doi.org/10.3390/foods11223701.
Interpretive Summary: There is currently great interest in utilizing plant-based protein in food products destined for human consumption. Peas are a common source of plant-based protein, but the typical protein concentration is only 17-20%. Therefore, breeding for increased protein concentration has become a top priority. Breeding programs need fast, non-destructive methods to measure protein concentration in seeds, which near-infrared spectroscopy (NIRS) can provide. This paper reports and compares the methods and models used to calibrate two NIRS systems (DA7250 and FT9700) specifically for pea protein concentration. Although the FT9700 system marginally outperformed the DA7250 system, either could be used to predict protein concentration rapidly and non-destructively.
Technical Abstract: Breeding for increased protein concentration is a top priority for pea. Several lines and progenies are routinely evaluated early in the selection stage of a breeding program. As a result, it is vital to have a quick, accurate, and non-destructive technique for assessing protein concentration, which near-infrared spectroscopy (NIRS) can provide. We report the NIRS calibration of pea protein concentration. Models were developed by applying partial least squares regression (PLSR) to reference protein data from the FP-528 Dumas nitrogen analyzer and spectral data from two NIRS systems (DA7250 and FT9700). The calibration process included a total of 329 pea samples, with 70% of the samples used in the calibration subset and 30% in the validation subset. The results showed that 10 and 13 latent variables were optimal for the DA7250 and FT9700 spectral data, respectively. Because the spectral data were averaged across 32 scans, neither multiplicative scatter correction (MSC) nor the standard normal variate (SNV) transformation improved the prediction accuracy compared to the raw spectral data. The models developed with the optimum number of latent variables for each spectral dataset and using the raw spectral data explained 87% and 84% of the variation for DA7250 and 89% and 87% of the variation for FT9700 in the calibration and validation datasets, respectively. The root mean square error (RMSE) values for DA7250 were 0.75% and 0.86% for the calibration and validation datasets, respectively. For the FT9700, the RMSE values were 0.68% for the calibration subset and 0.85% for the validation subset. The FT9700 system marginally outperformed the DA7250 system. This could be due to the greater number of spectral wavelengths used by FT9700 than by DA7250. However, the models constructed using both NIRS systems performed well in terms of prediction, and both NIRS systems could be utilized to predict protein concentration in peas. These models should be further validated using external datasets, and they will need to be updated as the protein concentration range for subsequent samples expands.
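
The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the random seed, and the rounding used for the 70/30 split are assumptions made for illustration, and R-squared is computed as 1 - SS_res/SS_tot, which may differ from the definition used in the paper. The PLSR fit itself is not shown; a library implementation such as scikit-learn's `PLSRegression` would normally supply the predictions.

```python
import math
import random


def snv(spectrum):
    """Standard normal variate: center one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def split_indices(n_samples, calib_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into calibration and
    validation subsets (the seed and rounding rule are assumptions)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_calib = round(n_samples * calib_fraction)
    return indices[:n_calib], indices[n_calib:]


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Fraction of variation explained, assuming 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    calib, valid = split_indices(329)
    print(len(calib), len(valid))
    print(snv([0.41, 0.43, 0.48, 0.52]))
    print(rmse([21.0, 23.5, 25.0], [21.4, 23.0, 25.3]))
```

In this sketch, each averaged spectrum would be passed through `snv` before model fitting, and `rmse` and `r_squared` would be applied separately to the calibration and validation subsets to obtain figures comparable in kind to those reported above.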