|MILALI, MASABHO - IFAKARA HEALTH INSTITUTE|
|SIKULU-LORD, MAGGY - UNIVERSITY OF QUEENSLAND|
|KIWARE, SAMSON - IFAKARA HEALTH INSTITUTE|
|CORLISS, GEORGE - MARQUETTE UNIVERSITY|
|POVINELLI, RICHARD - MARQUETTE UNIVERSITY|
Submitted to: PLoS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/29/2019
Publication Date: 8/14/2019
Citation: Milali, M.P., Sikulu-Lord, M.T., Kiware, S.S., Dowell, F.E., Corliss, G.F., Povinelli, R.J. 2019. Age grading An. gambiae and An. arabiensis using near infrared spectra and artificial neural networks. PLoS One. 14(8):e0209451. https://doi.org/10.1371/journal.pone.0209451.
Interpretive Summary: Mosquito age is one of the indicators entomologists use to estimate vectorial capacity and the effectiveness of an existing mosquito control intervention. Malaria is a vector-borne parasitic disease transmitted to people by mosquitoes of the genus Anopheles. The disease killed approximately 445,000 people in 2016. Mosquitoes contribute to malaria transmission by hosting the malaria-causing Plasmodium parasite and allowing it to develop to maturity. Depending on environmental temperature, Plasmodium takes 10-14 days in an Anopheles mosquito to develop fully enough to be transmitted to humans. Therefore, knowing the age of a mosquito indicates whether it is capable of transmitting malaria. Near infrared spectroscopy (NIRS) classifies lab-reared and semi-field raised mosquitoes into
Technical Abstract: Near infrared spectroscopy (NIRS) is currently complementing techniques to age-grade mosquitoes. NIRS classifies lab-reared and semi-field raised mosquitoes into < 7 or ≥ 7 days old with an average accuracy of 80%, achieved by training a regression model using partial least squares (PLS) and interpreting it as a binary classifier. We explore whether using an artificial neural network (ANN) analysis instead of PLS regression improves the current accuracy of NIRS models for age-grading malaria-transmitting mosquitoes. We also explore whether directly training a binary classifier, instead of training a regression model and interpreting it as a binary classifier, improves the accuracy. A total of 786 and 870 NIR spectra collected from laboratory-reared An. gambiae and An. arabiensis, respectively, were used and pre-processed according to previously published protocols. Based on ten-fold Monte Carlo cross-validation, an ANN regression model scored a root mean squared error (RMSE) of 1.6 ± 0.2 for An. gambiae and 2.8 ± 0.2 for An. arabiensis, whereas the PLS regression model scored an RMSE of 3.7 ± 0.2 for An. gambiae and 4.5 ± 0.1 for An. arabiensis. When we interpreted the regression models as binary classifiers, the accuracy of the ANN regression model was 93.7 ± 1.0% for An. gambiae and 90.2 ± 1.7% for An. arabiensis, while the PLS regression model scored an accuracy of 83.9 ± 2.3% for An. gambiae and 80.3 ± 2.1% for An. arabiensis. We also find that a directly trained binary classifier yields higher age estimation accuracy than a regression model interpreted as a binary classifier. A directly trained ANN binary classifier scored an accuracy of 99.4 ± 1.0% for An. gambiae and 99.0 ± 0.6% for An. arabiensis, while a directly trained PLS binary classifier scored 93.6 ± 1.2% for An. gambiae and 88.7 ± 1.1% for An. arabiensis.
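To make "a regression model interpreted as a binary classifier" concrete, the following is a minimal sketch, not the authors' code. It assumes the class boundary is 7 days, as in the abstract, and uses made-up true and predicted ages; the function names `rmse`, `to_age_class`, and `binary_accuracy` are hypothetical.

```python
import math

# Class boundary taken from the abstract: younger than 7 days vs. 7 days or older.
AGE_THRESHOLD_DAYS = 7


def rmse(true_ages, predicted_ages):
    """Root mean squared error between true and predicted ages (in days)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def to_age_class(age_days, threshold=AGE_THRESHOLD_DAYS):
    """Map an age in days to a binary class: 0 if below the threshold, else 1."""
    return 1 if age_days >= threshold else 0


def binary_accuracy(true_ages, predicted_ages, threshold=AGE_THRESHOLD_DAYS):
    """Fraction of samples whose predicted age falls in the correct age class."""
    hits = sum(
        to_age_class(t, threshold) == to_age_class(p, threshold)
        for t, p in zip(true_ages, predicted_ages)
    )
    return hits / len(true_ages)


# Made-up values for illustration only; these are not data from the study.
true_ages = [1, 3, 5, 7, 9, 11, 13, 15]
predicted_ages = [2.0, 2.5, 7.5, 6.0, 9.5, 10.0, 14.0, 13.0]

print(f"RMSE: {rmse(true_ages, predicted_ages):.2f} days")
print(f"Binary accuracy: {binary_accuracy(true_ages, predicted_ages):.2f}")
```

In this toy data the two misclassified samples (true ages 5 and 7) both sit next to the 7-day boundary, which shows how a regression output can be close in days and still land in the wrong class once it is thresholded.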
Training both regression and binary classification age models using ANNs yields models with higher estimation accuracies than training the same age models using PLS. Regardless of the model architecture, directly trained binary classifiers score higher accuracy in classifying the age of mosquitoes than regression models interpreted as binary classifiers. Therefore, we recommend estimating the age of An. gambiae and An. arabiensis with ANN model architectures and directly trained binary classifiers, rather than training a regression model and interpreting it as a binary classifier.
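The "mean ± spread" figures reported from Monte Carlo cross-validation can be illustrated with a short sketch. It is written under several assumptions the abstract does not state: that "ten-fold" means ten repeated random train/test splits, that the test fraction is 25%, and that the ± values are standard deviations across repeats. The helper names and the toy majority-class "model" are hypothetical stand-ins, not the ANN or PLS models from the study.

```python
import random
import statistics


def monte_carlo_splits(n_samples, n_repeats=10, test_fraction=0.25, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits.

    n_repeats=10 and test_fraction=0.25 are assumptions for illustration.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


def summarize(scores):
    """Return the mean and sample standard deviation of per-split scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Toy ages (days), five samples per age from 1 to 16; not data from the study.
ages = [day for day in range(1, 17) for _ in range(5)]


def majority_class_accuracy(train_idx, test_idx):
    """Stand-in 'model': predict the majority age class seen in training."""
    train_labels = [ages[i] >= 7 for i in train_idx]
    majority = sum(train_labels) * 2 >= len(train_labels)
    test_labels = [ages[i] >= 7 for i in test_idx]
    return sum(label == majority for label in test_labels) / len(test_labels)


scores = [
    majority_class_accuracy(train_idx, test_idx)
    for train_idx, test_idx in monte_carlo_splits(len(ages))
]
mean_acc, std_acc = summarize(scores)
print(f"accuracy over {len(scores)} splits: {mean_acc:.3f} ± {std_acc:.3f}")
```

Replacing `majority_class_accuracy` with a function that trains and scores a real model on each split would give the same kind of summary for that model.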