Location: Virus and Prion Research
Title: Machine learning prediction and experimental validation of antigenic drift in H3 influenza A viruses in swine
ZELLER, MICHAEL - Iowa State University
GAUGER, PHILLIP - Iowa State University
ARENDSEE, ZEBULUN - ORISE Fellow
SOUZA, CARINE - ORISE Fellow
Submitted to: mSphere
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/23/2021
Publication Date: 3/17/2021
Citation: Zeller, M.A., Gauger, P.C., Arendsee, Z.W., Souza, C.K., Vincent, A.L., Anderson, T.K. 2021. Machine learning prediction and experimental validation of antigenic drift in H3 influenza A viruses in swine. mSphere. 6. https://doi.org/10.1128/mSphere.00920-20.
Interpretive Summary: Influenza A virus (IAV) is an important respiratory pathogen in swine that causes significant economic losses through decreased rate of gain, increased antibiotic and vaccine costs, and increased mortality. Vaccines developed to control infection focus on the hemagglutinin (HA) protein, with vaccine components determined through sequencing of HA genes collected from sick pigs infected with IAV. Though sequencing technologies have provided data to design vaccines, there are no rapid and accurate predictive approaches that adequately link sequence data to vaccine protection. Our study evaluated a combination of computer models, an approach known as “machine learning,” to estimate virus antigenic features from genetic sequence data. Our model was used to identify and rank the importance of mutations in the HA gene that caused changes in the HA protein, and to predict whether these changes would affect antibody recognition. Using these predictions, previously uncharacterized viruses were selected to test the model in an assay measuring antibody binding to virus antigen. We demonstrated that the model predicted virus antigenic characteristics from genetic sequence data. Linking genetic diversity to change in the HA protein and recognition by antibody has important implications for effective vaccine design. The findings from this study are critical to help inform vaccine manufacturers and swine producers on how to develop and implement vaccines to more effectively control IAV in swine.
Technical Abstract: The genetic and antigenic diversity of influenza A virus (IAV) circulating in swine challenges the development of effective vaccines, thereby increasing the zoonotic threat and pandemic potential of swine IAV. High-throughput sequencing technologies and analyses are able to quantify genetic diversity of IAV, but there are no accurate approaches to adequately describe novel antigenic phenotypes. This study evaluated an ensemble of non-linear regression models to estimate virus phenotype from genotype. Regression models were trained with a phenotypic dataset of pairwise hemagglutination inhibition (HI) assays, using genetic sequence identity and pairwise amino acid mutations as predictor features. The model identified pairwise amino acid identity, ranked the relative importance of mutations in the hemagglutinin (HA) protein, and demonstrated good prediction accuracy following ten-fold cross-validation. Four previously untested IAV strains were selected to experimentally validate the model predictions by HI assays. Errors between predicted and measured distances of the uncharacterized strains were 0.34, 0.70, 2.19, and 0.17 antigenic units. These regression models trained on HI data can be used to estimate antigenic distances between different strains of IAV in swine using sequence data. By ranking the importance of mutations in the HA, this method provides criteria for identifying antigenically advanced IAV strains that may not be controlled by existing vaccines and can inform strain updates to vaccines to better control this important pathogen.
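
As an illustration of the predictor features described above (pairwise sequence identity and pairwise amino acid mutations), the following is a minimal sketch and not the study's actual pipeline. The function name pairwise_features, the feature naming scheme, and the handling of alignment gaps are assumptions made for this example; it also assumes the two HA amino acid sequences are already aligned to the same length.

```python
def pairwise_features(seq_a: str, seq_b: str) -> dict:
    """Build illustrative predictor features for one pair of aligned HA sequences.

    Hypothetical helper: returns the fraction of identical residues plus one
    binary feature per position where the two strains differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    features = {}
    matches = 0
    compared = 0
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == "-" or b == "-":
            continue  # assumption: positions with an alignment gap are skipped
        compared += 1
        if a == b:
            matches += 1
        else:
            # Sort the residues so that the pair (A, B) and the pair (B, A)
            # produce the same feature name, since HI distances are pairwise.
            x, y = sorted((a, b))
            features[f"{x}{pos}{y}"] = 1

    # Pairwise amino acid identity over the compared positions.
    features["identity"] = matches / compared if compared else 0.0
    return features
```

For example, pairwise_features("NKSY", "NESY") yields an identity of 0.75 and a single mutation feature, "E2K". Feature dictionaries like this, one per strain pair, could then be paired with the measured HI distance for that pair and passed to whatever regression model is being trained.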