|WESTERMAN, KENNETH - TUFTS UNIVERSITY|
|HARRINGTON, SEAN - NOTEMEAL, INC|
|ORDOVAS, JOSE - JEAN MAYER HUMAN NUTRITION RESEARCH CENTER ON AGING AT TUFTS UNIVERSITY|
Submitted to: BMC Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/2/2020
Publication Date: 6/11/2020
Citation: Westerman, K., Harrington, S.M., Ordovas, J.M., Parnell, L.D. 2020. PhyteByte: Identification of foods containing compounds with specific pharmacological properties. BMC Bioinformatics. https://doi.org/10.1186/s12859-020-03582-7.
Interpretive Summary: Because there is a scarcity of information on how chemical compounds naturally found in different foods exert specific health effects, we sought to build software that could match, when possible, these chemical compounds to pharmacological compounds for which such information is documented. The software uses a machine learning algorithm to compare chemical structures and draws on the wealth of biological and health information on drugs to assign potential biological effects to chemical compounds naturally found in food. A test case of the software is provided for the target of common diabetes medications to identify natural compounds that could have similar beneficial effects. This software can guide researchers in designing specific experiments to test whether a food compound and its food source actually function as predicted in alleviating, either wholly or partially, specific conditions of common age-related metabolic diseases.
Technical Abstract: BACKGROUND: It is well known that phytochemicals and other molecules in food elicit positive health benefits, often by unknown mechanisms. While there is a wealth of data on the biological and biophysical properties of drugs and therapeutic compounds, there is a notable lack of such data for compounds commonly present in food. Computational methods for high-throughput identification of food compounds with specific biological effects, especially when accompanied by associated food composition data, could enable more effective and more personalized dietary planning. Here, we sought to build a machine learning-based tool that would leverage existing pharmacological data to predict bioactivity across a comprehensive molecular database of foods and food compounds. RESULTS: The PhyteByte tool takes a cheminformatic approach to structure-based activity prediction and applies it to uncover putative bioactivity in food compounds. Our approach takes an input protein target and develops a random forest classifier to predict the effect of an input molecule based on its molecular fingerprint, using structure and activity data available from the ChEMBL database. It then predicts the relevant bioactivity of a library of food compounds with known molecular structures from the FooDB database. The output is a list of food compounds with high confidence of eliciting relevant biological effects, along with their source foods and associated quantities in those foods, where available. Applying PhyteByte to the PPARG gene (as the target of common type 2 diabetes medications), we identify irigenin, sesamin, fargesin, and delta-sanshool as putative agonists of PPARG, along with previously identified agonists of this important metabolic regulator. CONCLUSION: PhyteByte identifies food-based compounds that are predicted to interact with specific protein targets, and ranks the foods by quantity of the compound within that food. The identified relationships can be used to prioritize food compounds for experimental or epidemiological follow-up, and can contribute to the development of precision approaches to dietary planning.
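The final step described in the abstract (keeping compounds predicted with high confidence, then ranking their source foods by compound quantity) can be illustrated with a minimal Python sketch. This is not the PhyteByte implementation: the function name `rank_foods`, the `FoodContent` record, the 0.9 probability cutoff, the `mg_per_100g` unit, and all compound names, food names, and numbers below are hypothetical placeholders chosen only to show the logic. The classifier probabilities are assumed to have been computed already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodContent:
    """One food-composition record (hypothetical schema, not FooDB's)."""
    food: str
    compound: str
    mg_per_100g: float  # assumed unit, for illustration only


def rank_foods(predicted_prob, contents, threshold=0.9):
    """Map each high-confidence compound to its foods, richest food first.

    predicted_prob: dict of compound name -> classifier probability.
    contents: iterable of FoodContent records.
    threshold: assumed confidence cutoff (placeholder value).
    """
    # Keep only compounds whose predicted probability meets the cutoff.
    hits = {c for c, p in predicted_prob.items() if p >= threshold}
    # Every hit gets an entry, even when no quantity data are available.
    ranked = {c: [] for c in hits}
    for row in contents:
        if row.compound in hits:
            ranked[row.compound].append((row.food, row.mg_per_100g))
    # Sort each compound's foods by descending quantity.
    for foods in ranked.values():
        foods.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Placeholder inputs: invented values, not results from the paper.
predicted_prob = {"compound_A": 0.95, "compound_B": 0.40, "compound_C": 0.92}
contents = [
    FoodContent("food_1", "compound_A", 2.0),
    FoodContent("food_2", "compound_A", 7.5),
    FoodContent("food_3", "compound_B", 9.0),
]

ranked = rank_foods(predicted_prob, contents)
for compound in sorted(ranked):
    print(compound, ranked[compound])
```

In this sketch a compound that passes the cutoff but has no composition records still appears in the output with an empty food list, mirroring the abstract's note that quantities are reported "where available".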