Location: Commodity Utilization Research
Title: Applications of regularized regression machine learning for the estimation of thermophysical/chemical properties of biomass-derived molecules with adaptable group contribution
Submitted to: American Chemical Society National Meeting
Publication Type: Abstract Only
Publication Acceptance Date: 4/13/2022
Publication Date: N/A
Interpretive Summary: Sustainable industrial research and development has a significant focus on applications of biomass-derived chemicals (i.e., things from wood, grass, and agricultural wastes/by-products). In order to better understand how these types of chemicals can be used, improvements are needed in estimating their properties. Some of these properties include solubility behavior and boiling point temperature. With techniques from “big data” and “machine learning,” mathematical models can be built to describe these properties. New methods to allow for better flexibility of these models will be explored. The motivation here is driven by limitations of existing published models, which do not work well with many biomass-derived chemical structures. These new models are applied to predict some properties of biomass-derived chemicals. The ways in which these types of estimated values can be physically verified in the laboratory will also be highlighted.
Technical Abstract: With greater emphasis on sustainable biomass-derived products for industrial applications in energy, fuels, chemicals and materials, there is a need to increase the understanding of the properties of these biomass-derived products. A popular approach for quantifying or estimating molecular structure-property relationships is through group contribution methods. These methods break molecules into representative functional groups, whose individual contributions are summed linearly to give a final property value. However, applications of group contribution methods are limited by rigidity in published models built on fixed parameters, whose set of “groups” is often rather large and difficult to work with practically. This work explores a potential solution to this problem by applying regularized regression as a machine learning tool to create simpler models based on adaptable contribution groups. In particular, this approach allows for more open-ended selection of groups that can be tailored to a molecular dataset of interest. The overall goal of this work is to develop these models for practical use, particularly by less experienced or unfamiliar practitioners. This modeling approach was applied, in conjunction with other empirical relationships, to estimate several properties of oligomeric products derived from biomass pyrolysis. Some of these properties include Hansen solubility parameters, normal boiling point, flash point, heat of vaporization and heat of combustion. The advantage of this type of modeling is also underscored by the significant difficulty that still exists for purification, separation, and characterization of heavy biomass pyrolysis and/or biochemical conversion products. Further opportunities to extend this modeling approach to estimation of other potential properties of interest will also be discussed. Finally, capabilities for coupling molecular structure-property relationship estimates with the ways in which real experimental values can be measured will be highlighted.
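
The idea of a linear group contribution model fitted by regularized regression can be illustrated with a minimal sketch. The abstract does not state which regularizer, groups, or data were used, so everything below is an assumption made for illustration: an L1 (lasso) penalty solved by coordinate descent, three hypothetical group names, no intercept term, and synthetic property values generated from known contributions. The property of a molecule is modeled as the sum of group counts times group contributions, and the penalty is expected to shrink the contribution of an uninformative group to zero, which is how a simpler model with fewer groups could emerge.

```python
# Illustrative sketch only: group names and all numbers are synthetic
# assumptions, not data or parameters from the abstract.

GROUP_NAMES = ["hydroxyl", "aromatic_ring", "methoxy"]  # hypothetical groups

# Rows are molecules; columns count how often each group occurs in a molecule.
GROUP_COUNTS = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [2, 1, 1],
    [1, 2, 1],
    [0, 1, 2],
]

# Synthetic property values generated from contributions 20, 35 and 0,
# so the third group carries no information by construction.
PROPERTY_VALUES = [20.0, 35.0, 55.0, 75.0, 90.0, 35.0]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(counts, targets, alpha, n_iter=500):
    """Minimize (1/(2n)) * sum((y - X c)^2) + alpha * sum(|c|).

    Plain cyclic coordinate descent, no intercept, fixed iteration count.
    """
    n_samples = len(counts)
    n_groups = len(counts[0])
    coefs = [0.0] * n_groups
    for _ in range(n_iter):
        for j in range(n_groups):
            rho = 0.0   # correlation of group j with the partial residual
            norm = 0.0  # squared norm of the column for group j
            for row, y in zip(counts, targets):
                pred_without_j = sum(
                    row[k] * coefs[k] for k in range(n_groups) if k != j
                )
                rho += row[j] * (y - pred_without_j)
                norm += row[j] ** 2
            if norm == 0.0:
                coefs[j] = 0.0  # group never occurs in the dataset
                continue
            coefs[j] = soft_threshold(rho / n_samples, alpha) / (norm / n_samples)
    return coefs


def predict(count_row, coefs):
    """Group contribution estimate: sum of count times contribution."""
    return sum(n * c for n, c in zip(count_row, coefs))


contributions = fit_lasso(GROUP_COUNTS, PROPERTY_VALUES, alpha=0.5)
for name, value in zip(GROUP_NAMES, contributions):
    print(f"{name}: {value:.4f}")

# Estimate for a hypothetical new molecule with 2 hydroxyls and 1 aromatic ring.
print(f"estimate: {predict([2, 1, 0], contributions):.4f}")
```

In this sketch, `alpha` controls the trade-off between fit and simplicity: larger values remove more groups from the model, at the cost of biasing the remaining contributions toward zero. In practice a library implementation with cross-validated choice of the penalty would likely be preferable to a hand-written solver.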