Machine Learning Modelling for Predicting the Efficacy of Ionic Liquid-Aided Biomass Pretreatment
Authors: Biswanath Mahanty, Munmun Gharami, Dibyajyoti Haldar
Journal: BioEnergy Research, vol. 17, no. 3, pp. 1569–1583 (published 2024-03-26)
DOI: 10.1007/s12155-024-10747-2
URL: https://link.springer.com/article/10.1007/s12155-024-10747-2
The influence of ionic liquid (IL) characteristics, lignocellulosic biomass (LCB) properties, and process conditions on LCB pretreatment is not well understood. In this study, 129 experimental data points on the pretreatment of LCB (grass, agricultural, and forest residues) with imidazolium, triethylamine, and choline-amino acid ILs were compiled to develop machine learning (ML) models for cellulose, hemicellulose, lignin, and solid recovery. Following data imputation, a bilayer artificial neural network (ANN) and random forest (RF) regression, the two most widely adopted ML models, were developed. The full-featured ANN, after Bayesian hyperparameter (HP) optimisation, fitted the training data very well (R²: 0.936–0.994), but its cross-validation performance (R²CV) remained weaker, between 0.547 and 0.761. For the HP-optimised RF models, R² ranged from 0.824 to 0.939 for the regression fit and from 0.383 to 0.831 in cross-validation. Temperature and pretreatment time were the most important predictors, except for hemicellulose recovery. Bayesian predictor selection combined with HP optimisation improved the R²CV range for the ANN (0.555–0.825) as well as for the RF models (0.474–0.824). Because the predictive performance of the models varied with the target response, use of a larger, more homogeneous dataset may be warranted. The predictive modelling framework developed in this study for LCB pretreatment can be extended to similar biochemical process systems.
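The gap the abstract reports between the fit on training data and the cross-validated R² can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are synthetic (only the row count of 129 is borrowed from the abstract), the two features are hypothetical stand-ins for scaled temperature and pretreatment time, and a 1-nearest-neighbour regressor replaces the study's ANN and RF models purely because it overfits in an easily visible way.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, x, k=1):
    """Stand-in regressor (NOT the study's ANN or RF): mean target of the k nearest rows."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def cross_val_r2(xs, ys, n_folds=5, k=1, seed=0):
    """R^2 computed on pooled out-of-fold predictions from shuffled k-fold CV."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in idx if i not in held]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in held_out:
            # Each row is predicted by a model that never saw it.
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return r_squared(ys, preds)


# Synthetic data: 129 rows, two hypothetical scaled features, noisy linear response.
rng = random.Random(42)
xs = [(rng.random(), rng.random()) for _ in range(129)]
ys = [50 + 30 * temp + 10 * time + rng.gauss(0, 5) for temp, time in xs]

# Training fit: every row is its own nearest neighbour, so the fit is perfect.
train_r2 = r_squared(ys, [knn_predict(xs, ys, x, k=1) for x in xs])
cv_r2 = cross_val_r2(xs, ys, n_folds=5, k=1)

print(f"training R2: {train_r2:.3f}")
print(f"5-fold CV R2: {cv_r2:.3f}")
```

The training R² here is perfect by construction, while the cross-validated value is lower because each prediction comes from a model fitted without that row. This is why a high training R² alone says little about how well a model generalises.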
About the journal:
BioEnergy Research fills a void in the rapidly growing area of feedstock biology research related to biomass, biofuels, and bioenergy. The journal publishes a wide range of articles, including peer-reviewed scientific research, reviews, perspectives and commentary, industry news, and government policy updates. Its coverage brings together a uniquely broad combination of disciplines with a common focus on feedstock biology and science, related to biomass, biofeedstock, and bioenergy production.