Machine-Learning-Assisted Design of Deep Eutectic Solvents Based on Uncovered Hydrogen Bond Patterns
Usman L. Abbas, Yuxuan Zhang, Joseph Tapia, Selim Md, Jin Chen, Jian Shi, Qing Shao
Engineering, Volume 39, Pages 74-83. Published 2024-08-01.
DOI: 10.1016/j.eng.2023.10.020
https://www.sciencedirect.com/science/article/pii/S2095809924003825
Abstract
Non-ionic deep eutectic solvents (DESs) are designer solvents with various applications in catalysis, extraction, carbon capture, and pharmaceuticals. However, discovering new DES candidates is challenging because efficient tools that accurately predict DES formation are lacking. The search for DESs relies heavily on intuition or trial and error, leading to low success rates or missed opportunities. Recognizing that hydrogen bonds (HBs) play a central role in DES formation, we aim to identify HB features that distinguish DES from non-DES systems and to use them to develop machine learning (ML) models for discovering new DES systems. We first analyze the HB properties of 38 known DES and 111 known non-DES systems using their molecular dynamics (MD) simulation trajectories. The analysis reveals two features that set DES systems apart from non-DES systems: ① a greater imbalance between the numbers of the two intra-component HBs and ② more and stronger inter-component HBs. Based on these results, we develop 30 ML models using ten algorithms and three types of HB-based descriptors. Model performance is first benchmarked using the average and minimum receiver operating characteristic (ROC) area under the curve (AUC) values. We also analyze the importance of individual features in the models, and the results are consistent with the simulation-based statistical analysis. Finally, we validate the models using experimental data for 34 systems. The extra trees forest model outperforms the other models in the validation, with an ROC-AUC of 0.88. Our work illustrates the importance of HBs in DES formation and shows the potential of ML in discovering new DESs.
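To make the benchmarking criterion in the abstract concrete, the sketch below shows one way the average and minimum ROC-AUC could be computed per model across evaluation splits. It is only an illustration under stated assumptions: the abstract does not say how the splits were formed, the model names `model_a` and `model_b`, the helper functions, and all labels and scores are hypothetical, and none of the numbers come from the paper.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(splits):
    """Return (average, minimum) ROC-AUC over a list of (labels, scores) splits."""
    aucs = [roc_auc(labels, scores) for labels, scores in splits]
    return mean(aucs), min(aucs)


# Hypothetical data: 1 = DES, 0 = non-DES; scores are made-up model outputs.
splits_by_model = {
    "model_a": [
        ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
        ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
    ],
    "model_b": [
        ([1, 0, 1, 0], [0.6, 0.5, 0.7, 0.2]),
        ([1, 1, 0, 0], [0.9, 0.8, 0.4, 0.3]),
    ],
}

summary = {name: summarize(splits) for name, splits in splits_by_model.items()}
for name, (avg_auc, min_auc) in summary.items():
    print(f"{name}: average ROC-AUC = {avg_auc:.3f}, minimum ROC-AUC = {min_auc:.3f}")
```

Reporting the minimum alongside the average penalizes a model that does well on most splits but poorly on one, which is a plausible reason to benchmark on both values.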
About the journal:
Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.