Pub Date: 2024-04-27; DOI: 10.1016/j.aichem.2024.100068
Nil Sanosa, David Dalmau, Diego Sampedro, Juan V. Alegre-Requena, Ignacio Funes-Ardoiz
Machine learning (ML) is a disruptive technology with applications across a wide range of scientific disciplines. Applied to homogeneous catalysis, it accelerates catalyst discovery through virtual screening, which reduces the number of experimental iterations and thereby saves time and resources and cuts waste generation. ML algorithms, often integrated with cheminformatics tools and quantum mechanical featurization, excel at predicting reaction outcomes that guide the engineering of catalysts for desired reactivity and selectivity. This minireview presents recent studies on databases and on supervised and unsupervised problems, offering a general yet insightful perspective on current ML-driven progress in homogeneous catalysis.
Title: Recent advances of machine learning applications in the development of experimental homogeneous catalysis
Journal: Artificial intelligence chemistry
Pub Date: 2024-04-21; DOI: 10.1016/j.aichem.2024.100066
Yujie Wang, Jian Li, Xuze Chen, Weiping Zhu, Xuhong Guo, Fang Zhao
Photocatalytic reactions, which achieve chemical synthesis more sustainably than thermal reactions, have been shown to become more efficient, greener and easier to scale up when combined with continuous microflow technology. Nevertheless, reports of kinetics measurements for photocatalytic reactions in continuous microflow, especially fully automated ones, are very rare. In this work, two challenging parameters of the photocatalytic oxidation of 9,10-diphenylanthracene, namely the reaction order with respect to oxygen (2.48) and the photoreaction activation energy (-16.83 kJ/mol), were acquired on an automated continuous flow platform using the Steady-state Method. Moreover, the Ramping Method was also successfully implemented on the automated continuous flow photoreaction platform, exhibiting a predictive accuracy of 4.42 % while requiring 64.3 % less time and 58.0 % less material than the Steady-state Method. Improving the residence time distribution of the microreactor was found to improve the accuracy of the Ramping Method. The automated continuous flow process developed in this work could offer an efficient and accurate way to obtain reaction kinetics information for homogeneous photocatalytic reactions.
Title: Automated kinetics measurement for homogeneous photocatalytic reactions in continuous microflow
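As a rough illustration of how an activation energy like the one reported in this abstract can be extracted from rate constants measured at several temperatures, the sketch below fits the Arrhenius relation ln k = ln A - Ea/(RT) by ordinary least squares. It is not the authors' Steady-state or Ramping procedure. The function name `fit_arrhenius`, the temperatures, and the pre-exponential factor are assumptions made for illustration, and the rate constants are synthetic values generated from the reported Ea of -16.83 kJ/mol.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def fit_arrhenius(temps_k, rate_constants):
    """Least-squares fit of ln k = ln A - Ea/(R*T).

    Returns (Ea in kJ/mol, pre-exponential factor A).
    """
    xs = [1.0 / t for t in temps_k]             # 1/T
    ys = [math.log(k) for k in rate_constants]  # ln k
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                           # equals -Ea/R
    intercept = y_mean - slope * x_mean         # equals ln A
    return -slope * R / 1000.0, math.exp(intercept)


# Synthetic, noise-free data (hypothetical temperatures and A; Ea from the abstract).
TRUE_EA_J = -16.83e3
TRUE_A = 1.0e-3
temps = [283.15, 293.15, 303.15, 313.15]
ks = [TRUE_A * math.exp(-TRUE_EA_J / (R * t)) for t in temps]

ea_kj, a = fit_arrhenius(temps, ks)
print(f"Ea = {ea_kj:.2f} kJ/mol, A = {a:.3e}")
```

A negative fitted Ea simply means the apparent rate constant decreases as temperature rises. With real, noisy measurements the fit would also need an uncertainty estimate, which this sketch omits.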
Pub Date: 2024-04-16; DOI: 10.1016/j.aichem.2024.100065
Xue Jia, Hao Li
Metal oxides (MOs) are a class of electrocatalysts that could serve as low-cost alternatives to precious metals. However, many MOs suffer from poor stability under electrochemical operating conditions. The Materials Project is one of the largest computational materials databases to date, and its bulk Pourbaix diagrams are essential for assessing the aqueous stability of potential electrocatalysts. Herein, we performed data mining on the Materials Project database to identify potentially stable MOs for industrially important electrocatalytic reactions, including the oxygen reduction reaction (ORR), oxygen evolution reaction (OER), chlorine evolution reaction (CER), hydrogen evolution reaction (HER), and nitrogen reduction reaction (NRR). We found that many MOs are potentially stable under electrocatalytic conditions, especially in neutral and alkaline media. Finally, we summarized those MOs that have been synthesized experimentally but have not yet been explored as electrocatalysts. This comprehensive assessment effectively narrows the exploration scope and facilitates the evaluation of material stability.
Title: Data mining of stable, low-cost metal oxides as potential electrocatalysts
Pub Date: 2024-04-04; DOI: 10.1016/j.aichem.2024.100064
Murat Cihan Sorkun, Elham Nour Ghassemi, Cihan Yatbaz, J.M. Vianney A. Koelman, Süleyman Er
Aqueous Organic Redox Flow Batteries (AORFBs) are considered one of the most appealing technologies for large-scale energy storage because their electroactive organic materials are abundant, easy to produce, and recyclable. A prevailing challenge for the redox chemistries applied in AORFBs is achieving high power and energy density. The chemical design and molecular engineering of the electroactive compounds is an effective approach to optimizing their physicochemical properties. Among these properties, the reaction energy of redox couples is often used as a proxy for the measured potentials. In this study, we present RedPred, a machine learning (ML) model that predicts the one-step two-electron two-proton redox reaction energy of redox-active molecule pairs. RedPred comprises an ensemble of Artificial Neural Networks, Random Forests, and Graph Convolutional Networks, trained using the RedDB database, which contains over 15,000 reactant-product pairs for AORFBs. We evaluated RedPred's performance using six different molecular encoders and five prominent ML algorithms applied in chemical science. The predictive capability of RedPred was tested both within its training chemical space and outside its training domain using two separate test datasets. We released a user-friendly web tool with open-source code to promote software sustainability and broad use.
Title: RedPred, a machine learning model for the prediction of redox reaction energies of the aqueous organic electrolytes
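The abstract describes RedPred as an ensemble of three model families but does not say how the member predictions are combined. The sketch below shows one common option, an unweighted mean with the spread across members as a crude disagreement signal. Everything in it is an assumption for illustration: the `ensemble_predict` helper, the stand-in member functions, and the numeric values (in arbitrary energy units) are hypothetical and do not come from RedPred or RedDB.

```python
from statistics import mean, stdev
from typing import Callable, Sequence, Tuple

# A predictor maps a molecule identifier (here a SMILES string) to a reaction energy.
Predictor = Callable[[str], float]


def ensemble_predict(models: Sequence[Predictor], smiles: str) -> Tuple[float, float]:
    """Return (mean prediction, sample standard deviation across members).

    Assumes an unweighted average; needs at least two members for the spread.
    """
    preds = [model(smiles) for model in models]
    return mean(preds), stdev(preds)


# Hypothetical stand-ins for a trained neural network, random forest and graph network.
def fake_ann(smiles: str) -> float:
    return -0.50


def fake_rf(smiles: str) -> float:
    return -0.44


def fake_gcn(smiles: str) -> float:
    return -0.47


energy, spread = ensemble_predict([fake_ann, fake_rf, fake_gcn], "O=C1C=CC(=O)C=C1")
print(f"predicted reaction energy: {energy:.3f} +/- {spread:.3f}")
```

A large spread between members is often read as a hint that a molecule lies outside the training domain, which ties in with the out-of-domain test the abstract mentions.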
Pub Date: 2024-04-03; DOI: 10.1016/j.aichem.2024.100063
Ruchira Joshi, Zipeng Zheng, Palak Agarwal, Ma’mon M. Hatmal, Xinmin Chang, Paul Seidler, Ian S. Haworth
Artificial intelligence (AI) has huge potential to accelerate drug discovery, but challenges remain in implementing AI algorithms that can be used by the broad scientific community. Identification of molecular features and their subsequent use in training of machine learning models may permit prediction of new molecules with enhanced properties. Predictive modeling is particularly applicable to analysis of structure-activity relationships (SARs) and would be a useful tool in the hands of laboratory medicinal chemists. This requires a software platform that is chemically intuitive while providing the user with access to AI methods. The KNIME platform provides such an environment through inclusion of broad chemical toolsets and a user-friendly approach for utilization of machine learning for analysis of SAR data. Here, we illustrate use of KNIME for this purpose, with a focus on discovery of features of highly potent tau inhibitors from a series of structurally diverse polyphenols. Workflows are described that enable implementation of AI tools in KNIME for diverse SAR projects.
Title: KNIME workflows for applications in medicinal and computational chemistry
Pub Date: 2024-04-02; DOI: 10.1016/j.aichem.2024.100062
Yujun Liu, Xiaolong Zhang, Luotong Li, Xingchen Liu, Tingyu Lei, Jiawei Bai, Wenping Guo, Yuwei Zhou, Xingwu Liu, Botao Teng, Xiaodong Wen
Fe-based Fischer-Tropsch Synthesis (FTS) enables the selective conversion of syngas into long-chain hydrocarbons, which can be further refined to produce high-demand liquid fuels and high-value chemical products. However, developing novel heterogeneous catalysts for FTS with desirable performance characteristics is challenging, as their performance depends on various factors such as precursor, support material, promoters, pretreatment conditions and catalyst structure. Thus, it remains difficult to understand the structure-performance relationship of FTS and to optimize catalyst formulations and operating conditions rationally. By integrating traditional chemistry with machine learning, we herein establish intrinsic correlations among reduction, reaction conditions, phase information and the methane selectivity of Fe-based FTS, using high-quality experimental data. The content of the iron phases after reaction, particularly χ-Fe5C2, significantly influences the methane selectivity of the catalyst. Four additives (K, Cu, SiO2, and Ca) could effectively suppress the methane selectivity, most likely by promoting or stabilizing the iron carbide phases, as indicated by their strong correlation. The machine-learned structure-performance relationships offer new insights into the design of Fe-based FTS catalysts and could guide further optimization of the preprocessing conditions and other parameters to minimize the methane selectivity of FTS.
Title: Machine learning insights into catalyst composition and structural effects on CH4 selectivity in iron-based Fischer-Tropsch synthesis
Pub Date: 2024-03-30; DOI: 10.1016/j.aichem.2024.100061
Lucía Morán-González, Feliu Maseras
The application of artificial intelligence to chemistry usually focuses on the identification of good correlations between descriptors and a given property of interest. The descriptors often come from arbitrary sets, with the implicit assumption that the evaluation of a sufficiently wide range of descriptors will lead to a satisfactory choice. Recent work in our group has focused on applying statistical analysis to large amounts of DFT results with the goal of finding optimal descriptor sets for a given property, which we label as hidden descriptors. This article briefly discusses this treatment and the chemical knowledge that has been gained through its application in two different domains: metal-ligand bond strength in transition metal complexes, and energy barriers in bimolecular nucleophilic substitution reactions.
Title: Hidden descriptors: Using statistical treatments to generate better descriptor sets
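The conventional approach that this abstract starts from, screening a set of candidate descriptors for correlation with a property of interest, can be sketched in a few lines. The example below ranks descriptors by the absolute value of their Pearson correlation with a target. It is a generic screening step, not the authors' hidden-descriptor treatment, and the descriptor names and numbers are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_descriptors(descriptors, target):
    """Sort (name, r) pairs by |r|, strongest correlation first."""
    scores = {name: pearson_r(vals, target) for name, vals in descriptors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical property values (e.g. a computed bond strength) for five systems.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
# Hypothetical candidate descriptors evaluated on the same five systems.
descriptors = {
    "descriptor_a": [2.1, 3.9, 6.2, 8.0, 9.8],       # close to linear in the target
    "descriptor_b": [5.0, 1.0, 4.0, 2.0, 3.0],       # weakly related
    "descriptor_c": [-1.0, -2.0, -3.0, -4.0, -5.0],  # exactly anticorrelated
}

ranking = rank_descriptors(descriptors, target)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by |r| treats a strong negative correlation as just as informative as a strong positive one. It only detects linear relationships, which is one reason an arbitrary starting set of descriptors may miss the best choice.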
Pub Date: 2024-03-19; DOI: 10.1016/j.aichem.2024.100060
Xialan Dong, Weifan Zheng
Drug repurposing is an approach to identifying new uses for existing drugs, in which advanced computational methods, such as text and graph embedding techniques, are playing an ever-increasing role. This review provides a timely overview of these embedding methods for drug repurposing and discusses their integration with machine learning. Text embedding techniques, such as Word2Vec, FastText, BERT, and Doc2Vec, enable the analysis of biomedical literature and clinical data to discover potential drug-disease relationships. These methods convert textual data into numerical representations, allowing for similarity calculations and predictive modeling. Several successful applications of text embedding for drug repurposing are highlighted. In addition, graph embedding methods, such as Node2Vec and GraphSAGE, are being employed to convert complex biological knowledge graphs into vector representations. These representations facilitate various network analysis tasks, including predicting drug-target interactions and identifying hidden associations between drugs and diseases. Case studies of both technologies demonstrate their effectiveness in drug repurposing. The advantages and limitations of text and graph embedding technologies, and their complementarity with traditional structure-based approaches, are discussed. Finally, text and graph embedding methods can be used in conjunction with traditional computational approaches, offering a promising path to identifying novel drug repurposing opportunities, particularly for rare diseases.
Title: Emerging technologies for drug repurposing: Harnessing the potential of text and graph embedding approaches
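The similarity calculations that embeddings make possible can be illustrated with cosine similarity, a standard measure for comparing vectors. The sketch below ranks candidate diseases by how close their vectors lie to a drug vector. The three-dimensional vectors and the names `drug_vec`, `disease_A`, and so on are invented for illustration; real embeddings from Word2Vec, Node2Vec or similar methods typically have hundreds of dimensions and are learned from data rather than written by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; not taken from any trained model.
drug_vec = [0.9, 0.1, 0.3]
disease_vecs = {
    "disease_A": [0.8, 0.2, 0.4],
    "disease_B": [-0.5, 0.9, 0.0],
    "disease_C": [0.1, 0.1, 0.9],
}

ranking = rank_by_similarity(drug_vec, disease_vecs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A high score only says that two entities occur in similar contexts in the text or graph the embedding was trained on. It is a hypothesis generator for repurposing candidates, not evidence of efficacy.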
Pub Date: 2024-03-15; DOI: 10.1016/j.aichem.2024.100059
María Judit Montes de Oca-Estévez, Rita Prosmiti
Choosing a proper machine learning (ML) algorithm for constructing potential energy surface (PES) models has become crucial in quantum chemistry and computational modeling. These algorithms can make reliable and accurate predictions at a reasonable computational cost, so the resulting models can be used in various molecular dynamics and spectroscopic studies. It is therefore not surprising that much current research focuses on developing software that generates machine learning models from precalculated ab initio data points. This study is primarily dedicated to the application and assessment of various automated learning models. These models are trained and tested on datasets derived from CCSD(T)/CBS[56] calculations, aiming to represent intermolecular interactions in small molecules such as the NgH2+ complexes, where Ng represents helium (He), neon (Ne), and argon (Ar) atoms. These noble gas-containing molecules have gained increasing significance in molecular astrochemistry owing to the recent discovery of the HeH+ and ArH+ molecular cations in the interstellar medium (ISM), which opens up a wide range of possibilities in this scientific area. Consequently, the ML-generated PESs are employed to compute vibrational bound states for these molecular cations, with the goal of characterizing all their known isotopologues. Furthermore, the results are compared with spectroscopic data from previous studies in the literature, where available. Our findings have the potential to provide valuable guidance for future ML-PES development and benchmarking studies involving noble gas-containing cations of astrophysical importance.
Title: Automated learning data-driven potential models for spectroscopic characterization of astrophysical interest noble gas-containing NgH2+ molecules
Pub Date : 2024-03-06DOI: 10.1016/j.aichem.2024.100058
Ang Guo , Zhiyu Chen , Yinzhong Ma , Yueguang Lv , Huanhuan Yan , Fang Li , Yao Xing , Qian Luo , Hairong Zheng
We present SOmicsFusion, a software toolbox for ‘fusing’ spatial omics with classical biomedical imaging modalities, capitalizing on their inherent correspondences and complementarity when characterizing the same subject. By augmenting radiological and histological images with spatially resolved molecular profiling, this fusion offers a panoramic characterization of the biochemical perturbations underlying pathological conditions, thereby advancing our understanding of diseases like brain disorders and cancers. The cornerstone of SOmicsFusion is a coregistration tool that leverages an innovative two-stage machine learning pipeline to tackle the longstanding challenge of spatially aligning data from fundamentally different modalities, priming them for subsequent fusion analysis that often requires precise pixel-wise correspondence between the datasets. Specifically, the pipeline utilizes an original dimension reduction algorithm for representational domain alignment, followed by a Deep Learning-based method for spatial domain alignment. SOmicsFusion is demonstrated using mass spectrometry imaging (MSI)-mediated spatial metabolomics and four other modalities: magnetic resonance imaging (MRI), microscopy, brain atlas, and spatial transcriptomics. By reducing coregistration errors by 38–69% compared to existing pipelines, SOmicsFusion enhances the precision of associating molecule distribution with anatomy and pathology features, ultimately leading to more statistically robust findings. Furthermore, SOmicsFusion incorporates various downstream analysis tools, including overlay visualization, spatial correlation/co-expression analysis, pansharpening, and automated anatomy annotation. These tools facilitate the extraction of biological insights that would be unattainable through individual modalities alone. For instance, the coregistration and correlation between MSI and in vivo MRI datasets unveil that the spatial heterogeneity in metabolites stems from the temporal heterogeneity in the development of cerebral ischemia-reperfusion injury.
{"title":"SOmicsFusion: Multimodal coregistration and fusion between spatial metabolomics and biomedical imaging","authors":"Ang Guo , Zhiyu Chen , Yinzhong Ma , Yueguang Lv , Huanhuan Yan , Fang Li , Yao Xing , Qian Luo , Hairong Zheng","doi":"10.1016/j.aichem.2024.100058","DOIUrl":"https://doi.org/10.1016/j.aichem.2024.100058","url":null,"abstract":"<div><p>We present SOmicsFusion, a software toolbox for ’fusing’ spatial omics with classical biomedical imaging modalities, capitalizing on their inherent correspondences and complementarity when characterizing the same subject. By augmenting radiological and histological images with spatially resolved molecular profiling, this fusion offers a panoramic characterization of the biochemical perturbations underlying pathological conditions, thereby advancing our understanding of diseases like brain disorders and cancers. The cornerstone of SOmicsFusion is a coregistration tool that leverages an innovative two-stage machine learning pipeline to tackle the longstanding challenge of spatially aligning data from fundamentally different modalities, priming them for subsequent fusion analysis that often requires precise pixel-wise correspondence between the datasets. Specifically, the pipeline utilizes an original dimension reduction algorithm for representational domain alignment, followed by a Deep Learning-based method for spatial domain alignment. SOmicsFusion is demonstrated using mass spectrometry imaging (MSI)-mediated spatial metabolomics and four other modalities: magnetic resonance imaging (MRI), microscopy, brain atlas, and spatial transcriptomics. By reducing coregistration errors by 38–69% compared to existing pipelines, SOmicsFusion enhances the precision of associating molecule distribution with anatomy and pathology features, ultimately leading to more statistically robust findings. 
Furthermore, SOmicsFusion incorporates various downstream analysis tools, including overlay visualization, spatial correlation/co-expression analysis, pansharpening, and automated anatomy annotation. These tools facilitate the extraction of biological insights that would be unattainable through individual modalities alone. For instance, the coregistration and correlation between MSI and in vivo MRI datasets unveil that the spatial heterogeneity in metabolites stems from the temporal heterogeneity in the development of cerebral ischemia-reperfusion injury.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747724000162/pdfft?md5=963d89fc26dbcda572405e5ce54d24ee&pid=1-s2.0-S2949747724000162-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140134391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}