
Journal of Cheminformatics: latest articles

BuildAMol: a versatile Python toolkit for fragment-based molecular design
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-25 | DOI: 10.1186/s13321-024-00900-6
Noah Kleinschmidt, Thomas Lemmin

In recent years, computational methods for molecular modeling have become a prime focus of computational biology and cheminformatics. Many dedicated systems exist for modeling specific classes of molecules such as proteins or small drug-like ligands. These are often heavily tailored toward the automated generation of molecular structures based on some meta-input by the user and are not intended for expert-driven structure assembly. Dedicated manual or semi-automated assembly software tools exist for a variety of molecule classes but are limited in the scope of structures they can produce. In this work we present BuildAMol, a highly flexible and extendable, general-purpose fragment-based molecular assembly toolkit. Written in Python and featuring a well-documented, user-friendly API, BuildAMol empowers researchers with a framework for detailed manual or semi-automated construction of diverse molecular models. Unlike specialized software, BuildAMol caters to a broad range of applications. We demonstrate its versatility across various use cases, including generating metal complexes, modeling dendrimers, and integration into a drug discovery pipeline. By providing a robust foundation for expert-driven model building, BuildAMol holds promise as a valuable tool for the continuous integration and advancement of powerful deep learning techniques.

Scientific contribution

BuildAMol introduces a cutting-edge framework for molecular modeling that seamlessly blends versatility with user-friendly accessibility. This innovative toolkit integrates modeling, modification, optimization, and visualization functions within a unified API, and facilitates collaboration with other cheminformatics libraries. BuildAMol, with its shallow learning curve, serves as a versatile tool for various molecular applications while also laying the groundwork for the development of specialized software tools, contributing to the progress of molecular research and innovation.

Citations: 0
Deep learning of multimodal networks with topological regularization for drug repositioning
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-23 | DOI: 10.1186/s13321-024-00897-y
Yuto Ohnuki, Manato Akiyama, Yasubumi Sakakibara

Motivation

Computational techniques for drug-disease prediction are essential in enhancing drug discovery and repositioning. While many methods utilize multimodal networks from various biological databases, few integrate comprehensive multi-omics data, including transcriptomes, proteomes, and metabolomes. We introduce STRGNN, a novel graph deep learning approach that predicts drug-disease relationships using extensive multimodal networks comprising proteins, RNAs, metabolites, and compounds. We have constructed a detailed dataset incorporating multi-omics data and developed a learning algorithm with topological regularization. This algorithm selectively leverages informative modalities while filtering out redundancies.

Results

STRGNN demonstrates superior accuracy compared to existing methods and has identified several novel drug effects, corroborating existing literature. STRGNN emerges as a powerful tool for drug prediction and discovery. The source code for STRGNN, along with the dataset for performance evaluation, is available at https://github.com/yuto-ohnuki/STRGNN.git.

Citations: 0
Automatic molecular fragmentation by evolutionary optimisation
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-19 | DOI: 10.1186/s13321-024-00896-z
Fiona C. Y. Yu, Jorge L. Gálvez Vallejo, Giuseppe M. J. Barca

Molecular fragmentation is an effective suite of approaches to reduce the formal computational complexity of quantum chemistry calculations while enhancing their algorithmic parallelisability. However, the practical applicability of fragmentation techniques remains hindered by a dearth of automation and effective metrics to assess the quality of a fragmentation scheme. In this article, we present the Quick Fragmentation via Automated Genetic Search (QFRAGS), a novel automated fragmentation algorithm that uses a genetic optimisation procedure to generate molecular fragments that yield low energy errors when adopted in Many Body Expansions (MBEs). Benchmark testing of QFRAGS on protein systems with fewer than 500 atoms, using two-body (MBE2) and three-body (MBE3) MBE calculations at the HF/6-31G* level, reveals mean absolute energy errors (MAEE) of 20.6 and 2.2 kJ mol⁻¹, respectively. For larger protein systems exceeding 500 atoms, MAEEs are 181.5 kJ mol⁻¹ for MBE2 and 24.3 kJ mol⁻¹ for MBE3. Furthermore, when compared to three manual fragmentation schemes on a 40-protein dataset, using both MBE and Fragment Molecular Orbital techniques, QFRAGS achieves comparable or often lower MAEEs. When applied to a 10-lipoglycan/glycolipid dataset, MAEs of 7.9 and 0.3 kJ mol⁻¹ were observed at the MBE2 and MBE3 levels, respectively.

Scientific Contribution

This article presents the Quick Fragmentation via Automated Genetic Search (QFRAGS), an innovative molecular fragmentation algorithm that significantly improves upon existing molecular fragmentation approaches by specifically addressing their lack of automation and effective fragmentation quality metrics. With an evolutionary optimisation strategy, QFRAGS actively pursues high quality fragments, generating fragmentation schemes that exhibit minimal energy errors on systems with hundreds to thousands of atoms. The advent of QFRAGS represents a significant advancement in molecular fragmentation, greatly improving the accessibility and computational feasibility of accurate quantum chemistry calculations.
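
To make the MBE2/MBE3 terminology in this abstract concrete, the following is a minimal sketch of a many-body expansion truncated at a chosen order, together with the absolute energy error against the unfragmented reference. It is not the QFRAGS implementation: the fragment labels, the positions, and `toy_energy` are hypothetical stand-ins, and a real calculation would obtain each subsystem energy from a quantum chemistry program.

```python
from itertools import combinations


def mbe_energy(fragments, energy, order):
    """Many-body expansion of the total energy, truncated at `order`.

    fragments: tuple of hashable fragment labels
    energy:    callable mapping a tuple of labels to that subsystem's energy
    order:     2 gives MBE2, 3 gives MBE3
    """
    cache = {}

    def subsystem(subset):
        # Evaluate each monomer, dimer, trimer, ... only once.
        if subset not in cache:
            cache[subset] = energy(subset)
        return cache[subset]

    def correction(subset):
        # n-body correction: subsystem energy minus every lower-order
        # correction contained in the subset.
        total = subsystem(subset)
        for k in range(1, len(subset)):
            for sub in combinations(subset, k):
                total -= correction(sub)
        return total

    return sum(
        correction(subset)
        for n in range(1, order + 1)
        for subset in combinations(fragments, n)
    )


# Hypothetical toy model (assumption, for illustration only): fragments on a
# line with a one-body term, a pairwise term, and a small three-body term.
POSITIONS = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 4.0}


def toy_energy(subset):
    one_body = -1.0 * len(subset)
    two_body = sum(
        -0.1 / abs(POSITIONS[i] - POSITIONS[j])
        for i, j in combinations(subset, 2)
    )
    three_body = 0.01 * sum(1 for _ in combinations(subset, 3))
    return one_body + two_body + three_body


frags = ("A", "B", "C", "D")
full = toy_energy(frags)            # unfragmented reference energy
mbe2 = mbe_energy(frags, toy_energy, 2)
mbe3 = mbe_energy(frags, toy_energy, 3)
error_mbe2 = abs(mbe2 - full)       # MBE2 misses the three-body terms
error_mbe3 = abs(mbe3 - full)       # exact here: the toy model has no four-body term
```

Averaging such absolute errors over a set of systems gives a mean absolute energy error of the kind reported in the abstract. In this toy model the MBE3 error vanishes by construction; in real systems it is the quality of the fragmentation scheme that determines how quickly the expansion converges.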

Citations: 0
Democratizing cheminformatics: interpretable chemical grouping using an automated KNIME workflow
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-16 | DOI: 10.1186/s13321-024-00894-1
José T. Moreira-Filho, Dhruv Ranganath, Mike Conway, Charles Schmitt, Nicole Kleinstreuer, Kamel Mansouri

With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for the analysis, exploration, visualization, and extraction of information from these data. One such technique is chemical grouping, where chemicals with common characteristics are categorized into distinct groups based on physicochemical properties, use, biological activity, or a combination. However, existing tools for chemical grouping often require specialized programming skills or the use of commercial software packages. To address these challenges, we developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low/no-code, data analytics platform. The workflow serves as an all-encompassing tool, expertly incorporating a range of processes such as molecular descriptor calculation, feature selection, dimensionality reduction, hyperparameter search, and supervised and unsupervised machine learning methods, enabling effective chemical grouping and visualization of results. Furthermore, we implemented tools for interpretation, identifying key molecular descriptors for the chemical groups, and using natural language summaries to clarify the rationale behind these groupings. The workflow was designed to run seamlessly in both the KNIME local desktop version and KNIME Server WebPortal as a web application. It incorporates interactive interfaces and guides to assist users in a step-by-step manner. We demonstrate the utility of this workflow through a case study using an eye irritation and corrosion dataset.

Scientific contributions

This work presents a novel, comprehensive chemical grouping workflow in KNIME, enhancing accessibility by integrating a user-friendly graphical interface that eliminates the need for extensive programming skills. This workflow uniquely combines several features such as automated molecular descriptor calculation, feature selection, dimensionality reduction, and machine learning algorithms (both supervised and unsupervised), with hyperparameter optimization to refine chemical grouping accuracy. Moreover, we have introduced an innovative interpretative step and natural language summaries to elucidate the underlying reasons for chemical groupings, significantly advancing the usability of the tool and interpretability of the results.

Citations: 0
Metis: a python-based user interface to collect expert feedback for generative chemistry models
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-14 | DOI: 10.1186/s13321-024-00892-3
Janosch Menke, Yasmine Nahal, Esben Jannik Bjerrum, Mikhail Kabeshov, Samuel Kaski, Ola Engkvist

One challenge that current de novo drug design models face is a disparity between the user’s expectations and the actual output of the model in practical applications. Tailoring models to better align with chemists’ implicit knowledge, expectations, and preferences is key to overcoming this obstacle effectively. While interest in preference-based and human-in-the-loop machine learning in chemistry is continuously increasing, no tool currently exists that enables the collection of standardized and chemistry-specific feedback. Metis is a Python-based open-source graphical user interface (GUI), designed to solve this and enable the collection of chemists’ detailed feedback on molecular structures. The GUI enables chemists to explore and evaluate molecules, offering a user-friendly interface for annotating preferences and specifying desired or undesired structural features. By giving chemists the opportunity to provide detailed feedback, Metis allows researchers to capture the chemist’s implicit knowledge and preferences more efficiently. This knowledge is crucial to align the chemist’s idea with the de novo design agents. The GUI aims to enhance this collaboration between the human and the “machine” by providing an intuitive platform where chemists can interactively provide feedback on molecular structures, aiding in preference learning and refining de novo design strategies. Metis integrates with the existing de novo framework REINVENT, creating a closed-loop system where human expertise can continuously inform and refine the generative models.

Scientific contribution

We introduce a novel graphical user interface that allows chemists and researchers to give detailed feedback on substructures and properties of small molecules. This tool can be used to learn the preferences of chemists in order to align de novo drug design models with the chemist’s ideas. The GUI can be customized to fit different needs and projects and enables direct integration into de novo REINVENT runs. We believe that Metis can facilitate the discussion and development of novel ways to integrate human feedback that go beyond binary decisions of liking or disliking a molecule.

Citations: 0
Geometric deep learning for molecular property predictions with chemical accuracy across chemical space
IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY | Pub Date: 2024-08-13 | DOI: 10.1186/s13321-024-00895-0
Maarten R. Dobbelaere, István Lengyel, Christian V. Stevens, Kevin M. Van Geem

Chemical engineers heavily rely on precise knowledge of physicochemical properties to model chemical processes. Despite the growing popularity of deep learning, it is only rarely applied for property prediction due to data scarcity and limited accuracy for compounds in industrially-relevant areas of the chemical space. Herein, we present a geometric deep learning framework for predicting gas- and liquid-phase properties based on novel quantum chemical datasets comprising 124,000 molecules. Our findings reveal that the necessity for quantum-chemical information in deep learning models varies significantly depending on the modeled physicochemical property. Specifically, our top-performing geometric model meets the most stringent criteria for “chemically accurate” thermochemistry predictions. We also show that by carefully selecting the appropriate model featurization and evaluating prediction uncertainties, the reliability of the predictions can be strongly enhanced. These insights represent a crucial step towards establishing deep learning as the standard property prediction workflow in both industry and academia.

Scientific contribution

We propose a flexible property prediction tool that can handle two-dimensional and three-dimensional molecular information. A thermochemistry prediction methodology that achieves high-level quantum chemistry accuracy for a broad application range is presented. Trained deep learning models and large novel molecular databases of real-world molecules are provided to offer a directly usable and fast property prediction solution to practitioners.
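
The abstract notes that evaluating prediction uncertainties can make predictions more reliable. One common way to do this, shown here only as an illustrative sketch and not confirmed to be the authors' method, is to use the spread of an ensemble of models as an uncertainty estimate and to keep only predictions whose spread is below a threshold. The models, inputs, and the `max_std` threshold below are hypothetical.

```python
from statistics import mean, stdev


def ensemble_predict(models, x):
    """Return the ensemble mean and sample standard deviation for input x."""
    predictions = [model(x) for model in models]
    return mean(predictions), stdev(predictions)


def reliable_predictions(models, inputs, max_std):
    """Keep only inputs whose ensemble spread does not exceed max_std."""
    kept = []
    for x in inputs:
        mu, sigma = ensemble_predict(models, x)
        if sigma <= max_std:
            kept.append((x, mu, sigma))
    return kept


# Hypothetical ensemble (assumption): three linear models that agree near
# x = 0 and disagree more and more as x grows.
models = [lambda x, slope=slope: 10.0 + slope * x for slope in (0.9, 1.0, 1.1)]

kept = reliable_predictions(models, [0.0, 1.0, 10.0], max_std=0.5)
kept_inputs = [x for x, _, _ in kept]
```

Here the prediction at `x = 10.0` is rejected because the three models disagree by more than the threshold, while the predictions at `0.0` and `1.0` are kept. The threshold trades coverage against reliability and would normally be chosen on held-out data.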

Citations: 0
MolCompass: multi-tool for the navigation in chemical space and visual validation of QSAR/QSPR models
IF 7.1 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY Pub Date: 2024-08-12 DOI: 10.1186/s13321-024-00888-z
Sergey Sosnin

The exponential growth of data is challenging for humans because their ability to analyze data is limited. Especially in chemistry, there is a demand for tools that can visualize molecular datasets in a convenient graphical way. We propose a new, ready-to-use, multi-tool, and open-source framework for visualizing and navigating chemical space. This framework adheres to the low-code/no-code (LCNC) paradigm, providing a KNIME node, a web-based tool, and a Python package, making it accessible to a broad cheminformatics community. The core technique of the MolCompass framework employs a pre-trained parametric t-SNE model. We demonstrate how this framework can be adapted for the visualisation of chemical space and visual validation of binary classification QSAR/QSPR models, revealing their weaknesses and identifying model cliffs. All parts of the framework are publicly available on GitHub, providing accessibility to the broad scientific community.

Scientific contribution

We provide an open-source, ready-to-use set of tools for the visualization of chemical space. These tools can be insightful for chemists to analyze compound datasets and for the visual validation of QSAR/QSPR models.

Citations: 0
Building shape-focused pharmacophore models for effective docking screening
IF 7.1 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY Pub Date: 2024-08-09 DOI: 10.1186/s13321-024-00857-6
Paola Moyano-Gómez, Jukka V. Lehtonen, Olli T. Pentikäinen, Pekka A. Postila

The performance of molecular docking can be improved by comparing the shape similarity of the flexibly sampled poses against the target proteins’ inverted binding cavities. The effectiveness of these pseudo-ligands or negative image-based models in docking rescoring is boosted further by performing enrichment-driven optimization. Here, we introduce a novel shape-focused pharmacophore modeling algorithm O-LAP that generates a new class of cavity-filling models by clumping together overlapping atomic content via pairwise distance graph clustering. Top-ranked poses of flexibly docked active ligands were used as the modeling input and multiple alternative clustering settings were benchmark-tested thoroughly with five demanding drug targets using random training/test divisions. In docking rescoring, the O-LAP modeling typically improved massively on the default docking enrichment; furthermore, the results indicate that the clustered models work well in rigid docking. The C++/Qt5-based algorithm O-LAP is released under the GNU General Public License v3.0 via GitHub (https://github.com/jvlehtonen/overlap-toolkit).

Scientific contribution

This study introduces O-LAP, a C++/Qt5-based graph clustering software for generating a new type of shape-focused pharmacophore model. In O-LAP modeling, the target protein cavity is filled with flexibly docked active ligands, the overlapping ligand atoms are clustered, and the shape/electrostatic potential of the resulting model is compared against the flexibly sampled molecular docking poses. Comprehensive benchmark testing shows that O-LAP modeling ensures high enrichment in both docking rescoring and rigid docking.
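The idea of clumping overlapping atoms via pairwise distance graph clustering can be illustrated with a minimal Python sketch. This is not the O-LAP implementation (which is written in C++/Qt5): it assumes a simple single-linkage scheme in which atoms closer than a hypothetical `cutoff` are connected, each connected component forms one cluster, and each cluster is replaced by its centroid. The function name and the cutoff value are illustrative assumptions.

```python
from itertools import combinations
from math import dist


def cluster_overlapping_atoms(coords, cutoff=1.0):
    """Merge points that lie within `cutoff` of each other (hypothetical value).

    Builds a graph with an edge between every pair of points whose distance
    is <= cutoff, finds its connected components with union-find, and
    returns one centroid per component.
    """
    parent = list(range(len(coords)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Connect every pair of points closer than the cutoff.
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            parent[find(i)] = find(j)

    # Collect the members of each connected component.
    groups = {}
    for i, point in enumerate(coords):
        groups.setdefault(find(i), []).append(point)

    # Represent each cluster by the mean position of its members.
    return [
        tuple(sum(axis) / len(members) for axis in zip(*members))
        for members in groups.values()
    ]
```

With the three points `(0, 0, 0)`, `(0.5, 0, 0)` and `(5, 5, 5)` and the default cutoff, the first two merge into one cluster and the third stays on its own. A production tool would likely use a spatial index rather than the all-pairs loop shown here.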
Citations: 0
Evaluation of reinforcement learning in transformer-based molecular design
IF 7.1 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY Pub Date: 2024-08-08 DOI: 10.1186/s13321-024-00887-0
Jiazhen He, Alessandro Tibo, Jon Paul Janet, Eva Nittinger, Christian Tyrchan, Werngard Czechtizky, Ola Engkvist

Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In pre-clinical early drug discovery, novel compounds are often designed based on an already existing promising starting compound through structural modifications for further property optimization. Recently, transformer-based deep learning models have been explored for the task of molecular optimization by training on pairs of similar molecules. This provides a starting point for generating molecules similar to a given input molecule, but has limited flexibility regarding user-defined property profiles. Here, we evaluate the effect of reinforcement learning on transformer-based molecular generative models. The generative model can be considered as a pre-trained model with knowledge of the chemical space close to an input compound, while reinforcement learning can be viewed as a tuning phase, steering the model towards chemical space with user-specific desirable properties. The evaluation of two distinct tasks, molecular optimization and scaffold discovery, suggests that reinforcement learning could guide the transformer-based generative model towards the generation of more compounds of interest. Additionally, the impact of pre-trained models, learning steps and learning rates is investigated.

Scientific contribution

Our study investigates the effect of reinforcement learning on a transformer-based generative model initially trained for generating molecules similar to starting molecules. The reinforcement learning framework is applied to facilitate multiparameter optimisation of starting molecules. This approach allows more flexibility in optimizing user-specific property profiles and helps find more ideas of interest.

Citations: 0
An automated calculation pipeline for differential pair interaction energies with molecular force fields using the Tinker Molecular Modeling Package
IF 7.1 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY Pub Date: 2024-08-08 DOI: 10.1186/s13321-024-00890-5
Felix Bänsch, Mirco Daniel, Harald Lanig, Christoph Steinbeck, Achim Zielesny

An automated pipeline for comprehensive calculation of intermolecular interaction energies based on molecular force-fields using the Tinker molecular modelling package is presented. Starting with non-optimized chemically intuitive monomer structures, the pipeline allows the approximation of global minimum energy monomers and dimers, configuration sampling for various monomer–monomer distances, estimation of coordination numbers by molecular dynamics simulations, and the evaluation of differential pair interaction energies. The latter are used to derive Flory–Huggins parameters and isotropic particle–particle repulsions for Dissipative Particle Dynamics (DPD). The computational results for force fields MM3, MMFF94, OPLS-AA and AMOEBA09 are analyzed with Density Functional Theory (DFT) calculations and DPD simulations for a mixture of the non-ionic polyoxyethylene alkyl ether surfactant C10E4 with water to demonstrate the usefulness of the approach.

Scientific Contribution

To our knowledge, there is currently no open computational pipeline for differential pair interaction energies at all. This work aims to contribute an (at least academically available, open) approach based on molecular force fields that provides a robust and efficient computational scheme for their automated calculation for small to medium-sized (organic) molecular dimers. The usefulness of the proposed new calculation scheme is demonstrated for the generation of mesoscopic particles with their mutual repulsive interactions.
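As a rough illustration of how a differential pair interaction energy can feed into a Flory–Huggins parameter, the following Python sketch uses the textbook lattice expression χ = Z·ΔE/(RT) with ΔE = E_ij − (E_ii + E_jj)/2. It is an assumption that this simplified form, with a single coordination number Z, approximates what the pipeline computes; the paper's actual scheme may differ. The function names and the example energies are hypothetical.

```python
# Gas constant in kJ/(mol*K), so energies are expected in kJ/mol.
R = 8.314462618e-3


def differential_pair_energy(e_ij, e_ii, e_jj):
    """Energy of the mixed pair relative to the mean of the two like pairs."""
    return e_ij - 0.5 * (e_ii + e_jj)


def flory_huggins_chi(e_ij, e_ii, e_jj, z, temperature=298.15):
    """Textbook lattice estimate chi = Z * dE / (R * T).

    z is a single coordination number (an assumed simplification);
    temperature is in kelvin.
    """
    return z * differential_pair_energy(e_ij, e_ii, e_jj) / (R * temperature)


# Hypothetical pair energies in kJ/mol, for illustration only.
delta_e = differential_pair_energy(-10.0, -12.0, -12.0)
chi = flory_huggins_chi(-10.0, -12.0, -12.0, z=6)
```

A positive χ in this convention indicates that unlike pairs are less favourable than like pairs, which is the quantity that DPD parameterisations translate into an excess particle–particle repulsion.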

Citations: 0