Journal of Cheminformatics: latest articles, page 9

The first South Korean data challenge for drug discovery using human and mouse liver microsomal stability data

IF 5.7, Zone 2 (Chemistry), Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics

Pub Date : 2025-09-03 DOI: 10.1186/s13321-025-01047-8

Nam-Chul Cho, SeongEun Hong, Jin Sook Song, EuiJu Yeo, SoI Jung, Yuno Lee, Seul Gee Hwang, Su Min Kang, JaeSung Hwang, Tae-Eun Jin

The Korea Chemical Bank (KCB) has generated a dataset of metabolic stability data for approximately 4,000 compounds tested on human and mouse liver microsomes. The first South Korean data challenge, the Jump AI Challenge for Drug Discovery (JUMP AI 2023), was launched in 2023 using these KCB metabolic stability data. The objective of JUMP AI 2023 was to promote and encourage the development of new drugs using artificial intelligence (AI) technology in South Korea. A total of 1254 teams participated in the competition, developing algorithms to estimate the percentage of each compound remaining after 30 min of incubation with human and mouse liver microsomes. The dataset comprised training and test sets of 3498 and 483 compounds, respectively. This paper provides an overview of JUMP AI 2023 and its outcomes, highlighting the diverse range of algorithms and AI technologies employed by the competing teams. Among these, five teams stood out, winning awards with graph neural network (GNN)-based approaches. This competition was the first AI competition for drug discovery in South Korea, attracting numerous researchers and playing a key role in promoting drug research through the application of AI technologies.

Box embeddings for extending ontologies: a data-driven and interpretable approach

Pub Date : 2025-09-01 DOI: 10.1186/s13321-025-01086-1

Adel Memariani, Martin Glauer, Simon Flügel, Fabian Neuhaus, Janna Hastings, Till Mossakowski

Deriving symbolic knowledge from trained deep learning models is challenging due to the lack of transparency in such models. A promising approach to address this issue is to couple a semantic structure with the model outputs and thereby make the model interpretable. In prediction tasks such as multi-label classification, labels tend to form hierarchical relationships. Therefore, we propose enforcing a taxonomical structure on the model’s outputs throughout the training phase. In vector space, a taxonomy can be represented using axis-aligned hyper-rectangles, or boxes, which may overlap or nest within one another. The boundaries of a box determine the extent of a particular category. Thus, we used box-shaped embeddings of ontology classes to learn and transparently represent logical relationships that are only implicit in multi-label datasets. We assessed our model by measuring its ability to approximate the full set of inferred subclass relations in the ChEBI ontology, which is an important knowledge base in the field of life science. We demonstrate that our model captures implicit hierarchical relationships among labels, ensuring consistency with the underlying ontological conceptualization, while also achieving state-of-the-art performance in multi-label classification. Notably, this is accomplished without requiring an explicit taxonomy during the training process.

Our proposed approach advances chemical classification by enabling interpretable outputs through a structured and geometrically expressive representation of molecules and their classes.
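The geometric idea in the abstract above can be illustrated with a minimal sketch: each class is an axis-aligned box, and a subclass relation is read off as box containment. This is not the authors' model. The `Box` class, the `make_box` helper, the two-dimensional coordinates, and the three class names are all hypothetical and hand-picked for illustration; in the paper the boxes are learned during training.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(eq=False)
class Box:
    """Axis-aligned hyper-rectangle defined by its lower and upper corners."""
    lower: np.ndarray
    upper: np.ndarray

    def contains_point(self, point: np.ndarray) -> bool:
        # A molecule embedded as a point belongs to a class if it lies inside the box.
        return bool(np.all(self.lower <= point) and np.all(point <= self.upper))

    def contains_box(self, other: "Box") -> bool:
        # Nesting: every coordinate range of `other` lies within this box.
        return bool(np.all(self.lower <= other.lower) and np.all(other.upper <= self.upper))

    def overlaps(self, other: "Box") -> bool:
        # Boxes overlap if their intersection has positive extent on every axis.
        lo = np.maximum(self.lower, other.lower)
        hi = np.minimum(self.upper, other.upper)
        return bool(np.all(lo < hi))


def make_box(lower, upper) -> Box:
    """Hypothetical helper that converts plain lists to float arrays."""
    return Box(np.asarray(lower, dtype=float), np.asarray(upper, dtype=float))


# Hand-picked 2-D boxes for three illustrative classes (not learned, not from ChEBI).
boxes = {
    "organic acid": make_box([0, 0], [10, 10]),
    "carboxylic acid": make_box([2, 2], [6, 6]),
    "amine": make_box([8, 1], [12, 4]),
}

# Read subclass relations off the geometry: child box nested inside parent box.
subclass_pairs = {
    (child, parent)
    for child, c_box in boxes.items()
    for parent, p_box in boxes.items()
    if child != parent and p_box.contains_box(c_box)
}

# Multi-label classification of one molecule embedded as a point.
molecule = np.array([3.0, 3.0])
labels = sorted(name for name, box in boxes.items() if box.contains_point(molecule))
```

Because class membership and subclass relations are both read from the same boxes, a molecule assigned to a nested class is automatically assigned to every enclosing class, which is the kind of consistency the abstract describes.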

Evaluation of chirality descriptors derived from SMILES heteroencoders

Pub Date : 2025-08-31 DOI: 10.1186/s13321-025-01080-7

Natalia Baimacheva, Xinyue Gao, Joao Aires-de-Sousa

Molecular representations of chirality, derived from latent space vectors (LSVs) of SMILES heteroencoders, were explored to train machine learning models to predict chiral properties, and were compared to conventional circular fingerprints. Latent space arithmetic was applied to enhance the representation of chirality, by calculating the difference between the original descriptor of a molecule and the descriptor of its enantiomer, or the difference between the original descriptor and the descriptor obtained with the stereochemistry-depleted SMILES string. Machine learning was performed with the Random Forest algorithm applied to a dataset of 3858 molecules extracted from the literature (1929 pairs of enantiomers) to predict the elution order observed on the Chiralpak® AD-H column, as well as intrinsic structural chirality labels (R/S or canonical SMILES @/@@). The descriptors derived from the heteroencoders achieved an accuracy of up to 0.75 in predicting the elution order, whereas the circular fingerprints performed better (0.82). The difference LSV descriptors showed better predictive ability than the original descriptors.

Our work proposes latent space arithmetic to obtain descriptors of molecular chirality from SMILES heteroencoders. We used these molecular representations to build quantitative structure-enantioselectivity relationships for predicting the elution order of enantiomers in chiral chromatography, and compared the results with those obtained from circular fingerprints. The study shows that delta descriptors relative to the opposite enantiomer enhance the ability of latent space vectors to encode chirality.
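The latent space arithmetic described above can be sketched as follows, under loud assumptions. The `toy_encoder` function is a hypothetical stand-in that maps a SMILES string to a deterministic pseudo-random vector; it is not a trained heteroencoder and carries no chemical meaning. The two SMILES helpers only handle the plain `@`/`@@` tetrahedral tags and ignore double-bond stereo and extended chirality classes.

```python
import hashlib
import re

import numpy as np


def invert_stereocenters(smiles: str) -> str:
    """Swap every '@' and '@@' tag, giving the SMILES of the mirror image."""
    return re.sub(r"@@|@", lambda m: "@" if m.group() == "@@" else "@@", smiles)


def strip_stereochemistry(smiles: str) -> str:
    """Remove tetrahedral tags to get a stereochemistry-depleted SMILES."""
    return smiles.replace("@", "")


def toy_encoder(smiles: str, dim: int = 8) -> np.ndarray:
    """Hypothetical stand-in for a heteroencoder: a vector seeded by the string."""
    seed = int.from_bytes(hashlib.sha256(smiles.encode("utf-8")).digest()[:8], "big")
    return np.random.default_rng(seed).normal(size=dim)


def enantiomer_difference(smiles: str, encode=toy_encoder) -> np.ndarray:
    """LSV(molecule) minus LSV(enantiomer)."""
    return encode(smiles) - encode(invert_stereocenters(smiles))


def stereo_depleted_difference(smiles: str, encode=toy_encoder) -> np.ndarray:
    """LSV(molecule) minus LSV(stereochemistry-depleted SMILES)."""
    return encode(smiles) - encode(strip_stereochemistry(smiles))


# One enantiomer of alanine and its mirror image.
alanine = "N[C@@H](C)C(=O)O"
mirror = invert_stereocenters(alanine)

delta = enantiomer_difference(alanine)
delta_mirror = enantiomer_difference(mirror)
```

One property of the enantiomer-difference descriptor holds for any encoder: the descriptor of the mirror image is the exact negative of the original, so the two members of an enantiomer pair are always separated by sign. The original LSVs alone carry no such guarantee.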

AlphaPPIMI: a comprehensive deep learning framework for predicting PPI-modulator interactions

Pub Date : 2025-08-29 DOI: 10.1186/s13321-025-01077-2

Dayan Liu, Tao Song, Shuang Wang, Xue Li, Peifu Han, Jianmin Wang, Shudong Wang

Protein-protein interactions (PPIs) regulate essential biological processes through complex interfaces, and their dysfunction is associated with various diseases. Consequently, the identification of PPIs and their interface-targeting modulators has emerged as a critical therapeutic approach. However, discovering modulators that target PPIs and PPI interfaces remains challenging, as traditional structure-similarity-based methods fail to effectively characterize PPI targets, particularly those for which no active compounds are known. Here, we present AlphaPPIMI, a comprehensive deep learning framework that combines large-scale pretrained language models with domain adaptation for predicting PPI-modulator interactions, specifically targeting PPI interfaces. To enable robust model development and evaluation, we constructed comprehensive benchmark datasets of PPI-modulator interactions (PPIMI). Our framework integrates comprehensive molecular features from Uni-Mol2, protein representations derived from state-of-the-art language models (ESM2 and ProTrans), and PPI structural characteristics encoded by PFeature. Through a specialized cross-attention architecture and conditional domain adversarial networks (CDAN), AlphaPPIMI effectively learns potential associations between PPI targets and modulators while ensuring robust cross-domain generalization. Extensive evaluations indicate that AlphaPPIMI achieves consistently improved performance over existing methods in PPIMI prediction, offering a promising approach for prioritizing candidate PPI modulators, particularly those targeting protein–protein interfaces.

This work presents AlphaPPIMI, a novel deep learning framework for accurately predicting modulators targeting protein-protein interactions (PPIs) and their interfaces. Its core contributions include a specialized cross-attention module for the synergistic fusion of multimodal pretrained representations, and the novel application of a Conditional Domain Adversarial Network (CDAN) to significantly improve generalization across diverse protein families. AlphaPPIMI demonstrates superior performance on curated benchmarks, providing a powerful computational tool for the discovery of targeted PPI therapeutics.
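To make the cross-attention idea concrete, here is a generic single-head scaled dot-product cross-attention in NumPy, in which modulator features act as queries over protein features. It is a sketch of the general mechanism only, not AlphaPPIMI's actual module. The random arrays merely stand in for pretrained molecular and protein embeddings, and all dimensions and weight matrices are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: `queries` attend over `context`.

    queries: (n_q, d_q), context: (n_c, d_c)
    w_q: (d_q, d), w_k: (d_c, d), w_v: (d_c, d)
    Returns the fused features (n_q, d) and the attention weights (n_q, n_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot products
    weights = softmax(scores, axis=-1)       # each query row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
modulator_feats = rng.normal(size=(5, 16))   # stand-in: 5 modulator tokens, 16-dim
protein_feats = rng.normal(size=(12, 32))    # stand-in: 12 residues, 32-dim
d_model = 8                                  # hypothetical shared projection size

w_q = rng.normal(size=(16, d_model))
w_k = rng.normal(size=(32, d_model))
w_v = rng.normal(size=(32, d_model))

fused, attn = cross_attention(modulator_feats, protein_feats, w_q, w_k, w_v)
```

Each row of `attn` is a distribution over protein positions for one modulator token, which is what lets features from two modalities with different lengths and widths be fused into a common space.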

AI-powered prediction of critical properties and boiling points: a hybrid ensemble learning and QSPR approach

Pub Date : 2025-08-29 DOI: 10.1186/s13321-025-01062-9

Roda Bounaceur, Francisco Paes, Romain Privat, Jean-Noël Jaubert

In this paper, we propose a robust deep-learning model based on a Quantitative Structure-Property Relationship (QSPR) approach for estimating the critical temperature (TC), critical pressure (PC), acentric factor (ACEN) and normal boiling point (NBP) of any molecule containing only the elements C, H, O, N, S, P, F, Cl, Br and I. The Mordred calculator was used to determine 247 descriptors to characterize the molecules considered in this work. For each evaluated property, multiple neural networks were trained within a bagging framework. The predictions from the final ensemble were successfully tested against a large set of experimental data comprising more than 1700 molecules and compared with those from recent learning models found in the literature. Comprehensive comparisons and extensive testing highlight the robustness and predictive power of the newly proposed multimodal learning model. The prediction tool is available online at https://lrgp-thermoppt.streamlit.app/. Furthermore, source code for running the trained models in Python is available on GitHub at https://github.com/bounac80/AI-ThermPpt.
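The bagging step mentioned in the abstract can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the authors' code: least-squares linear models stand in for the neural networks, the three-column descriptor matrix and target values are synthetic, and the function names are hypothetical.

```python
import numpy as np


def fit_linear(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares fit with an intercept (stand-in for one neural network)."""
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef: np.ndarray, X: np.ndarray) -> np.ndarray:
    return np.column_stack([X, np.ones(len(X))]) @ coef


def fit_bagged(X, y, n_models: int = 25, seed: int = 0):
    """Train each base model on a bootstrap resample of the training set."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        models.append(fit_linear(X[idx], y[idx]))
    return models


def predict_bagged(models, X):
    """Average the base predictions; their spread is a rough uncertainty cue."""
    preds = np.stack([predict_linear(m, X) for m in models])
    return preds.mean(axis=0), preds.std(axis=0)


# Synthetic data: 200 "molecules", 3 descriptors, a made-up property with noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 300.0 + rng.normal(scale=1.0, size=200)

models = fit_bagged(X, y)
mean_pred, spread = predict_bagged(models, X)
rmse = float(np.sqrt(np.mean((mean_pred - y) ** 2)))
```

Averaging over bootstrap-trained models reduces the variance of the individual learners, which matters more for flexible models such as neural networks than for the linear stand-ins used here.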

Biosynfoni: a biosynthesis-informed and interpretable lightweight molecular fingerprint

Pub Date : 2025-08-29 DOI: 10.1186/s13321-025-01081-6

Lucina-May Nollen, David Meijer, Maria Sorokina, Justin J. J. van der Hooft

Natural products provide a rich source of bioactive molecules for a variety of applications. Molecular fingerprints are the tool of choice for systematic large-scale studies of their structures. However, current molecular fingerprints do not adequately represent the characteristic features of natural products, which reduces the interpretability of natural product-specific predictions. Here, we show that a natural product-specific molecular fingerprint based on a relatively small set of selected biosynthetic building blocks provides more interpretable predictions of biosynthetic distance and natural product classification. Using only 39 substructure keys, our fingerprint Biosynfoni outperforms MACCS, Morgan, and Daylight-like fingerprints in biosynthetic distance estimation. Moreover, Biosynfoni’s design, compactness, and concrete substructure definition allow easy visualisation of the detected substructures and their respective biosynthetic pathway origins. Through Biosynfoni, users can gain more insights from predictions and better examine the importance of features within machine learning models. Our results show that a short fingerprint consisting of biologically significant building blocks performs on par with top-performing molecular fingerprints for natural product classification while improving prediction explainability.

By reflecting the biosynthetic information of natural products concisely and clearly, Biosynfoni helps build more interpretable and lightweight models for classification and retrobiosynthesis.

FusionCLM: enhanced molecular property prediction via knowledge fusion of chemical language models

Pub Date : 2025-08-29 DOI: 10.1186/s13321-025-01073-6

Yutong Lu, Yan Yi Li, Yan Sun, Pingzhao Hu

Chemical Language Models (CLMs) have demonstrated the ability to extract patterns and make predictions from vast volumes of Simplified Molecular Input Line Entry System (SMILES) strings, a notation used to represent molecular structures. Different CLMs, developed from various architectures, can provide unique insights into molecular properties. To harness the uniqueness of different CLMs, we propose FusionCLM, a novel stacking-ensemble learning algorithm that integrates the outputs of multiple CLMs into a unified framework. FusionCLM first generates SMILES embeddings, predictions, and losses from each CLM. Auxiliary models are trained on these first-level predictions and embeddings to estimate test losses during inference. The losses and predictions are then concatenated to create an integrated feature matrix, which trains second-level meta-models for final predictions. Empirical testing on five datasets demonstrates that FusionCLM outperforms the individual first-level CLMs as well as three advanced multimodal deep learning frameworks, showcasing FusionCLM’s potential in advancing molecular property prediction.

FusionCLM uses a stacking-ensemble learning approach that integrates the distinct representations learned by multiple CLMs, allowing more comprehensive learning from molecular SMILES data. This can yield more accurate molecular property predictions and help advance the early discovery and development of promising drug candidates. Evaluated against individual CLMs and existing multimodal deep learning frameworks, FusionCLM shows improved predictive accuracy, which distinguishes it from previous models in the field.
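The data flow of the stacking scheme described above (first-level predictions and losses, auxiliary loss estimators, second-level meta-model) can be traced with a small NumPy sketch. Everything here is an assumption made for illustration: linear least-squares models stand in for the CLMs, the auxiliary models, and the meta-model; two slices of a synthetic feature matrix stand in for SMILES embeddings; and the loss is taken to be the squared error, which the abstract does not specify.

```python
import numpy as np


def fit_ls(features: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Least-squares fit with an intercept (stand-in for any trainable model)."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights


def apply_ls(weights: np.ndarray, features: np.ndarray) -> np.ndarray:
    return np.column_stack([features, np.ones(len(features))]) @ weights


rng = np.random.default_rng(0)
n_train, n_test = 300, 100
X = rng.normal(size=(n_train + n_test, 6))
y = 2.0 * X[:, 0] + X[:, 3] * X[:, 4] + rng.normal(scale=0.1, size=len(X))
train, test = slice(0, n_train), slice(n_train, None)

# Two hypothetical "CLMs", each with its own embedding (a different feature view).
embeddings = [X[:, :3], X[:, 3:]]

meta_train_cols, meta_test_cols = [], []
for emb in embeddings:
    # First level: predictions and per-sample losses on the training set.
    base = fit_ls(emb[train], y[train])
    pred_train = apply_ls(base, emb[train])
    pred_test = apply_ls(base, emb[test])
    loss_train = (pred_train - y[train]) ** 2

    # Auxiliary model: estimate the loss from embedding + prediction,
    # because true losses are unavailable at inference time.
    aux = fit_ls(np.column_stack([emb[train], pred_train]), loss_train)
    loss_test_est = apply_ls(aux, np.column_stack([emb[test], pred_test]))

    meta_train_cols += [pred_train, loss_train]
    meta_test_cols += [pred_test, loss_test_est]

# Second level: meta-model on the concatenated predictions and losses.
meta_train = np.column_stack(meta_train_cols)
meta_test = np.column_stack(meta_test_cols)
meta_model = fit_ls(meta_train, y[train])
final_pred = apply_ls(meta_model, meta_test)
```

The key design point the sketch preserves is that the meta-model sees true losses during training but only estimated losses at test time, which is why the auxiliary models are needed. The linear stand-ins are too simple to show any accuracy benefit, so no performance claim should be read into this example.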

{"title":"FusionCLM: enhanced molecular property prediction via knowledge fusion of chemical language models","authors":"Yutong Lu, Yan Yi Li, Yan Sun, Pingzhao Hu","doi":"10.1186/s13321-025-01073-6","DOIUrl":"10.1186/s13321-025-01073-6","url":null,"abstract":"<div><p>Chemical Language Models (CLMs) have demonstrated capabilities in extracting patterns and predicting from vast volume of the Simplified Molecular Input Line Entry System (SMILES), a notation used to represent molecular structures. Different CLMs, developed from various architectures, can provide unique insights into molecular properties. To harness the uniqueness of different CLMs, we propose FusionCLM, a novel stacking-ensemble learning algorithm that integrate the outputs of multiple CLMs into a unified framework. FusionCLM first generates SMILES embeddings, predictions, and losses from each CLM. Auxiliary models are trained on these first-level predictions and embeddings to estimate test losses during inference. The losses and predictions are then concatenated to create an integrated feature matrix, which trains second-level meta-models for final predictions. 
Empirical testing on five datasets demonstrates that FusionCLM have better performance than individual CLM at the first level and three advanced multimodal deep learning frameworks, showcasing FusionCLM’s potential in advancing molecular property prediction.</p></div>","PeriodicalId":617,"journal":{"name":"Journal of Cheminformatics","volume":"17 1","pages":""},"PeriodicalIF":5.7,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01073-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144916147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Mixture of experts for multitask learning in cardiotoxicity assessment

IF 5.7 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics

Pub Date : 2025-08-29 DOI: 10.1186/s13321-025-01072-7

Edoardo Luca Viganò, Mateusz Iwan, Erika Colombo, Davide Ballabio, Alessandra Roncaglioni

In recent years, the integration of Artificial Intelligence and Machine Learning methods with biochemical and biomedical research has revolutionized the field of toxicology, significantly advancing our understanding of the toxicological effects of chemicals on biological systems. Cardiovascular diseases remain the leading global cause of death. Constant exposure to multiple chemicals with potential cardiotoxic effects, including environmental contaminants, pesticides, food additives, and drugs, can contribute significantly to these adverse health outcomes. Traditional methods for assessing chemical hazards and their impact on biological function rely heavily on experimental assays and animal studies, which are often time-consuming, resource-intensive, and limited in scalability. To overcome these limitations, in silico methods have emerged as indispensable tools in toxicological research, reducing the need for traditional in vivo testing and saving time and cost. In this study, Artificial Intelligence methods are used as first-tier components within an Integrated Approach to Testing and Assessment. We explored the potential benefits of Multitask Neural Networks, in which multiple levels of cardiotoxicity information are combined to enhance model performance. Multitask learning based on specific architectures such as Mixture of Experts (MoE) showed promising results and surpassed the performance of single-task baseline models. On a holdout set, the multitask model achieved high performance on twelve different cardiotoxicity-related endpoints defined by an Adverse Outcome Pathways Network. The best model achieved a balanced accuracy of 78%, a sensitivity of 80%, and a specificity of 76% across all endpoints in the holdout set.

An advanced multitask model was developed to predict cardiotoxicity mechanisms induced by small molecules. The model demonstrates broad mechanistic coverage and achieves performance comparable to, or exceeding, state-of-the-art methods. These results suggest that the model could serve as a valuable first-tier component in advanced New Approach Methodologies for prioritizing chemicals for further testing.
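The three holdout metrics quoted in this abstract are linked by a standard identity: balanced accuracy is the mean of sensitivity and specificity, so 80% and 76% give 78%. The following is a minimal sketch of how these quantities are computed for a binary endpoint. The function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, balanced_accuracy) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # fraction of true positives that are recovered
    specificity = tn / (tn + fp)  # fraction of true negatives that are recovered
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Hypothetical toy labels (5 toxic, 5 non-toxic); not data from the study.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec, bacc = classification_metrics(y_true, y_pred)

# Consistency check on the figures reported in the abstract.
reported_bacc = (0.80 + 0.76) / 2
```

Balanced accuracy is commonly preferred over plain accuracy for toxicity endpoints because the active and inactive classes are often imbalanced.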



Citations: 0

AdapTor: Adaptive Topological Regression for quantitative structure–activity relationship modeling

IF 5.7 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics

Pub Date : 2025-08-28 DOI: 10.1186/s13321-025-01071-8

Yixiang Mao, Souparno Ghosh, Ranadip Pal

Quantitative structure–activity relationship (QSAR) modeling has become a critical tool in drug design. The recently proposed Topological Regression (TR), a computationally efficient and highly interpretable QSAR model that maps distances in the chemical domain to distances in the activity domain, has shown predictive performance comparable to state-of-the-art deep learning-based models. However, TR’s reliance on anchor selection by simple random sampling and on a radial basis function for response reconstruction constrains its interpretability and predictive capacity. To address these limitations, we propose Adaptive Topological Regression (AdapToR) with adaptive anchor selection and optimization-based reconstruction. We evaluated AdapToR on the NCI60 GI50 dataset, which comprises over 50,000 drug responses across 60 human cancer cell lines, and compared its performance with that of Transformer CNN, Graph Transformer, TR, and other baseline models. The results demonstrate that AdapToR outperforms competing QSAR models for drug response prediction, with significantly lower computational cost and greater interpretability than deep learning-based models.
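To make the baseline idea concrete, here is a rough sketch of a TR-style pipeline as the abstract describes it: anchors chosen by simple random sampling, a model that maps chemical-space distances to activity-space distances, and a radial basis function that reconstructs the response from the anchors. It is not the authors' implementation and it does not cover AdapToR's adaptive anchor selection or optimization-based reconstruction. The Euclidean distance, the linear least-squares fit, the `sigma` bandwidth, and all function names are assumptions made for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def fit_tr(x_train, y_train, n_anchors, rng):
    """Fit a linear map from distances-to-anchors in feature space
    to distances-to-anchors in response space."""
    # Simple random sampling of anchors (the limitation named in the abstract).
    anchor_idx = rng.choice(len(x_train), size=n_anchors, replace=False)
    x_anchor, y_anchor = x_train[anchor_idx], y_train[anchor_idx]
    d_chem = pairwise_dist(x_train, x_anchor)                    # (n, k)
    d_act = np.abs(y_train[:, None] - y_anchor[None, :])         # (n, k)
    design = np.hstack([d_chem, np.ones((len(x_train), 1))])     # intercept column
    coef, *_ = np.linalg.lstsq(design, d_act, rcond=None)        # (k + 1, k)
    return x_anchor, y_anchor, coef


def predict_tr(x_new, x_anchor, y_anchor, coef, sigma=1.0):
    """Predict responses as an RBF-weighted average of anchor responses."""
    d_chem = pairwise_dist(x_new, x_anchor)
    design = np.hstack([d_chem, np.ones((len(x_new), 1))])
    d_act_hat = np.clip(design @ coef, 0.0, None)  # predicted distances, kept non-negative
    weights = np.exp(-(d_act_hat / sigma) ** 2)    # anchors predicted to be close weigh more
    return (weights * y_anchor).sum(axis=1) / weights.sum(axis=1)


# Synthetic toy data (hypothetical descriptors and responses, not NCI60 GI50).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 8))
y = x @ rng.normal(size=8) + 0.1 * rng.normal(size=200)

x_anchor, y_anchor, coef = fit_tr(x[:150], y[:150], n_anchors=20, rng=rng)
y_hat = predict_tr(x[150:], x_anchor, y_anchor, coef, sigma=y_anchor.std())
```

Because the reconstruction is a weighted average with positive weights, every prediction lies within the range of the anchor responses. This suggests one reason the choice of anchors could matter for predictive capacity.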



Citations: 0

Retrosynthetic crosstalk between single-step reaction and multi-step planning

IF 5.7 Zone 2 (Chemistry) Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics

Pub Date : 2025-08-28 DOI: 10.1186/s13321-025-01088-z

Junseok Choe, Hajung Kim, Yan Ting Chok, Mogan Gim, Jaewoo Kang

Retrosynthesis, the process of deconstructing complex molecules into simpler, more accessible precursors, is a cornerstone of drug discovery and material design. While machine learning has improved single-step retrosynthesis prediction, generating complete multi-step retrosynthetic routes remains challenging. In this study, we explore the integration of single-step retrosynthesis models with various planning algorithms to improve multi-step retrosynthetic route generation. We expand the exploration space beyond previously limited settings by combining multiple planning algorithms, single-step retrosynthesis models, and diverse datasets, enabling a more comprehensive assessment of retrosynthetic strategies. We evaluated synthetic routes on two criteria: solvability (the ability to generate a complete route) and route feasibility (the practical executability of a route in the laboratory). Our findings show that the model combination with the highest solvability does not always produce the most feasible routes, underscoring the need for more nuanced evaluation. Through a systematic analysis of combinations of planning algorithms and single-step retrosynthesis models, their performance across different datasets, and various practical metrics, our study provides a more comprehensive evaluation of retrosynthetic planning strategies. These insights contribute to a better understanding of computational retrosynthesis and its alignment with real-world applicability.
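The distinction between solvability and feasibility can be illustrated with a small bookkeeping sketch. The abstract does not give the paper's feasibility metric, so the per-route score in [0, 1], the combination names, and all values below are hypothetical placeholders. The example only shows that ranking by one criterion need not agree with ranking by the other.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RouteResult:
    target: str
    # None means the planner found no complete route for this target;
    # otherwise a hypothetical feasibility score in [0, 1].
    feasibility: Optional[float]


def solvability(results: List[RouteResult]) -> float:
    """Fraction of targets for which a complete route was generated."""
    return sum(r.feasibility is not None for r in results) / len(results)


def mean_feasibility(results: List[RouteResult]) -> float:
    """Average feasibility over solved targets only (0.0 if none were solved)."""
    scores = [r.feasibility for r in results if r.feasibility is not None]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical outcomes for two planner/model combinations on four targets.
runs: Dict[str, List[RouteResult]] = {
    "combo_A": [RouteResult("t1", 0.2), RouteResult("t2", 0.3),
                RouteResult("t3", 0.4), RouteResult("t4", 0.3)],
    "combo_B": [RouteResult("t1", 0.8), RouteResult("t2", 0.7),
                RouteResult("t3", 0.9), RouteResult("t4", None)],
}

best_by_solvability = max(runs, key=lambda name: solvability(runs[name]))
best_by_feasibility = max(runs, key=lambda name: mean_feasibility(runs[name]))
```

In this toy setup `combo_A` solves every target but yields low-scoring routes, while `combo_B` misses one target yet produces more feasible routes. That is the kind of disagreement the abstract reports.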

We provide extended results for retrosynthesis tasks. We also introduce a notion of feasibility that captures how valid and practical a retrosynthetic route is in the real world.


Citations: 0