
Journal of Chemical Information and Modeling: Latest Articles

CACHE Challenge #3: Targeting the Nsp3 Macrodomain of SARS-CoV-2
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-21 | DOI: 10.1021/acs.jcim.5c02441
Oleksandra Herasymenko,Madhushika Silva,Galen J. Correy,Abd Al-Aziz A. Abu-Saleh,Suzanne Ackloo,Cheryl Arrowsmith,Alan Ashworth,Fuqiang Ban,Hartmut Beck,Kevin P. Bishop,Hugo J. Bohórquez,Albina Bolotokova,Marko Breznik,Irene Chau,Yu Chen,Artem Cherkasov,Wim Dehaen,Dennis Della Corte,Katrin Denzinger,Niklas P. Doering,Kristina Edfeldt,Aled Edwards,Darren Fayne,Francesco Gentile,Elisa Gibson,Ozan Gokdemir,Anders Gunnarsson,Judith Günther,John J. Irwin,Jan Halborg Jensen,Rachel J. Harding,Alexander Hillisch,Laurent Hoffer,Anders Hogner,Ashley Hutchinson,Shubhangi Kandwal,Andrea Karlova,Kushal Koirala,Sergei Kotelnikov,Dima Kozakov,Juyong Lee,Soowon Lee,Uta Lessel,Sijie Liu,Xuefeng Liu,Peter Loppnau,Jens Meiler,Rocco Moretti,Yurii S. Moroz,Charuvaka Muvva,Tudor I. Oprea,Brooks Paige,Amit Pandit,Keunwan Park,Gennady Poda,Mykola V. Protopopov,Vera Pütter,Rahul Ravichandran,Didier Rognan,Edina Rosta,Yogesh Sabnis,Thomas Scott,Almagul Seitova,Purshotam Sharma,François Sindt,Minghu Song,Casper Steinmann,Rick Stevens,Valerij Talagayev,Valentyna V. Tararina,Olga Tarkhanova,Damon Tingey,John F. Trant,Dakota Treleaven,Alexander Tropsha,Patrick Walters,Jude Wells,Yvonne Westermaier,Gerhard Wolber,Lars Wortmann,Shuangjia Zheng,James S. Fraser,Matthieu Schapira
The third Critical Assessment of Computational Hit-finding Experiments (CACHE) challenged computational teams to identify chemically novel ligands targeting the macrodomain 1 of SARS-CoV-2 Nsp3, a promising coronavirus drug target. Twenty-three groups deployed diverse design strategies to collectively select 1739 ligand candidates. While over 85% of the designed molecules were chemically novel, the best experimentally confirmed hits were structurally similar to previously published compounds. Confirming a trend observed in CACHE #1 and #2, two of the best-performing workflows used compounds selected by physics-based computational screening methods to train machine learning models able to rapidly screen large chemical libraries, while four others used exclusively physics-based approaches. Three pharmacophore searches and one fragment growing strategy were also part of the seven winning workflows. While active molecules discovered by CACHE #3 participants largely mimicked the adenine ring of the endogenous substrate, ADP-ribose, preserving the canonical chemotype commonly observed in previously reported Nsp3-Mac1 ligands, they still provide novel structure–activity relationship insights that may inform the development of future antivirals. Collectively, these results show that multiple molecular design strategies can efficiently converge on similar potent molecules.
Citations: 0
Ab-SELDON: Leveraging Diversity Data for an Efficient Automated Computational Pipeline for Antibody Design.
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-20 | DOI: 10.1021/acs.jcim.5c01924
Jean V Sampaio,Andrielly H S Costa,Aline O Albuquerque,Júlia S Souza,Diego S Almeida,Eduardo M Gaieta,Matheus V Almeida,Geraldo R Sartori,João H M Silva
The utilization of predictive tools has become increasingly prevalent in the development of biopharmaceuticals, reducing the time and cost of research. However, most methods for computational antibody design are hampered by their reliance on scarcely available antibody structures, potential for immunogenic modifications, and a restricted exploration of the paratope's potential chemical and conformational space. We propose Ab-SELDON, a modular and easily customizable antibody design pipeline capable of iteratively optimizing an antibody-antigen (Ab-Ag) interaction in five different modification steps, including CDR and framework grafting, and mutagenesis. The optimization process is guided by diversity data collected from millions of publicly available human antibody sequences. This approach enhanced the exploration of the chemical and conformational space of the paratope during computational tests involving the optimization of an anti-HER2 antibody. Optimization of another antibody against Gal-3BP stabilized the Ab-Ag interaction in molecular dynamics simulations at lower runtime than alternative pipelines. Tests with SKEMPI's Ab-Ag mutations also demonstrated the pipeline's ability to correctly identify the effect of the majority of mutations, especially multipoint and those that increased binding affinity. This freely available pipeline presents a new approach for computationally efficient and automated in silico antibody design, thereby facilitating the development of new biopharmaceuticals.
Citations: 0
Geometry-Enhanced Multiscale Joint Representation Learning for Drug-Target Interaction Prediction.
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-20 | DOI: 10.1021/acs.jcim.5c02347
Qiao Ning,Shaohang Qiao,Yawen Cai,Yanpeng Liu,Hui Li,Qian Ma,Shikai Guo
Drug-target interactions (DTIs) are the basis of the therapeutic effect of drugs, and their accurate prediction helps reduce the cost and time of experimental screening in the drug development process. Present methods for DTI prediction often focus on molecular topological structure, which underweights spatial information such as the relative positions of atoms and bond angles, and they fail to effectively integrate molecular information with association network information. To address this issue, we propose a novel Geometry-enhanced Multiscale Joint Representation Learning method for drug-target interaction prediction (GMJRL). GMJRL not only considers the global information in the drug-target network at the macroscale, but also extracts geometric structure information on the drug and the target at the microscale, including bond angle information for the drug and atomic coordinate information for the target. To effectively fuse representations at different scales, we develop a joint representation learning method with self-attention, which captures correlations within the same scale and accounts for interscale relationships, thus achieving effective fusion of the macroscale and microscale representations. Finally, this study introduces a negative sampling algorithm to select reliable negative samples from unlabeled drug-target pairs. Extensive experiments validate that GMJRL yields promising outcomes in predicting drug-target interactions.
Citations: 0
Computational Exploration of the Molecular Mechanism of Epigallocatechin Gallate against TDP-43 Aggregation.
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-20 | DOI: 10.1021/acs.jcim.5c02616
Wenjuan Yi,Zhengdong Xu,Dushuo Feng,Lulu Guan,Jiaxing Tang,Yu Zou
Cytoplasmic accumulation of aggregates of the transactive response deoxyribonucleic acid (DNA)-binding protein of 43 kDa (TDP-43) represents the primary pathological hallmark of TDP-43 proteinopathies, including amyotrophic lateral sclerosis (ALS) and chronic traumatic encephalopathy (CTE). Inhibiting TDP-43 aggregation or disrupting its preformed fibrils might be promising strategies to prevent or delay the development of TDP-43 proteinopathies. Recently, the green tea polyphenol epigallocatechin gallate (EGCG) was observed to prevent the formation of TDP-43 oligomeric species and fibrillar aggregates. Nevertheless, the atomic-level mechanism of this inhibition has been incompletely characterized. In this study, we performed multiple replica exchange with solute tempering 2 (REST2) and all-atom molecular dynamics (MD) simulations, totaling 46.8 μs, on TDP-43 models with and without EGCG. The REST2 simulation results revealed that EGCG impedes β-sheet structure formation and interferes with the interchain interaction of the TDP-43(304-348) dimer. Subsequent analyses show that EGCG could alter the free energy landscape and hinder the residue-residue interactions of the dimer. The binding analyses confirmed that EGCG preferentially bound to residues M307, F313, F316, W334, M339, Q344, and Q346, and that hydrophobic, polar, and π-π stacking interactions dominate the binding of EGCG to the dimer. Additional conventional MD simulations demonstrated that the protofibrillar tetramer is the minimal stable TDP-43(304-348) protofibril. Taking the tetramer as a protofibril model, we found that EGCG could reduce the structural stability and disrupt the β-sheet structure of the TDP-43(304-348) protofibril, thus destabilizing its higher-order structure. This investigation unveils the atomic-level mechanism by which EGCG acts against TDP-43 aggregation, which may provide fundamental knowledge for therapeutic strategies against TDP-43 proteinopathies.
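The abstract names REST2 but gives no replica settings, so the following is only a minimal sketch of how REST2 energy scaling is usually set up. It assumes the commonly cited formulation in which replica m scales solute-solute energies by λ_m = T_0/T_m and solute-solvent energies by √λ_m, while solvent-solvent energies are left unscaled. The temperature range (300 to 450 K), the replica count (8), and all function names are hypothetical and are not taken from the study.

```python
import math

def rest2_ladder(t_min, t_max, n_replicas):
    """Geometric ladder of effective solute temperatures (K) from t_min to t_max."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** m for m in range(n_replicas)]

def rest2_scaling(temps):
    """Per-replica factors (lambda, sqrt(lambda)) with lambda = T_0 / T_m."""
    t0 = temps[0]
    return [(t0 / t, math.sqrt(t0 / t)) for t in temps]

def rest2_energy(e_pp, e_pw, e_ww, lam):
    """Scaled potential: lam * solute-solute + sqrt(lam) * solute-solvent + solvent-solvent."""
    return lam * e_pp + math.sqrt(lam) * e_pw + e_ww

# Hypothetical settings, chosen only for illustration.
temps = rest2_ladder(300.0, 450.0, 8)
scales = rest2_scaling(temps)
for t, (lam, sqrt_lam) in zip(temps, scales):
    print(f"T_eff = {t:6.1f} K  lambda = {lam:.3f}  sqrt(lambda) = {sqrt_lam:.3f}")
```

Only the solute terms are rescaled, so the solvent stays at the base temperature in every replica. This is the usual argument for why REST2 needs fewer replicas than temperature replica exchange for a solvated system of the same size.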
Citations: 0
Recognition of Coexisting Phases in Model Membranes via an Unsupervised Method.
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-20 | DOI: 10.1021/acs.jcim.5c02665
Yuzhuo Dai,Jianwei Zhao,Beibei Wang,Qing Liang,Ruo-Xu Gu
Phase separation in bilayers composed of a few lipid species is widely used as a model for exploring the lateral heterogeneity of complex cell membranes. Molecular dynamics (MD) simulations offer atomistic insights into coexisting lipid phases, but identifying these phases from trajectories remains challenging. Here, we present an unsupervised method for lipid phase recognition in phase-separated bilayers. In this method, the membrane plane is first discretized into pixels. For each pixel, the local lipid packing degree, defined as the atomic density within that pixel, is calculated and assigned to the pixel. A threshold is then determined by fitting a two-component Gaussian mixture model (GMM) to the distribution of lipid packing degree, enabling phase state assignment to pixels and subsequent mapping back to lipids. Our method is applicable to different systems, regardless of their compositions or temperatures, thus minimizing potential artifacts. Tests on bilayers with diverse lipid compositions and temperatures show that our method outperforms the commonly used hidden Markov model (HMM) in both accuracy and robustness. Notably, in this method, phase recognition relies solely on bilayer-intrinsic properties (lipid packing degree), without requiring temporal information, labeled data, or assumptions about the local lipid environment. This makes our method broadly applicable to various tasks, including characterizing the phase transformation process before the system reaches equilibrium and identifying coexisting phases in protein-containing bilayers. In summary, we provide a robust and accurate framework for identifying coexisting phases in bilayers and tracking their dynamic transitions in simulations.
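The pixel, density, and GMM steps described in this abstract can be sketched roughly as follows. This is an illustration on synthetic 2D points, not the authors' implementation: the box size, grid resolution, point counts, the plain expectation-maximization fit, and the choice of the equal-density crossing point as the threshold are all assumptions made for the example.

```python
import math
import random

def pixel_counts(points, box, n_bins):
    """Discretize the plane into n_bins x n_bins pixels and count points in each."""
    counts = [0] * (n_bins * n_bins)
    for x, y in points:
        i = min(int(x / box * n_bins), n_bins - 1)
        j = min(int(y / box * n_bins), n_bins - 1)
        counts[i * n_bins + j] += 1
    return counts

def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm2(values, n_iter=200):
    """Fit a 1D two-component Gaussian mixture with plain expectation-maximization."""
    lo, hi = min(values), max(values)
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    var = [((hi - lo) / 4.0) ** 2 + 1e-6] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value
        resp = []
        for v in values:
            p = [w[k] * normal_pdf(v, mu[k], var[k]) for k in range(2)]
            total = p[0] + p[1] + 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: update weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            w[k] = nk / len(values)
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var[k] = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk + 1e-6
    return w, mu, var

def crossing_threshold(w, mu, var):
    """Bisect between the two means for the value where the weighted densities are equal."""
    a, b = sorted(range(2), key=lambda k: mu[k])
    lo, hi = mu[a], mu[b]
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if w[a] * normal_pdf(mid, mu[a], var[a]) > w[b] * normal_pdf(mid, mu[b], var[b]):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic "bilayer": tightly packed left half, loosely packed right half (hypothetical data).
rng = random.Random(0)
box, n_bins = 20.0, 10
dense = [(rng.uniform(0, box / 2), rng.uniform(0, box)) for _ in range(6000)]
sparse = [(rng.uniform(box / 2, box), rng.uniform(0, box)) for _ in range(1500)]
pixel_area = (box / n_bins) ** 2
density = [c / pixel_area for c in pixel_counts(dense + sparse, box, n_bins)]

w, mu, var = fit_gmm2(density)
mu_lo, mu_hi = min(mu), max(mu)
threshold = crossing_threshold(w, mu, var)

# Pixel phase labels: True means the high-packing (ordered-like) phase.
labels = [d > threshold for d in density]
truth = [(idx // n_bins) < n_bins // 2 for idx in range(n_bins * n_bins)]
accuracy = sum(l == t for l, t in zip(labels, truth)) / len(labels)

def phase_of_point(x, y):
    """Map a point back to the phase label of the pixel it falls in."""
    i = min(int(x / box * n_bins), n_bins - 1)
    j = min(int(y / box * n_bins), n_bins - 1)
    return labels[i * n_bins + j]

print(f"threshold = {threshold:.2f}, pixel accuracy = {accuracy:.2f}")
```

The labels depend only on a single frame's density, which is why an approach of this kind needs no temporal information. In a real trajectory the points would be selected lipid atoms per leaflet, and the grid resolution would need tuning.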
Citations: 0
Solvent Matters: Bridging Theory and Experiment in Quantum-Mechanical NMR Structural Elucidation.
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-20 | DOI: 10.1021/acs.jcim.5c02506
Iván Cortés,Cristina Cuadrado,José A Gavín,María Marta Zanardi,Antonio Hernández Daranas,Ariel M Sarotti
Quantum-mechanical NMR (QM-NMR) is widely used in structure elucidation. A long-sought holy grail in this field is solving structures from a simple ¹H NMR spectrum with AI-driven workflows. Yet solvent effects on chemical shifts, though long recognized, remain overlooked. We show in a theory-experiment study that implicit solvation models miss solvent-induced variations, and we introduce a Python tool to quantify solvent sensitivity, aiding more reliable QM-NMR structural assignments.
Citations: 0
A Sensitivity Analysis Methodology for Rule-Based Stochastic Chemical Systems
IF 5.6 | Zone 2 (Chemistry) | Q1 CHEMISTRY, MEDICINAL | Pub Date: 2026-01-19 | DOI: 10.1021/acs.jcim.5c02375
Erika M. Herrera Machado,Jakob L. Andersen,Rolf Fagerberg,Daniel Merkle
In this study, we introduce a sensitivity analysis methodology for stochastic systems in chemistry, where dynamics are often governed by random processes. Our approach is based on gradient estimation via finite differences, averaging simulation outcomes, and analyzing variability under intrinsic noise. We characterize gradient uncertainty as an angular range within which all plausible gradient directions are expected to lie. A key feature of our approach is that this uncertainty measure adaptively guides the number of simulations performed for each nominal-perturbation pair of points in order to minimize unnecessary computations while maintaining robustness. Systematically exploring a range of parameter values across the parameter space, rather than focusing on a single value, allows us to identify not only sensitive parameters but also regions of parameter space associated with different levels of sensitivity. These results are visualized through vector field plots to offer an intuitive representation of local sensitivity across parameter space. Additionally, global sensitivity coefficients over sampled points in the parameter space are computed to capture overall trends. Flexibility regarding the choice of output observable measures is another key feature of our method: while traditional sensitivity analyses often focus on species concentrations, our framework allows for the definition of a large range of problem-specific observables. This makes it broadly applicable in diverse chemical and biochemical scenarios. We demonstrate our approach on two systems: classical Michaelis–Menten kinetics and a rule-based model of the formose reaction, using the cheminformatics software MØD for Gillespie-based stochastic simulations.
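The ingredients named in this abstract (finite differences over averaged stochastic simulations, an angular uncertainty on the gradient direction, and an adaptive number of runs) can be illustrated with a small self-contained sketch. It does not use MØD and is not the authors' algorithm. The Michaelis–Menten rate constants, the 20% relative step, the ±2 standard-error box used to define the angular range, and the doubling rule for the run count are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def simulate_product(k1, km1, k2, rng, e0=10, s0=100, t_end=5.0):
    """Gillespie simulation of E + S <-> ES -> E + P; returns the product count at t_end."""
    e, s, es, p, t = e0, s0, 0, 0, 0.0
    while True:
        a1, a2, a3 = k1 * e * s, km1 * es, k2 * es
        a0 = a1 + a2 + a3
        if a0 == 0.0:
            break
        t += rng.expovariate(a0)
        if t > t_end:
            break
        r = rng.random() * a0
        if r < a1:            # binding
            e, s, es = e - 1, s - 1, es + 1
        elif r < a1 + a2:     # unbinding
            e, s, es = e + 1, s + 1, es - 1
        else:                 # catalysis
            e, p, es = e + 1, p + 1, es - 1
    return p

def mean_and_se(samples):
    return statistics.mean(samples), statistics.stdev(samples) / math.sqrt(len(samples))

def relative_gradient(params, rel_step, n_runs, rng):
    """Forward finite differences of the mean product count per relative change in k1 and k2."""
    k1, km1, k2 = params
    m0, se0 = mean_and_se([simulate_product(k1, km1, k2, rng) for _ in range(n_runs)])
    grads, errs = [], []
    for pert in ((k1 * (1 + rel_step), km1, k2), (k1, km1, k2 * (1 + rel_step))):
        m1, se1 = mean_and_se([simulate_product(*pert, rng) for _ in range(n_runs)])
        grads.append((m1 - m0) / rel_step)
        errs.append(math.hypot(se0, se1) / rel_step)
    return grads, errs

def angular_range(grads, errs, z=2.0):
    """Min and max gradient direction (degrees) over the corners of the +/- z*SE box.
    Only meaningful while the box does not straddle the +/-180 degree branch cut."""
    angles = [math.degrees(math.atan2(grads[1] + sy * z * errs[1],
                                      grads[0] + sx * z * errs[0]))
              for sx in (-1, 1) for sy in (-1, 1)]
    return min(angles), max(angles)

# Hypothetical nominal point (k1, k_-1, k2) and stopping rule.
rng = random.Random(1)
nominal = (0.01, 0.1, 0.5)
n_runs, max_runs, tol_deg = 250, 2000, 40.0
while True:
    grads, errs = relative_gradient(nominal, 0.2, n_runs, rng)
    lo, hi = angular_range(grads, errs)
    if hi - lo <= tol_deg or n_runs >= max_runs:
        break
    n_runs *= 2  # angular range still too wide: spend more simulations at this point

print(f"n_runs = {n_runs}, gradient = {grads}, angular range = ({lo:.1f}, {hi:.1f}) deg")
```

Repeating this at many nominal points would give the arrows for a vector field plot over the (k1, k2) plane. The observable here is the product count, but any function of the trajectory could be returned by the simulator instead.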
Citations: 0
Density Estimation Based on Mixtures of Gaussians for Perovskite Solar Cells Modeling.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2026-01-19 DOI: 10.1021/acs.jcim.5c02017
F Alexander Sepúlveda, Daniel Cerro-Ramos, T Jesper Jacobsson
Accurately modeling the complex relationships among synthesis parameters, material compositions, and performance metrics is essential for accelerating the development of perovskite solar cells (PSCs). In this context, machine learning (ML) has proven to be a valuable tool. While most ML applications in PSC research rely on discriminative "black-box" models, this study adopts a generative approach by modeling the joint probability density function. We employ Gaussian Mixture Models (GMMs), a pragmatic and interpretable choice well-suited for the scarce, low-dimensional tabular data typical of PSC research. This single GMM framework is evaluated on five distinct tasks: discovering clusters, regression, generating novel configurations, training on data sets with missing data, and inverse design of the experimental (synthesis) conditions. In the inverse-design task, given the perovskite material composition and a target power conversion efficiency (PCE), we infer the experimental conditions. For this last task we use a novel "GMM-Assisted Optimization" method, which proves more effective than standard random-start optimization, achieving an RMSE of 1.52 against target PCEs, less than half the baseline RMSE of 3.32. These findings highlight the power of probabilistic modeling for data-driven discovery in PSC research.
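Because a GMM models the joint density, regression follows from conditioning the mixture on the known variables. The sketch below shows that idea for a two-dimensional mixture over a single input x and a single output y. It is an illustration under stated assumptions, not the paper's code: the function names and the component tuple layout are hypothetical, and the mixture parameters are assumed to be already fitted (for example by expectation-maximization).

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Return E[y | x] for a 2-D Gaussian mixture over (x, y).

    Each component is a tuple
    (weight, mean_x, mean_y, var_x, cov_xy, var_y);
    this layout is an assumption of the sketch.
    """
    total = 0.0
    weighted = 0.0
    for weight, mx, my, vxx, vxy, _vyy in components:
        # Responsibility of this component for the observed x (unnormalized).
        resp = weight * normal_pdf(x, mx, vxx)
        # Conditional mean of y given x within a single Gaussian.
        cond_mean = my + vxy / vxx * (x - mx)
        total += resp
        weighted += resp * cond_mean
    if total == 0.0:
        raise ValueError("x is too far from every component to condition on")
    return weighted / total
```

With a single component `(1.0, 0.0, 1.0, 2.0, 1.0, 3.0)`, the function reduces to ordinary linear regression from the Gaussian's covariance. With several components, the prediction is a responsibility-weighted blend of the per-component lines.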
Citations: 0
Restraint Quality, Not Quantity, Predicts Peptide-Protein Docking Outcomes.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2026-01-18 DOI: 10.1021/acs.jcim.5c03021
Miriam Gulman, Jordan Chill, Dan Thomas Major
Understanding protein-peptide interactions is essential for uncovering cellular signaling mechanisms and advancing therapeutic development, as these interactions play central roles in numerous biological processes. Gaining structural insight into such complexes is crucial, yet traditional methods like nuclear magnetic resonance (NMR) and X-ray crystallography are often time-consuming and experimentally demanding. Computational approaches, including physics-based docking and deep-learning (DL) structure predictors such as AlphaFold3, Boltz-2, and Chai-1, offer powerful alternatives. Accurately modeling flexible peptides that bind to shallow, surface-exposed regions remains difficult for physics-based methods, and although multiple sequence alignment-driven DL models can achieve excellent performance in well-behaved systems, they too can struggle when the peptide adopts noncanonical conformations or when sequence identity is low. In such cases, distance restraints are often required to guide the docking toward accurate and biologically meaningful solutions, yet acquiring multiple high-quality restraints is often difficult. To address the limitations of physics-based and DL approaches, we developed a restraint scoring function that integrates evolutionary conservation, spatial proximity, and geometric distribution to assess the informativeness of restraint sets. This enables a more accurate evaluation of docking inputs and overcomes the shortcomings of relying solely on restraint count. Building on this framework, we introduce a minimal-restraint docking strategy capable of identifying optimized subsets of restraints that lead to high-quality structural models. We evaluate a comprehensive set of protein-peptide systems, including 43 SH3 domain complexes, 8 WW domain complexes, and 19 medium-difficulty cases from the PepPCBench benchmark. Our approach shows that model quality improves as the restraint score increases, supporting the restraint score as a simple, interpretable indicator of docking success. We further identify clear, domain-specific restraint-score thresholds for the SH3 and WW systems that enable accurate model selection. Together, these results offer a scalable and efficient strategy for structure prediction in data-limited contexts and lay the groundwork for restraint-informed modeling with quantifiable confidence, as well as a powerful foundation for data-efficient machine learning-based peptide-protein docking.
Citations: 0