Achieving well-informed decision-making in drug discovery: a comprehensive calibration study using neural network-based structure-activity models

IF 5.7 · Zone 2 (Chemistry) · JCR Q1 CHEMISTRY, MULTIDISCIPLINARY · Journal of Cheminformatics · Pub Date: 2025-03-05 · DOI: 10.1186/s13321-025-00964-y
Hannah Rosa Friesacher, Ola Engkvist, Lewis Mervin, Yves Moreau, Adam Arany
Citations: 0

Abstract

In the drug discovery process, where experiments can be costly and time-consuming, computational models that predict drug-target interactions are valuable tools to accelerate the development of new therapeutic agents. Estimating the uncertainty inherent in these neural network predictions provides information that supports decision-making when risk assessment is crucial. However, such models can be poorly calibrated, which results in unreliable uncertainty estimates that do not reflect the true predictive uncertainty. In this study, we compare different metrics used for model hyperparameter tuning, including accuracy and calibration scores, to investigate which model selection strategy yields well-calibrated models. Furthermore, we propose a computationally efficient Bayesian uncertainty estimation method named HMC Bayesian Last Layer (HBLL), which generates Hamiltonian Monte Carlo (HMC) trajectories to obtain samples for the parameters of a Bayesian logistic regression fitted to the hidden layer of the baseline neural network. We report that this approach improves model calibration and reaches the performance of common uncertainty quantification methods by combining the benefits of uncertainty estimation and probability calibration methods. Finally, we show that combining a post hoc calibration method with well-performing uncertainty quantification approaches can boost model accuracy and calibration.
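The HBLL idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian prior, the step size, the number of leapfrog steps, the function names `hmc_last_layer` and `predict_with_uncertainty`, and the synthetic array standing in for hidden-layer activations are all assumptions made for illustration. The sketch treats the last hidden layer of an already trained network as fixed features and draws HMC samples of the logistic regression weights on top of them.

```python
import numpy as np

def log_post_and_grad(w, H, y, prior_var=1.0):
    """Unnormalized log posterior and gradient for Bayesian logistic
    regression with an (assumed) zero-mean Gaussian prior on the weights."""
    logits = H @ w
    log_lik = np.sum(y * logits - np.logaddexp(0.0, logits))
    log_prior = -0.5 * np.sum(w ** 2) / prior_var
    probs = 0.5 * (1.0 + np.tanh(0.5 * logits))  # numerically stable sigmoid
    grad = H.T @ (y - probs) - w / prior_var
    return log_lik + log_prior, grad

def hmc_last_layer(H, y, n_samples=300, n_burn=100, step=0.02,
                   n_leapfrog=20, seed=0):
    """Draw weight samples with Hamiltonian Monte Carlo.
    H: (n, d) hidden-layer features, y: (n,) binary labels."""
    rng = np.random.default_rng(seed)
    w = np.zeros(H.shape[1])
    logp, grad = log_post_and_grad(w, H, y)
    samples = []
    for i in range(n_samples + n_burn):
        p0 = rng.standard_normal(w.shape)      # resample momentum
        w_new, p = w.copy(), p0.copy()
        p += 0.5 * step * grad                 # initial half step (momentum)
        for l in range(n_leapfrog):
            w_new += step * p                  # full step (position)
            logp_new, grad_new = log_post_and_grad(w_new, H, y)
            if l < n_leapfrog - 1:
                p += step * grad_new           # full step (momentum)
        p += 0.5 * step * grad_new             # final half step (momentum)
        # Metropolis correction based on the change in total energy
        log_accept = (logp_new - 0.5 * p @ p) - (logp - 0.5 * p0 @ p0)
        if np.log(rng.uniform()) < log_accept:
            w, logp, grad = w_new, logp_new, grad_new
        if i >= n_burn:
            samples.append(w.copy())
    return np.array(samples)

def predict_with_uncertainty(H_new, samples):
    """Average the sigmoid outputs over posterior samples; the spread
    across samples is a simple per-compound uncertainty estimate."""
    probs = 0.5 * (1.0 + np.tanh(0.5 * (H_new @ samples.T)))
    return probs.mean(axis=1), probs.std(axis=1)

# Synthetic stand-in for hidden-layer activations of a trained network.
rng = np.random.default_rng(1)
H_demo = rng.standard_normal((200, 4))
w_demo = np.array([1.5, -2.0, 0.5, 0.0])
y_demo = (rng.uniform(size=200) < 1 / (1 + np.exp(-H_demo @ w_demo))).astype(float)

samples_demo = hmc_last_layer(H_demo, y_demo)
mean_prob, std_prob = predict_with_uncertainty(H_demo, samples_demo)
```

Because only the last-layer weights are sampled, each HMC step costs a couple of matrix-vector products on the hidden features, which is what makes a last-layer approach cheap relative to sampling all network weights.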

In this work, we provide a comprehensive probability calibration study using neural networks for drug-target interaction prediction. We report that the hyperparameter selection strategy, as well as uncertainty estimation and probability calibration methods, has a significant impact on the reliability of uncertainty estimates, which is crucial for an efficient drug discovery process.
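To make "reliability of uncertainty estimates" concrete, the following sketch computes the expected calibration error (ECE) for a binary classifier. The abstract does not say which calibration scores the study uses, so this is one common variant chosen for illustration; the function name `expected_calibration_error` and the choice of ten equal-width bins are assumptions.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE for binary predictions: the weighted average gap between
    the mean predicted probability and the observed positive rate per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Inner edges only, so that a probability of exactly 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(y_prob, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# An overconfident model: predicts 0.9 everywhere, but only half are active.
overconfident = expected_calibration_error([1, 0, 1, 0], [0.9, 0.9, 0.9, 0.9])
```

A well-calibrated model has an ECE near zero. In the example above, the model predicts 0.9 while only half of the compounds are active, so the gap in the single occupied bin is 0.4.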
Source journal
Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.