Artificial intelligence in the life sciences: latest articles

From explainable artificial intelligence to human understanding
Pub Date : 2025-04-04 DOI: 10.1016/j.ailsci.2025.100131
Jürgen Bajorath
Artificial intelligence in the life sciences, Volume 7, Article 100131
Citations: 0
Multi-objective synthesis planning by means of Monte Carlo Tree search
Pub Date : 2025-02-19 DOI: 10.1016/j.ailsci.2025.100130
Helen Lai, Christos Kannas, Alan Kai Hassen, Emma Granqvist, Annie M. Westerlund, Djork-Arné Clevert, Mike Preuss, Samuel Genheden
We introduce a multi-objective search algorithm for retrosynthesis planning, based on a Monte Carlo Tree search formalism. The multi-objective search allows for combining a diverse set of objectives without considering their scale or weighting factors. To benchmark this novel algorithm, we employ four objectives in a total of eight retrosynthesis experiments on a PaRoutes benchmark set. The objectives range from simple ones based on starting material and step count to complex ones based on synthesis complexity and route similarity. We show that with the careful employment of complex objectives, the multi-objective algorithm can outperform the single-objective search and provide a more diverse set of solutions. However, for many target compounds, the single- and multi-objective settings are equivalent. Nevertheless, our algorithm provides a framework for incorporating novel objectives for specific applications in synthesis planning.
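One common way to combine objectives without rescaling or weighting them is Pareto dominance. The sketch below is an illustration only and is not taken from the paper: the route names and the two objectives (step count and a starting-material cost, both minimized) are hypothetical, and the paper's tree search itself is not reproduced here.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(
            dominates(other, s)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# Hypothetical routes scored as (number of steps, starting-material cost).
routes = {
    "route_a": (3, 120.0),
    "route_b": (5, 40.0),
    "route_c": (5, 150.0),
    "route_d": (4, 120.0),
}
front = pareto_front(routes)
print(front)
```

Because dominance only compares values objective by objective, the step count and the cost never need to share a scale, and no weighting factor has to be chosen.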
Artificial intelligence in the life sciences, Volume 7, Article 100130
Citations: 0
Enhancing uncertainty quantification in drug discovery with censored regression labels
Pub Date : 2025-02-13 DOI: 10.1016/j.ailsci.2025.100128
Emma Svensson, Hannah Rosa Friesacher, Susanne Winiwarter, Lewis Mervin, Adam Arany, Ola Engkvist
In the early stages of drug discovery, decisions regarding which experiments to pursue can be influenced by computational models for quantitative structure–activity relationships (QSAR). These decisions are critical due to the time-consuming and expensive nature of the experiments. Therefore, it is becoming essential to accurately quantify the uncertainty in machine learning predictions, such that resources can be used optimally and trust in the models improves. While computational methods for QSAR modeling often suffer from limited data and sparse experimental observations, additional information can exist in the form of censored labels that provide thresholds rather than precise values of observations. However, the standard approaches that quantify uncertainty in machine learning cannot fully utilize censored labels. In this work, we adapt ensemble-based, Bayesian, and Gaussian models with tools to learn from censored labels by using the Tobit model from survival analysis. Our results demonstrate that despite the partial information available in censored labels, they are essential to reliably estimate uncertainties in real pharmaceutical settings where approximately one-third or more of experimental labels are censored.
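As a rough illustration of how a censored label can enter a Gaussian likelihood, the following sketch computes a Tobit-style negative log-likelihood for one observation. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names and the encoding of the `censor` flag (0 for an exact value, +1 when the true value lies above the reported threshold, -1 when it lies below) are hypothetical.

```python
import math


def normal_logpdf(z: float) -> float:
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(z: float) -> float:
    """Log of the standard normal CDF, clamped to avoid log(0) in the far tail."""
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(p, 1e-300))


def tobit_nll(y: float, mu: float, sigma: float, censor: int) -> float:
    """Negative log-likelihood of one label under a Gaussian N(mu, sigma^2).

    censor ==  0: y is an exact measurement (density term)
    censor == +1: the true value is above y (survival term, P(Y > y))
    censor == -1: the true value is below y (CDF term, P(Y < y))
    """
    z = (y - mu) / sigma
    if censor == 0:
        return -(normal_logpdf(z) - math.log(sigma))
    if censor == 1:
        return -normal_logcdf(-z)
    if censor == -1:
        return -normal_logcdf(z)
    raise ValueError("censor must be -1, 0, or +1")


# Hypothetical label reported only as "greater than 5.0".
loss_agrees = tobit_nll(5.0, mu=7.0, sigma=1.0, censor=1)     # prediction above threshold
loss_disagrees = tobit_nll(5.0, mu=3.0, sigma=1.0, censor=1)  # prediction below threshold
print(loss_agrees, loss_disagrees)
```

A prediction that sits on the correct side of the threshold is penalized only lightly, which is how a threshold-only label can still shape both the predicted mean and the predicted uncertainty.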
Artificial intelligence in the life sciences, Volume 7, Article 100128
Citations: 0
Conformal prediction-based machine learning in Cheminformatics: Current applications and new challenges
Pub Date : 2025-02-08 DOI: 10.1016/j.ailsci.2025.100127
Mario Astigarraga, Andrés Sánchez-Ruiz, Gonzalo Colmenarejo
Conformal Prediction (CP) is a distribution-free Machine Learning (ML) framework that has been developed in the last ∼25 years to provide well-calibrated prediction subsets/intervals that include the true label with a user pre-defined probability, only requiring data exchangeability. It is based on the concept of nonconformity (or dissimilarity) of the new prediction compared to previous data and their predictions, so that the prediction subset/interval size is larger for new “unusual” instances and smaller for “typical” instances. Given its simplicity and ease of applicability, since 2012 it has been widely adopted in Cheminformatics, especially in the Quantitative Structure-Activity Relationship (QSAR) modeling and Molecular Screening areas. This rapid popularization of CP in Cheminformatics can be explained on the grounds that: (a) it can handle the applicability domain (AD) issue of ML models, of great importance in Cheminformatics due to the immense size of the chemical space; (b) it deals with classification of heavily imbalanced datasets typical in Molecular Screening; and (c) it quantifies compound-specific prediction uncertainties, especially useful as it allows implementing gain-cost strategies to accelerate drug discovery by reducing compounds to test. This comprehensive review introduces the method, provides a full appraisal of the work done in the field of Cheminformatics (with special emphasis on the QSAR and Molecular Screening arenas), and discusses its pros and cons and new challenges, especially for Deep Learning applications and nonexchangeable datasets, a very frequent situation in Cheminformatics.
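The simplest variant to illustrate is split (inductive) conformal regression, where the absolute residual serves as the nonconformity score. The sketch below is an assumption-laden illustration rather than a method taken from the review: the calibration values are made up, and this basic variant gives every compound the same interval width, whereas the instance-dependent widths described above require a normalized nonconformity score.

```python
import math
from typing import List, Tuple


def conformal_quantile(cal_scores: List[float], alpha: float) -> float:
    """Finite-sample quantile of calibration nonconformity scores.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest score; if that rank
    exceeds n, the interval must be unbounded to keep the guarantee.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1]


def predict_interval(point_prediction: float, q: float) -> Tuple[float, float]:
    """Symmetric prediction interval around a point prediction."""
    return (point_prediction - q, point_prediction + q)


# Hypothetical calibration set: measured values and model predictions.
cal_true = [5.1, 6.3, 4.8, 7.0, 5.5, 6.1, 4.9, 6.8, 5.9]
cal_pred = [5.0, 6.0, 5.2, 6.5, 5.6, 6.4, 4.7, 7.1, 5.8]
scores = [abs(t - p) for t, p in zip(cal_true, cal_pred)]

q = conformal_quantile(scores, alpha=0.2)  # target coverage of 80%
low, high = predict_interval(6.0, q)
print(q, (low, high))
```

Under exchangeability of calibration and test data, intervals built this way contain the true value with probability at least 1 - alpha, regardless of the underlying model.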
Artificial intelligence in the life sciences, Volume 7, Article 100127
Citations: 0
LIDEB's Useful Decoys (LUDe): A freely available decoy-generation tool. Benchmarking and scope
Pub Date : 2025-02-07 DOI: 10.1016/j.ailsci.2025.100129
Lucas N. Alberca, Denis N. Prada Gori, Maximiliano J. Fallico, Alexandre V. Fassio, Alan Talevi, Carolina L. Bellera
In the field of chemoinformatics, and in particular, when developing models to be applied in virtual screening campaigns, it is essential to run retrospective virtual screening experiments that evaluate the performance of such models in a scenario similar to the real one. That is, the ability to recover a small number of active compounds dispersed among a much larger number of compounds without the desired activity. However, such a retrospective experiment is often limited by the relative scarcity of known inactive compounds against the pharmacological target of interest. In these cases, automatic decoy (putative inactive compound) generation tools are often of great importance. Their basic goal is to generate decoys that are similar enough to the known active compounds to challenge the models, but different enough so that the probability that the decoys modulate the molecular target of interest is small.
In this article, we report the latest version of our open-source decoy generation tool LUDe, inspired by the well-known DUD-E but designed to reduce the probability of generating decoys topologically similar to known active compounds. We have carried out a benchmarking exercise against DUD-E through 102 pharmacological targets, using the DOE score and the Doppelganger score as comparison criteria. LUDe decoys obtained better DOE scores across most of the targets, indicating a lower risk of artificial enrichment. The mean Doppelganger score, in contrast, was similar for LUDe and DUD-E decoys, exhibiting a slight improvement for LUDe decoys for most of the targets. Simulation experiments were performed to verify whether the generated decoys are unsuitable to validate ligand-based models. Our results suggest that LUDe decoys are apt to be used to validate and compare machine learning ligand-based screening approaches. Importantly, LUDe may be used locally, independently of external server availability, and is thus suitable to obtain decoys from large datasets. It is available as a Web App (at https://lideb.biol.unlp.edu.ar/?page_id=1076) and as Python code (at https://github.com/LIDeB/LUDe.v1.0).
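The general decoy principle stated above (matched physicochemical properties, low structural similarity) can be sketched as a simple filter. This is not LUDe's actual procedure, which the abstract does not detail: the property names, tolerances, similarity cutoff, and the use of plain integer sets as stand-in fingerprints are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_decoy_candidate(
    active: Dict,
    candidate: Dict,
    mw_tol: float = 25.0,    # hypothetical molecular-weight tolerance
    logp_tol: float = 1.0,   # hypothetical logP tolerance
    max_sim: float = 0.35,   # hypothetical similarity cutoff
) -> bool:
    """Keep candidates whose properties match the active but whose
    fingerprint similarity to it stays below the cutoff."""
    props_match = (
        abs(active["mw"] - candidate["mw"]) <= mw_tol
        and abs(active["logp"] - candidate["logp"]) <= logp_tol
    )
    dissimilar = tanimoto(active["fp"], candidate["fp"]) < max_sim
    return props_match and dissimilar


# Hypothetical active compound and candidate pool.
active = {"mw": 300.0, "logp": 2.5, "fp": {1, 2, 3, 4, 5, 6}}
pool = {
    "cand_ok": {"mw": 310.0, "logp": 2.0, "fp": {1, 7, 8, 9, 10, 11}},
    "cand_too_similar": {"mw": 305.0, "logp": 2.4, "fp": {1, 2, 3, 4, 5, 7}},
    "cand_off_property": {"mw": 450.0, "logp": 2.5, "fp": {20, 21, 22}},
}
selected: List[str] = [
    name for name, cand in pool.items() if is_decoy_candidate(active, cand)
]
print(selected)
```

In practice the fingerprints and properties would come from a cheminformatics toolkit, and the thresholds would need tuning against benchmarks such as those reported in the article.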
Artificial intelligence in the life sciences, Volume 7, Article 100129
Citations: 0
“Foundation models for research: A matter of trust?”
Pub Date : 2025-02-06 DOI: 10.1016/j.ailsci.2025.100126
Koen Bruynseels, Lotte Asveld, Jeroen van den Hoven
Science would not be possible without trust among experts, trust of the public in experts, and reliance on scientific instruments and methods. The rapid adoption of scientific foundation models and their use in AI agents is changing scientific practices and thereby impacting this epistemic fabric which hinges on trust and reliance. Foundation models are machine learning models that are trained on large bodies of data and can be applied to a multitude of tasks. Their application in science raises the question of whether scientific foundation models can be relied upon as a research tool and to what extent, or even be trusted as if they were research partners.
Conceptual clarification of the notions of trust and reliance in science is pivotal in the face of foundation models. Trust and reliance form the glue for the increasingly distributed epistemic labour within contemporary technoscientific systems. We build on two concepts of trust in science, namely trust in science as shared values, and trust in science based on commitments to processes that provide objective claims. We analyse whether scientific foundation models are research tools to which the concept of reliance applies, or research partners that can be trustworthy or not. We consider these foundation models within their socio-technical contexts.
Allocation of trust should be reserved for human agents and the organizations they operate in. Reliance applies to foundation models and artificial intelligence agents. This distinction is important to unambiguously allocate responsibility, which is crucial in maintaining the fabric of trust that underpins science.
Artificial intelligence in the life sciences, Volume 7, Article 100126
Citations: 0
Actively protective combinatorial analysis: A scalable novel method for detecting variants that contribute to reduced disease prevalence in high-risk individuals
Pub Date : 2025-01-31 DOI: 10.1016/j.ailsci.2025.100125
J Sardell, S Das, K Taylor, C Stubberfield, A Malinowski, M Strivens, S Gardner
We present a novel method for routinely identifying disease resilience associations that offers powerful insights for the discovery of a new class of disease protective targets. We show how this can be used to identify mechanisms in the background of normal cellular biology that work to slow or stop progression of complex, chronic diseases.
Actively protective combinatorial analysis identifies combinations of features that contribute to reducing risk of disease in individuals who remain healthy even though their genomic profile suggests that they have high risk of developing disease. These protective signatures can potentially be used to identify novel drug targets, pharmacogenomic and/or therapeutic mRNA opportunities and to better stratify patients by overall disease risk and mechanistic subtype.
We describe the method and illustrate how it offers increased power for detecting disease-associated genetic variants relative to traditional methods. We exemplify this by identifying individuals who remain healthy despite possessing several disease signatures associated with increased risk of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) or amyotrophic lateral sclerosis (ALS). We then identify combinations of SNP-genotypes significantly associated with reduced disease prevalence in these high-risk protected cohorts.
We discuss how actively protective combinatorial analysis generates novel insights into the genetic drivers of established disease biology and detects gene-disease associations missed by standard statistical approaches such as meta-GWAS. The results support the mechanism of action hypotheses identified in our original causative disease analyses. They also illustrate the potential for development of precision medicine approaches that can increase healthspan by reducing the progression of disease.
Artificial intelligence in the life sciences, Volume 7, Article 100125
Citations: 0
Clinical diagnostics and medical image analysis
Pub Date : 2024-12-09 DOI: 10.1016/j.ailsci.2024.100119
Jürgen Bajorath
Artificial intelligence in the life sciences, Volume 7, Article 100119
Citations: 0
Explainable artificial intelligence for targeted protein degradation predictions
Pub Date : 2024-12-09 DOI: 10.1016/j.ailsci.2024.100121
Francis J. Prael III, Jutta Blank, William C. Forrester, Lingling Shen, Raquel Rodríguez-Pérez
Defining structure-activity relationships (SAR) is a central task in medicinal chemistry. Apart from optimizing activity against the target of interest, off-target activities and other properties need to be balanced to ensure a suitable property profile, which is an exceptional challenge in drug design. Machine learning (ML) can identify structural patterns in large compound collections that are correlated to biological activity or other molecular properties. Such ML-based SAR modeling has the potential of greatly assisting in compound optimization. However, the black-box character of most ML models has limited their application to help establish SAR hypotheses. Explainable ML or, more generally, explainable artificial intelligence (XAI) aims at “opening the black box” by estimating how model inputs – e.g., chemical structures – contribute to model predictions. Although a variety of model interpretation methods have been proposed, XAI for medicinal chemistry is still an active field of research and XAI strategies are dominated by proofs of concept rather than by practical applications in drug discovery programs. Moreover, with the advent of new modalities, the applicability of ML and XAI models remains under-investigated. Herein, we present a novel application of XAI methods to targeted protein degradation (TPD) predictions. We report a case study of ML-based SAR modeling with explainable predictions of Cereblon (CRBN) glues for GSPT1 (G1 to S phase transition 1 protein). We showcase how XAI results were able to mirror expert knowledge based on structural data. Importantly, quantitative evaluations showed the ability of our ML/XAI workflow to accurately describe TPD activity cliffs across different proteins. These findings support use of the proposed XAI strategy to help rationalize model predictions and illustrate how XAI methods can be exploited to balance SAR across different targets or properties for the new modality of TPDs.
Citations: 0
Publication of research using proprietary data
Pub Date : 2024-12-09 DOI: 10.1016/j.ailsci.2024.100120
Raquel Rodríguez-Pérez , Jürgen Bajorath
Citations: 0