Interpretable and explainable predictive machine learning models for data-driven protein engineering.

IF 12.1 | Zone 1, Engineering & Technology | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Biotechnology advances | Pub Date: 2024-12-05 | DOI: 10.1016/j.biotechadv.2024.108495

David Medina-Ortiz, Ashkan Khalifeh, Hoda Anvari-Kazemabad, Mehdi D Davari



Abstract

Protein engineering through directed evolution and (semi)rational design has become a powerful approach for optimizing and enhancing proteins with desired properties. The integration of artificial intelligence methods has further accelerated the protein engineering process by enabling the development of predictive models based on data-driven strategies. However, the lack of interpretability and transparency in these models limits their trustworthiness and applicability in real-world scenarios. Explainable Artificial Intelligence addresses these challenges by providing insights into the decision-making processes of machine learning models, enhancing their reliability and interpretability. Explainable strategies have been successfully applied in various biotechnology fields, including drug discovery, genomics, and medicine, yet their application in protein engineering remains underexplored. Incorporating explainable strategies into protein engineering holds significant potential, as they can guide protein design by revealing how predictive models function, benefiting approaches such as machine learning-assisted directed evolution. This perspective explores the principles and methodologies of explainable artificial intelligence, highlighting its relevance in biotechnology and its potential to enhance protein design. Additionally, three theoretical pipelines integrating predictive models with explainable strategies are proposed, focusing on their advantages, disadvantages, and technical requirements. Finally, the remaining challenges of explainable artificial intelligence in protein engineering and future directions for its development as a support tool for traditional protein engineering methodologies are discussed.
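
As a rough illustration of what pairing a predictive model with an explainable strategy can look like, the sketch below fits a simple additive per-position model to synthetic variant data and then applies permutation importance as a post-hoc explanation. It is not one of the three pipelines proposed in the paper, whose details the abstract does not give. The toy alphabet, the `true_fitness` rule, and all function names are hypothetical choices made only for this example.

```python
import random
from statistics import mean

ALPHABET = "ACDE"  # toy 4-letter alphabet (assumption, not real amino acid coverage)

def true_fitness(seq):
    # Hypothetical ground truth: only the residue at position 1 affects fitness.
    return {"A": 0.0, "C": 1.0, "D": 2.0, "E": 3.0}[seq[1]]

def fit_additive_model(seqs, ys):
    """Estimate, per position and residue, the mean deviation from the global mean."""
    base = mean(ys)
    table = []
    for pos in range(len(seqs[0])):
        groups = {}
        for s, y in zip(seqs, ys):
            groups.setdefault(s[pos], []).append(y - base)
        table.append({aa: mean(vals) for aa, vals in groups.items()})
    return base, table

def predict(model, seq):
    base, table = model
    return base + sum(table[pos].get(aa, 0.0) for pos, aa in enumerate(seq))

def mse(model, seqs, ys):
    return mean((predict(model, s) - y) ** 2 for s, y in zip(seqs, ys))

def permutation_importance(model, seqs, ys, rng):
    """Increase in error when the residues at one position are shuffled across variants."""
    baseline = mse(model, seqs, ys)
    importances = []
    for pos in range(len(seqs[0])):
        column = [s[pos] for s in seqs]
        rng.shuffle(column)
        shuffled = [s[:pos] + aa + s[pos + 1:] for s, aa in zip(seqs, column)]
        importances.append(mse(model, shuffled, ys) - baseline)
    return importances

rng = random.Random(0)  # fixed seed so the sketch is reproducible
seqs = ["".join(rng.choice(ALPHABET) for _ in range(4)) for _ in range(200)]
ys = [true_fitness(s) for s in seqs]

model = fit_additive_model(seqs, ys)
importances = permutation_importance(model, seqs, ys, rng)
top_position = max(range(len(importances)), key=lambda p: importances[p])
print("importance per position:", [round(v, 3) for v in importances])
print("most influential position:", top_position)
```

In this toy setting the explanation should single out the position that actually drives fitness, which is the kind of signal that could help prioritize sites for mutagenesis. Real data would call for a held-out set and a more expressive model.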


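Source journal

Biotechnology advances (Engineering & Technology: Biotechnology & Applied Microbiology)

CiteScore: 25.50

Self-citation rate: 2.50%

Articles published: 167

Review time: 37 days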

About the journal: Biotechnology Advances is a comprehensive review journal that covers all aspects of the multidisciplinary field of biotechnology. The journal focuses on biotechnology principles and their applications in various industries, agriculture, medicine, environmental concerns, and regulatory issues. It publishes authoritative articles that highlight current developments and future trends in the field of biotechnology. The journal invites submissions of manuscripts that are relevant and appropriate. It targets a wide audience, including scientists, engineers, students, instructors, researchers, practitioners, managers, governments, and other stakeholders in the field. Additionally, special issues are published based on selected presentations from recent relevant conferences in collaboration with the organizations hosting those conferences.