On synergy between ultrahigh throughput screening and machine learning in biocatalyst engineering

Faraday Discussions; Pub Date: 2024-04-23; DOI: 10.1039/d4fd00065j; IF 3.3; Zone 3 (Chemistry); JCR Q2 (Chemistry, Physical)

Maximilian Gantz, Simon Valentin Mathis, Friederike Nintzel, Pietro Lio, Florian Hollfelder


Citations: 0

Abstract

Protein design and directed evolution have separately contributed enormously to protein engineering. Without being mutually exclusive, the former relies on computation from first principles, while the latter is a combinatorial approach based on chance. Advances in ultrahigh throughput (uHT) screening, next generation sequencing and machine learning may create alternative routes to engineered proteins, where functional information linked to specific sequences is interpreted and extrapolated in silico. In particular, the miniaturisation of functional tests in water-in-oil emulsion droplets with picoliter volumes and their rapid generation and analysis (>1 kHz) allows screening of >10^7-membered libraries in a day. Subsequently decoding the selected clones by short or long-read sequencing methods leads to large sequence-function datasets that may allow extrapolation from experimental directed evolution to further improved mutants beyond the observed hits. In this work, we explore experimental strategies for how to draw up ‘fitness landscapes’ in sequence space with uHT droplet microfluidics, review the current state of AI/ML in enzyme engineering and discuss how uHT datasets may be combined with AI/ML to make meaningful predictions and accelerate biocatalyst engineering.
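The throughput figures in the abstract can be checked with simple arithmetic. The sketch below is an illustration, not a calculation from the paper: it takes the stated lower bound of 1 kHz as a constant analysis rate and assumes, for simplicity, that every droplet carries exactly one library member. The function names and the `coverage` parameter (how many times each variant is sampled on average) are hypothetical.

```python
# Back-of-the-envelope droplet throughput.
# Assumptions (not from the paper): constant analysis rate, one variant per
# droplet, no empty droplets. `coverage` is a hypothetical oversampling factor.
SECONDS_PER_HOUR = 3600


def droplets_screened(rate_hz: float, hours: float) -> float:
    """Number of droplets analysed at a constant rate over a given run time."""
    return rate_hz * hours * SECONDS_PER_HOUR


def hours_needed(library_size: float, rate_hz: float, coverage: float = 1.0) -> float:
    """Run time needed to analyse `library_size * coverage` droplets."""
    return library_size * coverage / (rate_hz * SECONDS_PER_HOUR)


if __name__ == "__main__":
    # 1 kHz sustained for 24 hours
    print(f"{droplets_screened(1_000, 24):.3g} droplets per day")
    # time to pass 10^7 droplets at 1 kHz, single coverage
    print(f"{hours_needed(1e7, 1_000):.2f} hours for 1e7 droplets")
```

Under these assumptions, 1 kHz over 24 hours gives 8.64 × 10^7 droplets, so a library of 10^7 members fits within a day with room for several-fold oversampling. Real experiments load droplets stochastically, so the number of occupied droplets would be lower than this idealised count.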


Source journal

Faraday Discussions (Chemistry: Physical Chemistry)

Self-citation rate: 0.00%

Articles published: 259

Journal description: Discussion summaries and research papers from discussion meetings that focus on rapidly developing areas of physical chemistry and its interfaces.