Is Novelty Predictable?

Cold Spring Harbor Perspectives in Biology | IF 6.9 | Zone 2 (Biology) | JCR Q1 (Cell Biology) | Pub Date: 2024-02-01 | DOI: 10.1101/cshperspect.a041469

Clara Fannjiang, Jennifer Listgarten

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10835614/pdf/

Citations: 0

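Abstract

Machine learning-based design has gained traction in the sciences, most notably in the design of small molecules, materials, and proteins, with societal applications ranging from drug development and plastic degradation to carbon sequestration. When designing objects to achieve novel property values with machine learning, one faces a fundamental challenge: how to push past the frontier of current knowledge, distilled from the training data into the model, in a manner that rationally controls the risk of failure. If one trusts learned models too much in extrapolation, one is likely to design rubbish. In contrast, if one does not extrapolate, one cannot find novelty. Herein, we ponder how one might strike a useful balance between these two extremes. We focus in particular on designing proteins with novel property values, although much of our discussion is relevant to machine learning-based design more broadly.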




Source journal

Cold Spring Harbor Perspectives in Biology (Cell Biology)

CiteScore: 15.00

Self-citation rate: 1.40%

Articles published: (not listed)

Review time: 3-8 weeks

About the journal: Cold Spring Harbor Perspectives in Biology is an online publication in the molecular life sciences. It features reviews spanning molecular, cell, and developmental biology, genetics, neuroscience, immunology, cancer biology, and molecular pathology, and serves researchers across many areas of biology.