Unsupervised evolution of protein and antibody complexes with a structure-informed language model

Science. Pub Date: 2024-07-04. DOI: 10.1126/science.adk8946. IF 44.7; Zone 1 (multidisciplinary journals); JCR Q1 (Multidisciplinary Sciences)
Varun R. Shanker, Theodora U. J. Bruun, Brian L. Hie, Peter S. Kim
Citations: 0

Abstract

Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
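The abstract describes using a structure-conditioned model to pick a small set of variants to screen. A minimal sketch of that general idea follows: enumerate single substitutions, score each against the wild type, and keep the top candidates. It is an illustration only and not the authors' pipeline. The names `single_substitutions`, `rank_variants`, and `toy_score` are hypothetical, and `toy_score` is a placeholder standing in for a real structure-conditioned log-likelihood (for example, one computed by an inverse folding model such as ESM-IF1).

```python
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_substitutions(seq: str) -> List[Tuple[str, str]]:
    """Return (mutation label, mutant sequence) for every single substitution."""
    variants = []
    for i, wt in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wt:
                # Label uses 1-based positions, e.g. "G4A".
                variants.append((f"{wt}{i + 1}{aa}", seq[:i] + aa + seq[i + 1:]))
    return variants


def rank_variants(
    seq: str, score_fn: Callable[[str], float], top_k: int = 10
) -> List[Tuple[str, float]]:
    """Rank single substitutions by score relative to the wild-type sequence."""
    wt_score = score_fn(seq)
    scored = [
        (label, score_fn(mutant) - wt_score)
        for label, mutant in single_substitutions(seq)
    ]
    # Highest score gain first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def toy_score(seq: str) -> float:
    """Placeholder scorer, NOT a real model: counts matches to a made-up sequence."""
    preferred = "MKTAYIAK"  # hypothetical; a real scorer would condition on backbone coordinates
    return float(sum(a == b for a, b in zip(seq, preferred)))


if __name__ == "__main__":
    for label, delta in rank_variants("MKTGYIAK", toy_score, top_k=3):
        print(label, delta)
```

In practice `score_fn` would wrap whichever model is available; the ranking logic does not depend on how the score is produced, which is the sense in which no task-specific training data is needed.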
Source journal
Science (multidisciplinary journals)
CiteScore: 61.10
Self-citation rate: 0.90%
Articles published: 0
Review time: 2.1 months
About the journal: Science is a leading outlet for scientific news, commentary, and cutting-edge research. Through its print and online incarnations, Science reaches an estimated worldwide readership of more than one million. Science's authorship is global too, and its articles consistently rank among the world's most cited research. Science serves as a forum for discussion of important issues related to the advancement of science by publishing material on which a consensus has been reached as well as including the presentation of minority or conflicting points of view. Accordingly, all articles published in Science, including editorials, news and comment, and book reviews, are signed and reflect the individual views of the authors and not official points of view adopted by AAAS or the institutions with which the authors are affiliated. Science seeks to publish those papers that are most influential in their fields or across fields and that will significantly advance scientific understanding. Selected papers should present novel and broadly important data, syntheses, or concepts. They should merit recognition by the wider scientific community and general public provided by publication in Science, beyond that provided by specialty journals. Science welcomes submissions from all fields of science and from any source. The editors are committed to the prompt evaluation and publication of submitted papers while upholding high standards that support reproducibility of published research. Science is published weekly; selected papers are published online ahead of print.