基于语言学的抗体语言形式化是抗体语言模型的基础。

IF 12 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Nature computational science Pub Date : 2024-06-14 DOI:10.1038/s43588-024-00642-3
Mai Ha Vu, Philippe A. Robert, Rahmad Akbar, Bartlomiej Swiatczak, Geir Kjetil Sandve, Dag Trygve Truslew Haug, Victor Greiff
{"title":"基于语言学的抗体语言形式化是抗体语言模型的基础。","authors":"Mai Ha Vu, Philippe A. Robert, Rahmad Akbar, Bartlomiej Swiatczak, Geir Kjetil Sandve, Dag Trygve Truslew Haug, Victor Greiff","doi":"10.1038/s43588-024-00642-3","DOIUrl":null,"url":null,"abstract":"Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a linguistic formal definition of antibody language does not exist, and insight into how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining. The parallels between natural language and antibody sequences could serve as a stepping stone to using deep language models for analyzing antibody sequences. This Perspective discusses how issues in antibody language model rule mining could be addressed by linguistically formalizing the antibody language.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":null,"pages":null},"PeriodicalIF":12.0000,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Linguistics-based formalization of the antibody language as a basis for antibody language models\",\"authors\":\"Mai Ha Vu, Philippe A. Robert, Rahmad Akbar, Bartlomiej Swiatczak, Geir Kjetil Sandve, Dag Trygve Truslew Haug, Victor Greiff\",\"doi\":\"10.1038/s43588-024-00642-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a linguistic formal definition of antibody language does not exist, and insight into how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining. The parallels between natural language and antibody sequences could serve as a stepping stone to using deep language models for analyzing antibody sequences. This Perspective discusses how issues in antibody language model rule mining could be addressed by linguistically formalizing the antibody language.\",\"PeriodicalId\":74246,\"journal\":{\"name\":\"Nature computational science\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":12.0000,\"publicationDate\":\"2024-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nature computational science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.nature.com/articles/s43588-024-00642-3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature computational science","FirstCategoryId":"1085","ListUrlMain":"https://www.nature.com/articles/s43588-024-00642-3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

摘要

自然语言与抗体序列之间的明显相似性导致了将深度语言模型应用于抗体序列以预测同源抗原识别的热潮。然而,抗体语言的语言学形式定义并不存在,而且对抗体语言模型如何捕捉抗体特异性结合特征的深入了解在很大程度上仍无法解读。在此,我们将介绍如何通过表征抗体语言的词块和语法,对抗体语言进行语言形式化,从而解决目前在抗体语言模型规则挖掘方面所面临的挑战。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Linguistics-based formalization of the antibody language as a basis for antibody language models
Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a linguistic formal definition of antibody language does not exist, and insight into how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining. The parallels between natural language and antibody sequences could serve as a stepping stone to using deep language models for analyzing antibody sequences. This Perspective discusses how issues in antibody language model rule mining could be addressed by linguistically formalizing the antibody language.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
11.70
自引率
0.00%
发文量
0
期刊最新文献
Heat wave attribution assessment using deep learning Deconstructing the compounds of altruism The motive cocktail in altruistic behaviors Computational challenges in additive manufacturing for metamaterials design Computational design of art-inspired metamaterials
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1