AbLEF: Antibody Language Ensemble Fusion for thermodynamically empowered property predictions.

IF 4.4 3区 生物学 Q1 BIOCHEMICAL RESEARCH METHODS Bioinformatics Pub Date : 2024-04-16 DOI:10.1093/bioinformatics/btae268
Zachary A Rollins, Talal Widatalla, Andrew Waight, Alan C Cheng, Essam Metwally
{"title":"AbLEF: Antibody Language Ensemble Fusion for thermodynamically empowered property predictions.","authors":"Zachary A Rollins, Talal Widatalla, Andrew Waight, Alan C Cheng, Essam Metwally","doi":"10.1093/bioinformatics/btae268","DOIUrl":null,"url":null,"abstract":"MOTIVATION\nPre-trained protein language and/or structural models are often fine-tuned on drug development properties (ie, developability properties) to accelerate drug discovery initiatives. However, these models generally rely on a single structural conformation and/or a single sequence as a molecular representation. We present a physics-based model whereby 3D conformational ensemble representations are fused by a transformer-based architecture and concatenated to a language representation to predict antibody protein properties. AbLEF enables the direct infusion of thermodynamic information into latent space and this enhances property prediction by explicitly infusing dynamic molecular behavior that occurs during experimental measurement.\n\n\nRESULTS\nWe showcase the AbLEF model on two developability properties: hydrophobic interaction chromatography retention time (HIC-RT) and temperature of aggregation (Tagg). We find that (1) 3D conformational ensembles that are generated from molecular simulation can further improve antibody property prediction for small datasets, (2) the performance benefit from 3D conformational ensembles matches shallow machine learning methods in the small data regime, and (3) fine-tuned large protein language models can match smaller antibody-specific language models at predicting antibody properties.\n\n\nAVAILABILITY AND IMPLEMENTATION\nAbLEF codebase is available at https://github.com/merck/AbLEF.\n\n\nSUPPLEMENTARY INFORMATION\nSupplementary data are available at Bioinformatics online.","PeriodicalId":8903,"journal":{"name":"Bioinformatics","volume":null,"pages":null},"PeriodicalIF":4.4000,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bioinformatics/btae268","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

MOTIVATION Pre-trained protein language and/or structural models are often fine-tuned on drug development properties (ie, developability properties) to accelerate drug discovery initiatives. However, these models generally rely on a single structural conformation and/or a single sequence as a molecular representation. We present a physics-based model whereby 3D conformational ensemble representations are fused by a transformer-based architecture and concatenated to a language representation to predict antibody protein properties. AbLEF enables the direct infusion of thermodynamic information into latent space and this enhances property prediction by explicitly infusing dynamic molecular behavior that occurs during experimental measurement. RESULTS We showcase the AbLEF model on two developability properties: hydrophobic interaction chromatography retention time (HIC-RT) and temperature of aggregation (Tagg). We find that (1) 3D conformational ensembles that are generated from molecular simulation can further improve antibody property prediction for small datasets, (2) the performance benefit from 3D conformational ensembles matches shallow machine learning methods in the small data regime, and (3) fine-tuned large protein language models can match smaller antibody-specific language models at predicting antibody properties. AVAILABILITY AND IMPLEMENTATION AbLEF codebase is available at https://github.com/merck/AbLEF. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
AbLEF:用于热力学特性预测的抗体语言集合融合。
动机 预先训练好的蛋白质语言和/或结构模型通常会根据药物开发特性(即可开发性特性)进行微调,以加快药物发现的进程。然而,这些模型通常依赖于单一结构构象和/或单一序列作为分子表征。我们提出了一种基于物理的模型,通过这种模型,三维构象组合表征被一种基于变换器的架构融合,并与语言表征相串联,从而预测抗体蛋白质的特性。AbLEF 能够将热力学信息直接注入潜空间,并通过明确注入实验测量过程中发生的动态分子行为来增强特性预测。结果 我们展示了 AbLEF 模型的两种显影特性:疏水相互作用色谱保留时间(HIC-RT)和聚集温度(Tagg)。我们发现:(1) 通过分子模拟生成的三维构象组合可以进一步改善小数据集的抗体性质预测;(2) 三维构象组合的性能优势与小数据体系中的浅层机器学习方法相匹配;(3) 经过微调的大型蛋白质语言模型在预测抗体性质方面可以与较小的特定抗体语言模型相媲美。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Bioinformatics
Bioinformatics 生物-生化研究方法
CiteScore
11.20
自引率
5.20%
发文量
753
审稿时长
2.1 months
期刊介绍: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal - Discovery Notes and Application Notes- focus on shorter papers; the former reporting biologically interesting discoveries using computational methods, the latter exploring the applications used for experiments.
期刊最新文献
MEHunter: Transformer-based mobile element variant detection from long reads PQSDC: a parallel lossless compressor for quality scores data via sequences partition and Run-Length prediction mapping. MUSE-XAE: MUtational Signature Extraction with eXplainable AutoEncoder enhances tumour types classification. CopyVAE: a variational autoencoder-based approach for copy number variation inference using single-cell transcriptomics CORDAX web server: An online platform for the prediction and 3D visualization of aggregation motifs in protein sequences.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1