DERNA Enables Pareto Optimal RNA Design.

IF 1.4, Zone 4 (Biology), Q4 BIOCHEMICAL RESEARCH METHODS. Journal of Computational Biology. Pub Date: 2024-03-01. Epub Date: 2024-02-27. DOI: 10.1089/cmb.2023.0283
Xinyu Gu, Yuanyuan Qi, Mohammed El-Kebir

Abstract

The design of an RNA sequence v that encodes an input target protein sequence w is a crucial aspect of messenger RNA (mRNA) vaccine development. There are an exponential number of possible RNA sequences for a single target protein due to codon degeneracy. These potential RNA sequences can assume various secondary structure conformations, each with distinct minimum free energy (MFE), impacting thermodynamic stability and mRNA half-life. Furthermore, the presence of species-specific codon usage bias, quantified by the codon adaptation index (CAI), plays a vital role in translation efficiency. While earlier studies focused on optimizing either MFE or CAI, recent research has underscored the advantages of simultaneously optimizing both objectives. However, optimizing one objective comes at the expense of the other. In this work, we present the Pareto Optimal RNA Design problem, aiming to identify the set of Pareto optimal solutions for which no alternative solutions exist that exhibit better MFE and CAI values. Our algorithm DEsign RNA (DERNA) uses the weighted sum method to enumerate the Pareto front by optimizing convex combinations of both objectives. We use dynamic programming to solve each convex combination in O(|w|^3) time and O(|w|^2) space. Compared with CDSfold, a previous approach that only optimizes MFE, we show on a benchmark data set that DERNA obtains solutions with identical MFE but superior CAI. Moreover, we show that DERNA matches the performance in terms of solution quality of LinearDesign, a recent approach that similarly seeks to balance MFE and CAI. We conclude by demonstrating our method's potential for mRNA vaccine design for the SARS-CoV-2 spike protein.
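
The weighted sum idea mentioned in the abstract can be illustrated on a toy example. The sketch below is not DERNA's implementation: DERNA solves each convex combination by dynamic programming over all RNA sequences encoding w, whereas this example simply scans a small, explicit list of candidates. The candidate names, the (MFE, CAI) values, the weight parameter `lam`, and the normalization constant `mfe_scale` are all hypothetical and chosen only for illustration.

```python
# Toy illustration of Pareto optimality and weighted-sum scalarization.
# All data below are made up; they are NOT results from the paper.

# Each candidate is (name, MFE in kcal/mol, CAI). Lower MFE and higher CAI are better.
candidates = [
    ("A", -30.0, 0.60),
    ("B", -27.0, 0.72),
    ("C", -25.0, 0.70),  # worse than B in both objectives
    ("D", -20.0, 0.90),
    ("E", -12.0, 0.95),
]


def dominates(p, q):
    """Return True if p is no worse than q in both objectives and strictly better in one."""
    no_worse = p[1] <= q[1] and p[2] >= q[2]
    strictly_better = p[1] < q[1] or p[2] > q[2]
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum_best(points, lam, mfe_scale=30.0):
    """Minimize lam * (MFE / mfe_scale) - (1 - lam) * CAI for a weight lam in [0, 1].

    mfe_scale is a hypothetical normalization so both terms have similar magnitude.
    """
    return min(points, key=lambda p: lam * p[1] / mfe_scale - (1 - lam) * p[2])


def weighted_sum_sweep(points, steps=101):
    """Sweep lam over an evenly spaced grid in [0, 1] and collect the distinct optima."""
    found = []
    for i in range(steps):
        lam = i / (steps - 1)
        best = weighted_sum_best(points, lam)
        if best not in found:
            found.append(best)
    return found


front_names = [p[0] for p in pareto_front(candidates)]
sweep_names = sorted(p[0] for p in weighted_sum_sweep(candidates))
print("Pareto front:", front_names)
print("Found by weighted-sum sweep:", sweep_names)
```

With `lam = 1` the scalarized objective reduces to minimizing MFE alone, and with `lam = 0` it reduces to maximizing CAI alone; intermediate weights trade one objective against the other. In general, a weighted-sum sweep is only guaranteed to return Pareto optimal points that lie on the convex hull of the front, so on other data it may return a subset of the full front.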

Source journal
Journal of Computational Biology (Biology / Computer Science: Interdisciplinary Applications)
CiteScore: 3.60
Self-citation rate: 5.90%
Articles published: 113
Review time: 6-12 weeks
Journal description: Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics. Journal of Computational Biology coverage includes:
- Genomics
- Mathematical modeling and simulation
- Distributed and parallel biological computing
- Designing biological databases
- Pattern matching and pattern detection
- Linking disparate databases and data
- New tools for computational biology
- Relational and object-oriented database technology for bioinformatics
- Biological expert system design and use
- Reasoning by analogy, hypothesis formation, and testing by machine
- Management of biological databases