RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
Rishabh Anand, Chaitanya K. Joshi, Alex Morehead, Arian R. Jamasb, Charles Harris, Simon V. Mathis, Kieran Didi, Bryan Hooi, Pietro Liò
arXiv:2406.13839 (arXiv - QuanBio - Genomics), published 2024-06-19
Citations: 0
Abstract
We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone
design. We build upon SE(3) flow matching for protein backbone generation and
establish protocols for data preparation and evaluation to address unique
challenges posed by RNA modeling. We formulate RNA structures as sets of
rigid-body frames with associated loss functions that account for the larger, more
conformationally flexible RNA backbone (13 atoms per nucleotide, vs. 4 atoms
per residue for proteins). To address the lack of diversity in 3D RNA
datasets, we explore training with structural clustering and cropping
augmentations. Additionally, we define a suite of evaluation metrics to measure
whether the generated RNA structures are globally self-consistent (via inverse
folding followed by forward folding) and locally recover RNA-specific
structural descriptors. The best-performing version of RNA-FrameFlow generates
locally realistic RNA backbones of 40-150 nucleotides, over 40% of which pass
our validity criterion: a self-consistency TM-score >= 0.45, the threshold at
which two RNAs are considered to share the same global fold. Open-source code:
https://github.com/rish-16/rna-backbone-design
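
The rigid-body frame formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each nucleotide's frame is built from three backbone atoms by Gram-Schmidt orthogonalization, as is common in SE(3) protein backbone models. The abstract does not say which three atoms are used, so the function name, argument names, and example coordinates below are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def frame_from_three_atoms(atom_a, atom_origin, atom_b):
    """Return (R, t): a rotation matrix and translation for one nucleotide.

    Assumption: the frame is centred on `atom_origin`, and its axes come from
    Gram-Schmidt applied to the two bond vectors leaving that atom.
    """
    a = np.asarray(atom_a, dtype=float)
    t = np.asarray(atom_origin, dtype=float)
    b = np.asarray(atom_b, dtype=float)

    v1 = b - t                       # first bond vector
    v2 = a - t                       # second bond vector
    e1 = v1 / np.linalg.norm(v1)     # first axis: normalised v1
    u2 = v2 - np.dot(v2, e1) * e1    # remove the component of v2 along e1
    e2 = u2 / np.linalg.norm(u2)     # second axis: orthogonal to e1
    e3 = np.cross(e1, e2)            # third axis: completes a right-handed basis

    R = np.stack([e1, e2, e3], axis=1)  # columns of R are the frame axes
    return R, t

# Illustrative coordinates only (not taken from any real structure).
R, t = frame_from_three_atoms([1.2, 0.9, 0.1], [0.0, 0.0, 0.0], [1.5, -0.2, 0.3])
```

The remaining backbone atoms of a nucleotide would then be placed relative to this frame. How the model does that is not described in the abstract, so it is not sketched here.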
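
The self-consistency check described in the abstract (inverse folding followed by forward folding, then a TM-score threshold of 0.45) can be sketched as follows. The three callables `inverse_fold`, `forward_fold`, and `tm_score` are hypothetical placeholders for whatever models and scoring tool are used. Taking the best score over several designed sequences is an assumption of this sketch rather than something the abstract states.

```python
def self_consistency_tm(backbone, inverse_fold, forward_fold, tm_score, n_sequences=8):
    """Score one generated backbone by round-tripping it through sequence space.

    Hypothetical callables (placeholders, not a real library API):
      inverse_fold(backbone, n) -> list of n designed sequences
      forward_fold(sequence)    -> predicted 3D structure
      tm_score(ref, pred)       -> TM-score between two structures
    Assumption: the backbone's score is the maximum over designed sequences.
    """
    sequences = inverse_fold(backbone, n_sequences)
    return max(tm_score(backbone, forward_fold(seq)) for seq in sequences)

def validity_fraction(sc_tm_scores, threshold=0.45):
    """Fraction of backbones whose self-consistency TM-score is >= threshold."""
    scores = list(sc_tm_scores)
    if not scores:
        raise ValueError("need at least one score")
    passed = sum(1 for s in scores if s >= threshold)
    return passed / len(scores)

# Made-up scores for illustration only; these are not results from the paper.
example_scores = [0.62, 0.31, 0.45, 0.20, 0.51]
fraction_valid = validity_fraction(example_scores)
```

With the made-up scores above, three of five backbones meet the threshold, since a score of exactly 0.45 counts as passing under the ">=" criterion.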