Henrik Wiechers, Christopher J Williams, Benjamin Eltzner, Franziska Hoppe, Michael G Prisant, Vincent B Chen, Ezra Miller, Kanti V Mardia, Jane S Richardson, Stephan F Huckemann
bioRxiv: the preprint server for biology. Published 2025-03-05. DOI: 10.1101/2025.02.06.636803 (https://doi.org/10.1101/2025.02.06.636803). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11839040/pdf/
RNAprecis: Prediction of full-detail RNA conformation from the experimentally best-observed sparse parameters.
We address the problem of predicting high-detail RNA structure geometry from the information available in low-resolution experimental maps of electron density. Here, low resolution refers to ≥2.5 Å, where the locations of the phosphate groups and the glycosidic bonds can be determined from electron density but all other backbone atom positions cannot. In contrast, high resolution determines all backbone atomic positions. To this end, we first create a gold-standard database for four groups of manually corrected suites, each reflecting one of four sugar pucker-pair configurations. Second, we develop and employ a modified version of the previously devised algorithm MINT-AGE to learn clusters that are in high correspondence with the gold standard's conformational classes based on 3D RNA structure. Since some of the manually corrected classes are very small, the modified version of MINT-AGE is also able to identify very small clusters. Third, the new algorithm RNAprecis assigns low-resolution structures to newly designed 3D shape coordinates. Our improvements include (i) learned classes augmented to also cover very low sample sizes and (ii) regularization of a key distance by introducing an adaptive Mahalanobis distance. On a test dataset containing many clashing suites and suites modeled as conformational outliers, RNAprecis shows good results, suggesting that our learning method generalizes well. In particular, our modified MINT-AGE clustering can be finer than the existing curated gold-standard suite conformers. For example, the 0a conformer has been separated into two clusters seen in different structural contexts. Such new distinctions can have implications for the biochemical interpretation of RNA structure.
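The abstract does not spell out how the adaptive Mahalanobis distance is defined, so the following is only a generic illustration of why regularization matters when clusters are very small. It is a minimal sketch under stated assumptions, not the RNAprecis implementation: points are treated as plain Euclidean vectors, the covariance is shrunk toward an isotropic target, and the shrinkage weight d / (n + d) is a hypothetical choice. The names shrunk_covariance, mahalanobis and assign are likewise invented for this example.

```python
import numpy as np


def shrunk_covariance(points, strength=1.0):
    """Blend the sample covariance with an isotropic target.

    Assumption for illustration: the blend weight grows as the cluster
    gets smaller, so tiny clusters lean on the isotropic target while
    large clusters keep their own shape.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    centered = points - points.mean(axis=0)
    sample_cov = centered.T @ centered / max(n - 1, 1)
    # Hypothetical weight: near 1 when n << d, near 0 when n >> d.
    weight = strength * d / (n + strength * d)
    scale = np.trace(sample_cov) / d
    if scale == 0.0:
        scale = 1.0  # degenerate cluster (e.g. a single point)
    return (1.0 - weight) * sample_cov + weight * scale * np.eye(d)


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of x from mean under covariance cov."""
    diff = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def assign(x, clusters):
    """Return (label, distance) of the cluster closest to x."""
    best_label, best_dist = None, np.inf
    for label, pts in clusters.items():
        pts = np.asarray(pts, dtype=float)
        dist = mahalanobis(x, pts.mean(axis=0), shrunk_covariance(pts))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_clusters = {
        "large": rng.normal(size=(200, 3)),
        # Three points in 3D: the raw sample covariance is singular.
        "tiny": [[5.0, 5.0, 5.0], [5.2, 4.9, 5.1], [4.8, 5.1, 4.9]],
    }
    print(assign([5.1, 5.0, 5.0], demo_clusters))
```

With only three points in three dimensions, the unregularized sample covariance cannot be inverted, so an ordinary Mahalanobis distance to the small cluster is undefined. Shrinking toward a scaled identity keeps the matrix positive definite while leaving well-populated clusters almost unchanged.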