SmithHunter: a workflow for the identification of candidate smithRNAs and their targets.

BMC Bioinformatics (IF 3.3; Region 3, Biology; JCR Q2, Biochemical Research Methods) Pub Date: 2024-09-02 DOI: 10.1186/s12859-024-05909-0

Giovanni Marturano, Diego Carli, Claudio Cucini, Antonio Carapelli, Federico Plazzi, Francesco Frati, Marco Passamonti, Francesco Nardi

BMC Bioinformatics 25(1): 286. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370224/pdf/ Publisher link: https://doi.org/10.1186/s12859-024-05909-0

Citations: 0

Abstract

Background: SmithRNAs (Small MITochondrial Highly-transcribed RNAs) are a novel class of small RNA molecules that are encoded in the mitochondrial genome and regulate the expression of nuclear transcripts. Initial evidence for their existence came from the Manila clam Ruditapes philippinarum, in which they were first described and their activity was biologically validated through RNA injection experiments. Evidence for these RNAs in other species currently rests on small RNA sequencing alone. As a preliminary step toward characterizing smithRNAs across metazoan lineages, a dedicated, unified analytical workflow is needed.

Results: We propose a novel workflow specifically designed for smithRNAs. Sequence data (from small RNA sequencing) that map uniquely to the mitochondrial genome are clustered into putative smithRNAs and prefiltered based on their abundance, their presence in replicate libraries, and the conservation of their 5' and 3' transcription boundaries. Sequences that pass these filters are then compared to the untranslated regions of nuclear transcripts, and possible targets are identified based on seed pairing, overall match, and thermodynamic stability. Extensive supplementary information and graphics are produced to help characterize these molecules in the species of choice and to guide the operator through the analysis. The workflow was tested on the original Manila clam data. Under basic settings, the results of the original study are largely replicated. The effect of additional parameter customization (clustering threshold, stringency, minimum number of replicates, seed matching) was further evaluated.
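As a rough illustration of two of the steps described in the Results (prefiltering clusters by abundance and replicate presence, then scanning an untranslated region for seed pairing), a minimal Python sketch follows. It is not SmithHunter's code. The names `Cluster`, `passes_prefilter` and `seed_sites`, the thresholds `min_reads` and `min_replicates`, and the choice of nucleotides 2-8 as the seed (a miRNA-style convention) are all assumptions made for this example; the abstract does not specify them. Boundary conservation, overall match and thermodynamic stability are not modeled here.

```python
from dataclasses import dataclass

# RNA complement table: A<->U, C<->G
COMPLEMENT = str.maketrans("ACGU", "UGCA")


@dataclass
class Cluster:
    """Hypothetical record for a putative smithRNA cluster."""
    name: str
    sequence: str               # 5'->3' RNA sequence of the cluster
    reads_per_replicate: list   # read count in each replicate library


def passes_prefilter(cluster, min_reads=10, min_replicates=2):
    """Keep clusters that are abundant and seen in enough replicates.

    Both thresholds are illustrative placeholders, not SmithHunter defaults.
    """
    total = sum(cluster.reads_per_replicate)
    present = sum(1 for n in cluster.reads_per_replicate if n > 0)
    return total >= min_reads and present >= min_replicates


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(small_rna, utr, seed_start=1, seed_len=7):
    """Return 0-based UTR positions with perfect Watson-Crick seed pairing.

    The seed is assumed to be nucleotides 2-8 of the small RNA
    (0-based slice 1:8); this convention is an assumption of the sketch.
    """
    seed = small_rna[seed_start:seed_start + seed_len]
    site = reverse_complement(seed)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return positions


# Toy data, invented for the example
candidate = Cluster("c1", "AUGCGUACGUUAGC", [12, 0, 7])
utr = "CCCGUACGCAAAA"

if passes_prefilter(candidate):
    print(seed_sites(candidate.sequence, utr))  # prints [3]
```

A real target search would follow the seed scan with an assessment of overall complementarity and of duplex stability, which the abstract lists as further criteria; those steps would normally be delegated to dedicated tools rather than written by hand.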

Conclusions: The study of smithRNAs is still in its infancy and no dedicated analytical workflow is currently available. At its core, the SmithHunter workflow builds on the bioinformatic procedure originally applied to identify candidate smithRNAs in the Manila clam. The Manila clam data are currently the only evidence for smithRNAs that has been biologically validated and are therefore the natural starting point for characterizing smithRNAs in other species. The original analysis was adapted to current software implementations and some minor issues were resolved. Moreover, the workflow was improved by allowing the customization of several analytical parameters, mostly concerning stringency and the possibility of accounting for a minimal level of genetic differentiation among samples.




Source journal: BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70

Self-citation rate: 3.30%

Articles published: 506

Review time: 4.3 months

About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series, which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.