ETENLNC: An end to end lncRNA identification and analysis framework to facilitate construction of known and novel lncRNA regulatory networks

Computational Biology and Chemistry, Volume 112, Article 108140 (IF 3.1; Zone 4, Biology; JCR Q2, Biology). Pub Date: 2024-10-01. Epub Date: 2024-06-30. DOI: 10.1016/j.compbiolchem.2024.108140

Prangan Nath , Kaveri Bhuyan , Dhruba Kumar Bhattacharyya , Pankaj Barah


Citations: 0

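Abstract

Long non-coding RNAs (lncRNAs) play crucial roles in the regulation of gene expression and maintenance of genomic integrity through various interactions with DNA, RNA, and proteins. The availability of large-scale sequence data from various high-throughput platforms has opened possibilities to identify, predict, and functionally annotate lncRNAs. As a result, there is a growing demand for an integrative computational framework capable of identifying known lncRNAs, predicting novel lncRNAs, and inferring the downstream regulatory interactions of lncRNAs at the genome scale. We present ETENLNC (End-To-End-Novel-Long-NonCoding), a user-friendly, integrative, open-source, scalable, and modular computational framework for identifying and analyzing lncRNAs from raw RNA-Seq data. ETENLNC employs six stringent filtration steps to identify novel lncRNAs, performs differential expression analysis of mRNA and lncRNA transcripts, and predicts regulatory interactions between lncRNAs, mRNAs, miRNAs, and proteins. We benchmarked ETENLNC against six existing tools and optimized it for desktop workstations and high-performance computing environments using data from three different species. ETENLNC is freely available on GitHub: https://github.com/EvolOMICS-TU/ETENLNC.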
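The sequential filtering mentioned in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not ETENLNC's code: the abstract does not list the six filters or their thresholds, so the `Transcript` record, the three criteria (a minimum length of 200 nt, which is the conventional lncRNA cutoff; a minimum exon count; a maximum coding-potential score), and all values below are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for an assembled transcript (not ETENLNC's data model)."""
    transcript_id: str
    length_nt: int
    exon_count: int
    coding_score: float  # placeholder score from some coding-potential predictor


Filter = Callable[[Transcript], bool]

# Placeholder criteria: the source does not state the actual filters or thresholds.
FILTERS: Sequence[Filter] = (
    lambda t: t.length_nt >= 200,    # assumed minimum transcript length
    lambda t: t.exon_count >= 2,     # assumed minimum exon count
    lambda t: t.coding_score < 0.5,  # assumed coding-potential cutoff
)


def apply_filters(transcripts: Iterable[Transcript],
                  filters: Sequence[Filter] = FILTERS) -> List[Transcript]:
    """Apply each filter in order, keeping only transcripts that pass all of them."""
    survivors = list(transcripts)
    for keep in filters:
        survivors = [t for t in survivors if keep(t)]
    return survivors


# Made-up example transcripts, one failing each placeholder criterion.
candidates = [
    Transcript("tx1", 1500, 3, 0.10),
    Transcript("tx2", 150, 2, 0.05),   # too short
    Transcript("tx3", 900, 1, 0.20),   # single exon
    Transcript("tx4", 2400, 4, 0.95),  # likely coding
]
novel = apply_filters(candidates)
print([t.transcript_id for t in novel])
```

Keeping each criterion as a separate function in an ordered sequence mirrors the modular, step-by-step design the abstract describes, and makes it easy to count how many candidates each step removes.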




Source journal

Computational Biology and Chemistry (category: Biology; Computer Science, Interdisciplinary Applications)

CiteScore: 6.10
Self-citation rate: 3.20%
Articles published: 142
Review time: 24 days

About the journal: Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High-quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high-quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered. Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, complementary techniques such as molecular dynamics simulations with detailed free energy calculations should be used to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered. Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.