Genome-wide identification and evolutionary analysis of long non-coding RNAs in cereals

Ying Sun, W. L. Rogers, K. Devos, Liming Cai, R. Malmberg
Published in: IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), proceedings. Publication date: 2016-10-01.
DOI: 10.1109/ICCABS.2016.7802791 (https://doi.org/10.1109/ICCABS.2016.7802791)

Abstract

Using computational methods, we identified lncRNA candidates in four economically important cereals (Poaceae): 7,196 in Zea mays, 1,974 in Sorghum bicolor, 4,236 in Setaria italica and 2,542 in Oryza sativa; we then compared these RNAs across the species. Our approach involved screening a reference-guided transcriptome assembly of RNA-Seq data for RNAs that were at least 200 bases in length, had open reading frames of at most 70 amino acids, and lacked homology to entries in the UniProt database. A sequence composition analysis of the lncRNA candidates, in comparison to protein-coding transcripts, highlighted distinctive features, including a low GC content, a paucity of introns and a hexamer usage bias, consistent with what has been found for mammalian lncRNAs. RepeatMasker identified from 1% (rice) to 19% (maize) of the candidate lncRNAs as being transcribed from transposable elements, based on a dataset with 3,853 transposable elements. We compared the candidate lncRNAs with 25,141 miRNAs from miRBase, and found that fewer than 1% of them could be potential miRNA precursors. The cross-species comparisons, which included a sequence- and structure-based lncRNA homology search, synteny analysis, and lncRNA secondary structure prediction, uncovered some limited sequence similarity. In sub-regions, we predicted conserved secondary structures using covariation analysis. We used the comparative sequence and synteny analyses to predict the existence of lncRNAs in S. italica; experimental tests confirmed the presence of these RNAs. Our results are consistent with a model of very rapid evolution of lncRNAs.
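
The length and open-reading-frame screen described in the abstract can be illustrated with a minimal Python sketch. Only the two thresholds (at least 200 bases, at most 70 amino acids) come from the abstract. The function names, the six-frame scan, and the ORF definition (ATG start, stop codon not counted, ORFs running off the transcript end included) are assumptions made for illustration and may differ from the authors' pipeline. The UniProt homology step is not reproduced here.

```python
# Illustrative sketch only: thresholds are from the abstract, everything else
# (names, ORF definition, six-frame scan) is an assumption.

MIN_LENGTH_NT = 200   # "at least 200 bases in length"
MAX_ORF_AA = 70       # "at most 70 amino acids in open reading frames"

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_aa(seq: str) -> int:
    """Longest ORF, in amino acids, over all six reading frames.

    Assumed definition: an ORF starts at ATG and ends before the next
    in-frame stop codon (or at the end of the transcript).
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            current = 0  # codons in the ORF being extended; 0 means "not in an ORF"
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOP_CODONS:
                    best = max(best, current)
                    current = 0
                elif current > 0 or codon == "ATG":
                    current += 1
            best = max(best, current)  # ORF that runs off the transcript end
    return best


def gc_content(seq: str) -> float:
    """Fraction of G and C bases; used to compare candidates with coding transcripts."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def is_lncrna_candidate(seq: str) -> bool:
    """Apply the length and ORF thresholds; the homology filter is omitted."""
    return len(seq) >= MIN_LENGTH_NT and longest_orf_aa(seq) <= MAX_ORF_AA
```

In a sketch like this, a transcript passing `is_lncrna_candidate` would still need the homology filter before being counted as a candidate, and `gc_content` only hints at the composition comparison the abstract reports.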