Biological Pathway Analysis for De Novo Transcriptomes through Multiple Reference Species Selections

Chun-Cheng Liu, Chien-Ming Chen, Cing-Han Yang, Tun-Wen Pai, P. Lim, S. Phang, Sze-Wan Poong, Kok-Keong Lee
2016 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS)
Published: 2016-07-01
DOI: 10.1109/CISIS.2016.73 (https://doi.org/10.1109/CISIS.2016.73)
Citations: 3

Abstract

In de novo transcriptome analysis, gene mapping and genome annotation usually rely on the reference model species that is evolutionarily closest to the species under study. However, not every selected reference species has comprehensive genome annotations and curated information, and the number of genes that will map to it cannot be known in advance. When too few genes map to the selected reference species, the downstream functional pathway analysis of the transcriptome dataset is seriously affected. To address this problem, we propose an improved approach based on selecting multiple reference model species, aimed in particular at KEGG pathway analysis of differentially expressed genes. By taking the union of the genes mapped individually against each selected species, we significantly improve the completeness of gene mapping in KEGG pathways and obtain realistic P-values for each identified pathway. Furthermore, based on the mapped genes and KGML datasets, we use different gray levels, colors, and shapes to display gene expression conditions on each biological pathway. Using NGS transcriptomic datasets from an unknown Antarctic green alga as an experimental example, and selecting three published species (Chlamydomonas reinhardtii, Chlorella variabilis, and Coccomyxa subellipsoidea) as candidate reference species, we compared the pathway enrichment results obtained with different selections of reference species. Integrating the mapped genes from all reference species gave better results than using any single reference species: under an identical P-value threshold, important pathways that were otherwise missed could be retrieved, such as the Ribosome, Pyrimidine metabolism, and ABC transporters pathways. We therefore believe that appropriate selection of multiple reference species is necessary and significant for transcriptome analysis of de novo species.
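
The union step and its effect on pathway P-values can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: enrichment is scored with a one-sided hypergeometric test (a common choice, not necessarily the authors'), genes mapped against different species are already expressed in a shared ortholog identifier space, and every identifier, pathway membership, and the background size of 100 are invented toy values.

```python
from math import comb


def hypergeom_pvalue(population: int, pathway_size: int,
                     sample_size: int, hits: int) -> float:
    """Return P(X >= hits) for a hypergeometric draw without replacement.

    population   -- number of background genes (assumed value)
    pathway_size -- background genes that belong to the pathway
    sample_size  -- number of differentially expressed (DE) genes mapped
    hits         -- mapped DE genes that fall in the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def union_mapped(mapped_by_species: dict) -> set:
    """Union of the gene sets mapped individually against each species."""
    return set().union(*mapped_by_species.values())


def enrich(de_genes: set, pathways: dict, background_size: int) -> dict:
    """Score every pathway against one set of mapped DE genes."""
    return {
        name: hypergeom_pvalue(background_size, len(members),
                               len(de_genes), len(de_genes & members))
        for name, members in pathways.items()
    }


# Toy data: hypothetical ortholog IDs, not real KEGG identifiers.
mapped_by_species = {
    "Chlamydomonas reinhardtii": {"og1", "og2", "og9"},
    "Chlorella variabilis": {"og2", "og3", "og10"},
    "Coccomyxa subellipsoidea": {"og1", "og4", "og9"},
}
pathways = {"Ribosome": {"og1", "og2", "og3", "og4"}}  # hypothetical membership
BACKGROUND_SIZE = 100  # assumed number of background genes

# Enrichment with a single reference species versus the union of all three.
single = enrich(mapped_by_species["Chlamydomonas reinhardtii"],
                pathways, BACKGROUND_SIZE)
merged_genes = union_mapped(mapped_by_species)
merged = enrich(merged_genes, pathways, BACKGROUND_SIZE)

for name in pathways:
    print(f"{name}: single={single[name]:.3g}, union={merged[name]:.3g}")
```

In this toy case each species alone recovers only part of the pathway, while the union recovers all four members, so the pathway has a smaller P-value at the same threshold. Real data need not behave this way: the union also enlarges the sample, so a pathway gains significance only when the extra species add genes that fall inside it.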