Resolving the Gene Tree and Species Tree Problem by Phylogenetic Mining

Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI:10.1142/9781860947292_0032

Xiaoxu Han

Pages: 287-296

Citations: 1



Abstract

The gene tree and species tree problem remains a central problem in phylogenomics. To address it, gene concatenation approaches randomly combine a certain number of genes, drawn from a set of widely distributed orthologous genes selected from genome data, and run phylogenetic analysis on the concatenated result. Random concatenation, however, prevents further investigation of the inner structure of the gene data set used to infer the phylogenetic trees, and it cannot locate the most phylogenetically informative genes. This work describes a phylogenomic mining approach that gains knowledge from a gene data set by clustering its genes with a self-organizing map (SOM) to explore the data set's inner structure. The most phylogenetically informative gene set is then created by picking the maximum-entropy gene from each cluster, and this set is used to infer phylogenetic trees by phylogenetic analysis. On the same data set, the phylogenetic mining approach performs better than the random gene concatenation approach.
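The two steps named in the abstract, SOM clustering followed by picking the maximum-entropy gene from each cluster, can be sketched roughly as below. The abstract does not say how genes are encoded for the SOM, what map size is used, or how entropy is defined, so everything specific here is an assumption made for illustration: genes are encoded by nucleotide composition, the SOM is a tiny 1-D map written in NumPy, and entropy is the Shannon entropy of each sequence's character distribution. The names `composition`, `train_som`, `assign_clusters`, and `pick_max_entropy_genes` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

import numpy as np


def shannon_entropy(seq):
    """Shannon entropy (bits) of the character distribution of one sequence.
    Assumption: the paper's exact entropy definition is not given in the abstract."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def composition(seq, alphabet="ACGT"):
    """Assumed feature encoding: relative frequency of each nucleotide."""
    return np.array([seq.count(ch) / len(seq) for ch in alphabet])


def train_som(data, n_units=2, epochs=50, lr=0.5, seed=0):
    """Train a minimal 1-D self-organizing map; returns the unit weight vectors."""
    rng = np.random.default_rng(seed)
    # Initialise units from randomly chosen data points.
    weights = data[rng.choice(len(data), size=n_units, replace=False)].astype(float)
    grid = np.arange(n_units)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 towards 0
        sigma = max(n_units / 2.0 * frac, 0.5)  # shrinking neighbourhood width
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the unit closest to the sample.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighbourhood around the BMU on the 1-D grid.
            h = np.exp(-((grid - bmu) ** 2) / (2 * sigma ** 2))
            weights += (lr * frac) * h[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Label each sample with the index of its best-matching unit."""
    return [int(np.argmin(np.linalg.norm(weights - x, axis=1))) for x in data]


def pick_max_entropy_genes(names, seqs, labels):
    """Return {cluster label: name of the highest-entropy gene in that cluster}."""
    best = {}
    for name, seq, label in zip(names, seqs, labels):
        h = shannon_entropy(seq)
        if label not in best or h > best[label][1]:
            best[label] = (name, h)
    return {label: name for label, (name, _) in best.items()}


# Toy usage with made-up sequences (not data from the paper).
names = ["g1", "g2", "g3", "g4"]
seqs = ["AAAAAAAT", "AAAAAATT", "ACGTACGT", "ACGTAACC"]
features = np.array([composition(s) for s in seqs])
som_weights = train_som(features, n_units=2)
labels = assign_clusters(features, som_weights)
selected = pick_max_entropy_genes(names, seqs, labels)
print(labels, selected)
```

In a real analysis, the selected genes would then be concatenated and passed to a tree-inference method; that step is left out because the abstract does not name the method used.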
