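Unsupervised reduction of the dimensionality followed by supervised learning with a perceptron improves the classification of conditions in DNA microarray gene expression data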

L. Conde, Á. Mateos, Javier Herrero, J. Dopazo
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing
Published: 2002-11-07
DOI: 10.1109/NNSP.2002.1030019 (https://doi.org/10.1109/NNSP.2002.1030019)
Citations: 9

Abstract

This manuscript describes a combined approach, unsupervised clustering followed by supervised learning, that classifies conditions efficiently in DNA array gene expression experiments (in the cases shown, different cell lines including some cancer types). First, the dimensionality of the dataset of gene expression profiles is reduced to a number of non-redundant clusters of co-expressed genes using an unsupervised clustering algorithm, the Self-Organizing Tree Algorithm (SOTA), a hierarchical version of Self-Organizing Maps (SOM). Then, the average values of these clusters are used to train a perceptron that produces a very efficient classification of the conditions. This way of reducing the dimensionality of the dataset seems to perform better than previously proposed alternatives such as principal component analysis (PCA). In addition, the weights that connect the gene clusters to the different experimental conditions can be used to assess the relative importance of the genes in defining these classes. Finally, Gene Ontology (GO) terms are used to infer a possible biological role for these groups of genes and to assess the validity of the classification from a biological point of view.
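The pipeline outlined in the abstract (cluster averages as inputs, a perceptron as classifier, weights as an importance measure) can be illustrated with a minimal Python sketch. It is not the authors' implementation: SOTA itself is not implemented here, so the per-gene cluster labels are simply assumed to be given, the toy expression values are invented for illustration, and the function names (`cluster_averages`, `train_perceptron`, `predict`, `rank_clusters`) are hypothetical. The classifier is a plain multi-class perceptron, which may differ from the network used in the paper.

```python
# Minimal sketch: cluster averages -> perceptron -> weight-based ranking.
# Assumption: cluster labels come from some clustering step (SOTA in the
# paper); here they are hard-coded. All data below are invented toy values.

def cluster_averages(expr, labels, n_clusters):
    """Reduce a genes x conditions matrix to a conditions x clusters matrix
    holding the mean expression of each gene cluster in each condition."""
    n_cond = len(expr[0])
    sums = [[0.0] * n_clusters for _ in range(n_cond)]
    counts = [0] * n_clusters
    for profile, k in zip(expr, labels):
        counts[k] += 1
        for c, value in enumerate(profile):
            sums[c][k] += value
    return [[s / counts[k] if counts[k] else 0.0 for k, s in enumerate(row)]
            for row in sums]

def predict(W, b, x):
    """Return the class whose linear score w.x + b is largest."""
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + bc for w, bc in zip(W, b)]
    return max(range(len(scores)), key=scores.__getitem__)

def train_perceptron(X, y, n_classes, epochs=100, lr=0.1):
    """Multi-class perceptron rule: on a mistake, move the true class's
    weights toward the input and the predicted class's weights away."""
    W = [[0.0] * len(X[0]) for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, y):
            pred = predict(W, b, x)
            if pred != target:
                errors += 1
                for j, xj in enumerate(x):
                    W[target][j] += lr * xj
                    W[pred][j] -= lr * xj
                b[target] += lr
                b[pred] -= lr
        if errors == 0:  # every training condition classified correctly
            break
    return W, b

def rank_clusters(W, class_idx):
    """Order clusters by absolute weight for one class, largest first."""
    weights = W[class_idx]
    return sorted(range(len(weights)), key=lambda k: abs(weights[k]), reverse=True)

# Toy data: 6 genes x 4 conditions; conditions 0-1 are class 0, 2-3 class 1.
expr = [
    [2.0, 1.8, -1.9, -2.1],   # cluster 0: high in class 0
    [2.2, 2.0, -2.0, -1.8],   # cluster 0
    [-1.0, -1.2, 1.1, 0.9],   # cluster 1: high in class 1
    [-0.8, -1.0, 0.9, 1.1],   # cluster 1
    [0.1, -0.1, 0.0, 0.1],    # cluster 2: uninformative
    [-0.1, 0.1, 0.0, -0.1],   # cluster 2
]
labels = [0, 0, 1, 1, 2, 2]   # assumed output of the clustering step
y = [0, 0, 1, 1]

X = cluster_averages(expr, labels, n_clusters=3)
W, b = train_perceptron(X, y, n_classes=2)
predictions = [predict(W, b, x) for x in X]
ranking = rank_clusters(W, class_idx=0)
print(predictions, ranking)
```

In this toy setting the uninformative cluster averages to zero in every condition, so its weights are never updated and it ranks last. This mirrors the abstract's point that the connection weights can indicate which gene clusters matter for defining a class.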