Gene Selection for Cancer Classification from Microarray Data Using Data Overlap Measure

Saeed Sarbazi-Azad, M. S. Abadeh
2018 25th National and 3rd International Iranian Conference on Biomedical Engineering (ICBME), published 2018-11-01
DOI: 10.1109/ICBME.2018.8703565 (https://doi.org/10.1109/ICBME.2018.8703565)
Citations: 1

Abstract

Cancer detection is one of the major applications of clinical microarray data. High dimensionality is one of the main challenges in microarray analysis: most genes in a microarray contribute little or nothing to class prediction, while processing so many genes demands substantial computing resources and memory. Reducing the number of dimensions therefore appears essential for predicting cancer. This paper presents a gene selection method that applies data complexity measures to microarray gene expression cancer data. Two overlap measures from the data complexity literature, the Fisher discriminant ratio and attribute efficiency, are used to rank the genes, and the highest-ranked genes are taken as the important contributors to cancer diagnosis. Five well-known binary microarray cancer datasets are used for evaluation, with Decision Tree (DT), Naïve Bayes (NB), and K-Nearest Neighbor (KNN) as the classifiers. Two approaches are considered: Fisher-based and (attribute + Fisher)-based gene selection. The results indicate that a model built from genes selected by the Fisher-based method can detect cancerous samples with high accuracy.
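
The ranking step described in the abstract can be sketched as follows. The abstract gives no formulas, so this is a minimal illustration that assumes the commonly used two-class definitions of the overlap measures: a Fisher discriminant ratio of (mean_0 - mean_1)^2 / (var_0 + var_1) per gene, and an attribute efficiency equal to the fraction of samples falling outside the interval where the two classes overlap on that gene. The function names, the 0/1 label encoding, and the toy expression matrix are hypothetical, and how the paper combines the two measures in its (attribute + Fisher) approach is not stated here, so it is not modelled.

```python
from statistics import mean, pvariance


def _split(values, labels):
    """Split one gene's expression values by binary class label (assumed 0/1)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return a, b


def fisher_ratio(values, labels):
    """Assumed definition: squared mean difference over summed class variances."""
    a, b = _split(values, labels)
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        # Both classes are constant: perfectly separable unless the means match.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / denom


def attribute_efficiency(values, labels):
    """Assumed definition: share of samples outside the class-overlap interval."""
    a, b = _split(values, labels)
    lo = max(min(a), min(b))  # lower edge of the overlap interval
    hi = min(max(a), max(b))  # upper edge of the overlap interval
    outside = sum(1 for v in values if v < lo or v > hi)
    return outside / len(values)


def rank_genes(X, labels, score=fisher_ratio, top_k=2):
    """Return indices of the top_k genes by score; X is a list of sample rows."""
    n_genes = len(X[0])
    scores = [score([row[g] for row in X], labels) for g in range(n_genes)]
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order[:top_k]


# Toy data (hypothetical): 4 samples x 3 genes, two samples per class.
X = [
    [1.0, 5.0, 3.0],
    [1.2, 1.0, 3.1],
    [5.0, 5.2, 3.0],
    [5.2, 1.1, 3.1],
]
labels = [0, 0, 1, 1]

selected = rank_genes(X, labels, score=fisher_ratio, top_k=2)
print(selected)
```

In this toy matrix only the first gene separates the two classes, so it ranks first under either measure. The selected column indices would then be used to subset the expression matrix before training a classifier such as DT, NB, or KNN.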