Ensemble learning-based classification of microarray cancer data on tree-based features

Cognitive Computation and Systems (IF 1.2; JCR Q4, Computer Science, Artificial Intelligence) Pub Date: 2021-02-25 DOI: 10.1049/ccs2.12003
Guesh Dagnew, B.H. Shekar
Full text: https://onlinelibrary.wiley.com/doi/10.1049/ccs2.12003
PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12003
Citations: 12

Abstract

Cancer is a group of related diseases with a high mortality rate, characterized by abnormal cell growth that attacks body tissues. Microarray cancer data is a prominent research topic across many disciplines, with work focused on problems related to the curse of dimensionality, small sample sizes, noisy data and class imbalance. This work proposes random forest (RF) tree-based feature selection combined with ensemble learning based on hard voting and soft voting to classify microarray cancer data using six different base classifiers. The features selected by the RF tree are submitted to the base classifiers as the training set. An ensemble learning method is then applied to the base classifiers, with each base classifier predicting the class label individually. The final prediction is made by hard and soft voting, which use majority voting and weighted probability, respectively, on the test set. The proposed ensemble learning method is validated on eight standard microarray cancer datasets, of which three are binary-class and the remaining five are multi-class. Experimental results show a classification accuracy of 1.00 on six of the datasets and 0.96 on the other two.
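
The difference between the two voting rules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hard_vote` and `soft_vote`, the equal classifier weights, and the six probability vectors below are all assumptions chosen for illustration, and the RF feature-selection step is omitted.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Majority vote over the labels predicted by the base classifiers."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(probabilities, weights=None):
    """Weighted average of per-class probabilities.

    Returns the index of the class with the highest averaged probability
    together with the averaged probability vector.
    """
    if weights is None:
        # Assumption: every base classifier gets the same weight.
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Hypothetical class probabilities from six base classifiers for ONE test
# sample in a binary problem (class 0 vs class 1). These numbers are made up.
probs = [
    [0.95, 0.05],
    [0.90, 0.10],
    [0.45, 0.55],
    [0.40, 0.60],
    [0.45, 0.55],
    [0.48, 0.52],
]

# Each classifier's individual prediction is its most probable class.
labels = [max(range(len(p)), key=lambda c: p[c]) for p in probs]

hard_label = hard_vote(labels)
soft_label, avg_probs = soft_vote(probs)
print("hard vote:", hard_label)
print("soft vote:", soft_label, avg_probs)
```

In this toy case the two rules disagree: four of the six classifiers lean slightly towards class 1, so the majority vote picks class 1, while the two confident classifiers pull the averaged probability towards class 0. Which rule performs better on real microarray data is an empirical question that this sketch does not address.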

Source journal: Cognitive Computation and Systems (Computer Science, Computer Science Applications)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 39
Review time: 10 weeks