Krill Herd Optimization algorithm for cancer feature selection and random forest technique for classification

R. R. Rani, D. Ramyachitra
DOI: 10.1109/ICSESS.2017.8342875
Published in: 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS)
Publication date: 2017-11-01
URL: https://doi.org/10.1109/ICSESS.2017.8342875
Citations: 16

Abstract

Cancer feature selection and classification is a common task in computational molecular biology. Feature selection and classification can identify a gene, or a list of genes, that causes cancer, which in turn can guide patient treatment and drug discovery targeting those genes. Performing feature selection and classification on microarray gene expression data is computationally difficult, and identifying the exact biologically relevant genes that cause cancer remains a challenging problem. In this work, three methods have been proposed. The first combines the Fish Swarm Optimization (FSO) algorithm with both Support Vector Machine (SVM) and Random Forest (RF) classifiers for cancer feature selection and classification. These FSO-based methods removed very few features from the datasets, so they are treated as existing methods in this work. The second proposed method uses an enhanced Krill Herd Optimization (KHO) technique to select genes and the Random Forest technique to classify cancer types. Random Forest was chosen for its high classification accuracy. First, a subset of features is selected using KHO; the Random Forest classifier is then applied to the selected features. Ten gene microarray cancer datasets were used to evaluate the efficiency of the proposed method. The proposed KHO/RF method is compared with well-known existing methods: PSO/SVM, PSO/RF, FSO/SVM and FSO/RF. The proposed method outperforms these existing methods, achieving 100% accuracy on most datasets.
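
The two-stage pipeline described in the abstract (a swarm search proposes gene subsets, a classifier scores them) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' enhanced KHO: the search is a simplified swarm-style bit-mask search loosely inspired by krill herd motion (attraction to the best solution, inertia, random diffusion), and a leave-one-out nearest-centroid classifier stands in for Random Forest so that the example needs only the Python standard library. All function names, parameters, and the synthetic data are hypothetical.

```python
import random


def make_toy_data(n_samples=30, n_features=12, n_informative=2, seed=1):
    """Synthetic two-class 'expression' data; only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 4.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def loo_centroid_accuracy(X, y, mask):
    """Leave-one-out nearest-centroid accuracy on the features where mask == 1.

    Stand-in fitness (assumption): the paper scores subsets with Random Forest.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, sample in enumerate(X):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += row[j]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label):
            return sum((sample[j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=sq_dist) == y[i]
    return correct / len(X)


def swarm_select(X, y, fitness, n_agents=8, n_iter=15, penalty=0.01, seed=0):
    """Simplified swarm search over binary feature masks (not the paper's KHO)."""
    rng = random.Random(seed)
    n = len(X[0])

    def score(mask):
        # Reward accuracy, lightly penalise the fraction of features kept.
        return fitness(X, y, mask) - penalty * sum(mask) / n

    swarm = [[int(rng.random() < 0.5) for _ in range(n)] for _ in range(n_agents)]
    scores = [score(m) for m in swarm]
    b = max(range(n_agents), key=scores.__getitem__)
    best, best_score = swarm[b][:], scores[b]
    for _ in range(n_iter):
        for a in range(n_agents):
            cand = []
            for j in range(n):
                r = rng.random()
                if r < 0.6:
                    cand.append(best[j])          # attraction to best-so-far mask
                elif r < 0.9:
                    cand.append(swarm[a][j])      # inertia: keep own bit
                else:
                    cand.append(1 - swarm[a][j])  # random diffusion: flip bit
            s = score(cand)
            if s >= scores[a]:
                swarm[a], scores[a] = cand, s
            if s > best_score:
                best, best_score = cand[:], s
    return best, best_score


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, value = swarm_select(X, y, loo_centroid_accuracy)
    print("selected features:", [j for j, bit in enumerate(mask) if bit])
    print("penalised score:", round(value, 3))
```

Because `swarm_select` takes the fitness as a callable, a Random Forest based score (for example, cross-validated accuracy on the masked columns) could be passed in place of `loo_centroid_accuracy` to move closer to the setup the abstract describes.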