Predicting breast cancer recurrence using effective classification and feature selection technique

Ahmed Iqbal Pritom, Md. Ahadur Rahman Munshi, S. Sabab, Shihabuzzaman Shihab
DOI: 10.1109/ICCITECHN.2016.7860215 (https://doi.org/10.1109/ICCITECHN.2016.7860215)
Published in: 2016 19th International Conference on Computer and Information Technology (ICCIT)
Publication date: 2016-12-01
Citations: 62

Abstract

Breast cancer is a major threat to middle-aged women throughout the world and is currently the second leading cause of cancer death in women. Early detection and prevention, however, can significantly reduce the chance of death. An important part of breast cancer prognosis is estimating the probability that the cancer will recur. This paper aims to estimate the probability of breast cancer recurrence using different data mining techniques, and we also present a novel approach to improve the accuracy of those models. Patient data were collected from the Wisconsin dataset of the UCI Machine Learning Repository. The dataset contains 35 attributes in total, to which we applied the Naive Bayes, C4.5 Decision Tree, and Support Vector Machine (SVM) classification algorithms and calculated their prediction accuracy. An efficient feature selection algorithm helped us improve the accuracy of each model by removing some lower-ranked attributes. These attributes not only contribute very little, but their inclusion also misleads the classification algorithms. After a careful selection of the higher-ranked attributes, we found a much improved accuracy for all three algorithms.
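The rank-then-prune idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name its feature selection algorithm, so a Fisher-style score stands in for it; a small Gaussian Naive Bayes stands in for the three classifiers; and the data are synthetic rather than the Wisconsin dataset. All function names are hypothetical.

```python
import math
import random
import statistics


def make_data(n_per_class=200, n_informative=2, n_noise=3, seed=0):
    """Synthetic stand-in data: informative features shift with the class, noise does not."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            rows.append(informative + noise)
            labels.append(label)
    return rows, labels


def fisher_scores(rows, labels):
    """Assumed ranking criterion: squared class-mean gap over the summed class variances."""
    scores = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == 0]
        b = [r[j] for r, y in zip(rows, labels) if y == 1]
        gap = statistics.fmean(a) - statistics.fmean(b)
        scores.append(gap * gap / (statistics.variance(a) + statistics.variance(b)))
    return scores


def fit_gnb(rows, labels):
    """Gaussian Naive Bayes: per class, store log prior, feature means, feature variances."""
    model = {}
    for c in set(labels):
        cols = list(zip(*[r for r, y in zip(rows, labels) if y == c]))
        prior = math.log(len(cols[0]) / len(rows))
        means = [statistics.fmean(col) for col in cols]
        variances = [statistics.variance(col) for col in cols]
        model[c] = (prior, means, variances)
    return model


def predict_gnb(model, row):
    """Return the class with the highest log posterior under feature independence."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
    return max(model, key=log_posterior)


def accuracy(model, rows, labels):
    return sum(predict_gnb(model, r) == y for r, y in zip(rows, labels)) / len(rows)


def select(rows, keep):
    """Keep only the listed feature columns."""
    return [[r[j] for j in keep] for r in rows]


# 70/30 train/test split with a fixed shuffle so the run is repeatable.
rows, labels = make_data()
order = list(range(len(rows)))
random.Random(1).shuffle(order)
cut = int(0.7 * len(order))
X_tr, y_tr = [rows[i] for i in order[:cut]], [labels[i] for i in order[:cut]]
X_te, y_te = [rows[i] for i in order[cut:]], [labels[i] for i in order[cut:]]

# Rank features on the training split only, then keep the two highest-ranked ones.
scores = fisher_scores(X_tr, y_tr)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
top = ranking[:2]

acc_all = accuracy(fit_gnb(X_tr, y_tr), X_te, y_te)
acc_top = accuracy(fit_gnb(select(X_tr, top), y_tr), select(X_te, top), y_te)
print("feature ranking:", ranking)
print("accuracy with all features:", round(acc_all, 3))
print("accuracy with top-ranked features:", round(acc_top, 3))
```

Ranking on the training split alone keeps the held-out accuracy an honest estimate. Whether pruning actually helps depends on the data and the classifier, so the sketch prints both accuracies for comparison instead of assuming an improvement.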