Predicting breast cancer recurrence using effective classification and feature selection technique

2016 19th International Conference on Computer and Information Technology (ICCIT) | Pub Date: 2016-12-01 | DOI: 10.1109/ICCITECHN.2016.7860215

Ahmed Iqbal Pritom, Md. Ahadur Rahman Munshi, S. Sabab, Shihabuzzaman Shihab


Citations: 62

Abstract

Breast cancer is a major threat to middle-aged women throughout the world and is currently the second most threatening cause of cancer death in women. Early detection and prevention, however, can significantly reduce the chance of death. An important part of breast cancer prognosis is estimating the probability of cancer recurrence. This paper aims to predict breast cancer recurrence using different data mining techniques. We also propose a novel approach to improve the accuracy of those models. Patient data were collected from the Wisconsin dataset in the UCI Machine Learning Repository. The dataset contains 35 attributes in total, to which we applied the Naive Bayes, C4.5 Decision Tree and Support Vector Machine (SVM) classification algorithms and calculated their prediction accuracy. An efficient feature selection algorithm helped us improve the accuracy of each model by removing some lower-ranked attributes. Not only do these attributes contribute very little, but including them also misleads the classification algorithms. After a careful selection of the upper-ranked attributes, we found a much improved accuracy rate for all three algorithms.
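
The abstract does not name the feature selection algorithm, so the following is only a minimal sketch of the general idea of ranking attributes and dropping the lower-ranked ones. It assumes information gain as the ranking criterion, which is a common choice but not necessarily the authors'. The data, the attribute values and the function names (`information_gain`, `rank_attributes`, `keep_top`) are all hypothetical and are not taken from the paper or from the Wisconsin dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(column, labels):
    """Entropy reduction obtained by splitting `labels` on the values in `column`."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Return (attribute index, gain) pairs, highest gain first."""
    n_attrs = len(rows[0])
    gains = [(j, information_gain([r[j] for r in rows], labels))
             for j in range(n_attrs)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


def keep_top(rows, ranking, k):
    """Keep only the k highest-ranked attributes, in their original column order."""
    kept = sorted(j for j, _ in ranking[:k])
    return [[r[j] for j in kept] for r in rows], kept


# Hypothetical toy data: R = recurrence, N = no recurrence.
# Attribute 0 separates the classes, attribute 1 helps a little,
# attribute 2 carries no information about the class.
labels = ["R", "R", "R", "R", "N", "N", "N", "N"]
rows = [
    ["hi", "a", "x"], ["hi", "a", "y"], ["hi", "a", "x"], ["hi", "b", "y"],
    ["lo", "a", "x"], ["lo", "b", "y"], ["lo", "b", "x"], ["lo", "b", "y"],
]

ranking = rank_attributes(rows, labels)
reduced, kept = keep_top(rows, ranking, k=2)
print("ranking:", ranking)
print("kept attribute indices:", kept)
```

In this sketch the reduced rows would then be passed to whichever classifier is being evaluated (Naive Bayes, C4.5 or SVM in the abstract). The functions treat attribute values as categories, so any continuous attribute would need to be binned first.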
