Efficient breast cancer detection using sequential feature selection techniques

2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS). Publication date: 2015-12-01. DOI: 10.1109/INTELCIS.2015.7397261

Taha M. Mohamed

Pages: 458-464

Citations: 5

Abstract

Breast cancer is one of the most dangerous cancers in the world, especially in the Arab countries and Egypt. Because the disease is so widespread, automatic recognition systems can help physicians classify tumors as benign or malignant. However, performing many pathological analyses consumes time and money. In this paper, we propose an algorithm for reducing the number of features required to detect a tumor. Two classifiers, linear and quadratic, are used to test classification accuracy. The experimental results show strong correlations between the features in the data set. With the sequential feature selection algorithm, discarding more than 50% of the features causes no significant loss of classification accuracy when the quadratic discriminant classifier is used. Additionally, with the linear discriminant classifier, four PCA components give the same accuracy as nine components. The outliers in the data set have no notable effect on classification accuracy. The data set is shown to be homogeneous using the k-means clustering algorithm.
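The abstract does not say which variant of sequential feature selection was used, how candidate subsets were scored, or give any data, so the following is only a minimal sketch of the general idea under stated assumptions. It uses greedy forward selection, a nearest-centroid classifier as a simple stand-in for the linear and quadratic discriminant classifiers, and synthetic data in which one feature is nearly a copy of another (to mimic strongly correlated features). The names `make_data`, `centroid_accuracy`, `sequential_forward_selection`, and the `min_gain` threshold are hypothetical and do not come from the paper.

```python
import random
from statistics import mean


def make_data(n=400, seed=0):
    """Synthetic two-class data (an assumption, not the paper's data set).

    Features 0 and 1 are informative, feature 2 is nearly a copy of
    feature 0 (strong correlation), and feature 3 is pure noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        f0 = rng.gauss(2.0 * label, 1.0)
        f1 = rng.gauss(2.0 * label, 1.0)
        f2 = f0 + rng.gauss(0.0, 0.1)
        f3 = rng.gauss(0.0, 1.0)
        X.append([f0, f1, f2, f3])
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    correct = 0
    for x, t in zip(X_test, y_test):
        # Predict the class whose centroid is closest in squared distance.
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_test)


def sequential_forward_selection(X_train, y_train, X_val, y_val, min_gain=0.01):
    """Greedily add the feature that most improves validation accuracy.

    Stops when the best remaining feature improves accuracy by less
    than `min_gain` (a hypothetical stopping rule).
    """
    remaining = list(range(len(X_train[0])))
    selected, best = [], 0.0
    while remaining:
        scores = {
            f: centroid_accuracy(X_train, y_train, X_val, y_val, selected + [f])
            for f in remaining
        }
        f_best = max(scores, key=scores.get)
        if scores[f_best] - best < min_gain:
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = scores[f_best]
    return selected, best


X, y = make_data()
half = len(X) // 2
selected, acc = sequential_forward_selection(X[:half], y[:half], X[half:], y[half:])
full_acc = centroid_accuracy(X[:half], y[:half], X[half:], y[half:], [0, 1, 2, 3])
print("selected features:", selected)
print("accuracy with selected features:", acc)
print("accuracy with all features:", full_acc)
```

Comparing the two printed accuracies shows whether the reduced subset costs anything relative to using every feature, which is the kind of comparison the abstract reports. A backward variant would instead start from all features and drop the least useful one at each step.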


