Parametric optimization and comparative study of machine learning and deep learning algorithms for breast cancer diagnosis
Authors: Parul Jain, Shalini Aggarwal, Sufiyan Adam, Mohsin Imam
Journal: Breast Disease (Journal Article), published 2024-01-01
DOI: 10.3233/BD-240018 (https://doi.org/10.3233/BD-240018)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11492030/pdf/
Citations: 0
Abstract
Breast cancer is the leading form of cancer in women and a major cause of mortality among them. However, manual diagnosis of the disease is time-consuming and often limited by the availability of screening systems. There is therefore a pressing need for an automatic diagnosis system that can detect cancer quickly in its early stages. Data mining and machine learning techniques have emerged as valuable tools for developing such a system. In this study we investigated the performance of several machine learning models on the Wisconsin Breast Cancer (original) dataset, with particular emphasis on identifying which models perform best for breast cancer diagnosis. The study also contrasts the proposed ANN methodology with conventional machine learning techniques, and compares the methods employed here with those used in earlier research on the Wisconsin Breast Cancer dataset. The findings are in line with previous studies, which also highlighted the efficacy of SVM, Decision Tree, CART, ANN, and ELM ANN for breast cancer detection. Several classifiers achieved high accuracy, precision, and F1 scores for both benign and malignant tumours. Models with hyperparameter tuning performed better than those without, and boosting methods such as XGBoost, AdaBoost, and Gradient Boost consistently performed well across benign and malignant tumours. The study emphasizes the significance of hyperparameter tuning and the efficacy of boosting algorithms in addressing the complexity and nonlinearity of the data. It also provides a detailed summary of the current status of research on breast cancer diagnosis using the Wisconsin Breast Cancer (original) dataset.
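The abstract's point about hyperparameter tuning and per-class scores can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the Wisconsin Breast Cancer dataset), a small k-nearest-neighbour classifier stands in for the models named in the abstract, and the helper names (`make_toy_data`, `knn_predict`, `per_class_scores`, `cross_val_accuracy`, `grid_search`) are hypothetical. It shows only the general pattern of choosing a hyperparameter by cross-validation on the training split and then reporting precision, recall, and F1 separately for each class on held-out data.

```python
import random
from collections import Counter


def make_toy_data(n=120, seed=0):
    """Synthetic two-class data for illustration; NOT the Wisconsin dataset."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2  # toy labels: 0 = benign, 1 = malignant
        centre = 3.0 if label == 0 else 6.0
        features = [rng.gauss(centre, 1.5) for _ in range(3)]
        rows.append((features, label))
    rng.shuffle(rows)
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def per_class_scores(y_true, y_pred, cls):
    """Return (precision, recall, F1) treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def cross_val_accuracy(rows, k, folds=5):
    """Accuracy over all rows, each predicted by a model fit on the other folds."""
    correct = 0
    for f in range(folds):
        test = rows[f::folds]  # rows whose index i satisfies i % folds == f
        train = [r for i, r in enumerate(rows) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(rows)


def grid_search(rows, candidates, folds=5):
    """Pick the candidate k with the highest cross-validated accuracy."""
    scores = {k: cross_val_accuracy(rows, k, folds) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


data = make_toy_data()
train, test = data[:90], data[90:]  # hold out 30 rows for the final report

# Odd values of k avoid tied votes in a two-class problem.
best_k, cv_scores = grid_search(train, candidates=[1, 3, 5, 7, 9])

y_true = [y for _, y in test]
y_pred = [knn_predict(train, x, best_k) for x, _ in test]

print("cross-validated accuracy per k:", cv_scores)
print("selected k:", best_k)
for cls, name in [(0, "benign"), (1, "malignant")]:
    p, r, f1 = per_class_scores(y_true, y_pred, cls)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Selecting `best_k` on the training split only, and scoring once on the held-out rows, keeps the reported per-class figures from being inflated by the tuning step. The same pattern would apply to the tuned and untuned models the abstract compares, with a library implementation in place of the toy classifier.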
About the journal:
The recent expansion of work in the field of breast cancer will inevitably hasten discoveries that have an impact on patient outcomes. The breadth of this research, which spans basic science, clinical medicine, epidemiology, and public policy, poses difficulties for investigators: it is necessary not only to be facile in comprehending ideas from many disciplines but also to understand the public implications of these discoveries. Breast Disease publishes review issues devoted to an in-depth analysis of the scientific and public implications of recent research on a specific problem in breast cancer. Thus, the reviews not only discuss recent discoveries but also reflect on their impact on breast cancer research or clinical management.