A supervised classification phenotyping approach using machine learning for patients diagnosed with primary breast cancer

i-manager's Journal on Computer Science, Pub Date: 1900-01-01, DOI: 10.26634/jcom.11.1.19374

A. Bashir, Ullah Burhan, Sardar Fouzia, Junaid Hazrat, Zaman Khan Gul


Citations: 2



A supervised classification phenotyping approach using machine learning for patients diagnosed with primary breast cancer

This paper presents a methodology for the early detection and diagnosis of breast cancer using the Wisconsin dataset. The methodology has four main steps: data collection, preprocessing, feature selection, and classification. The fine needle aspiration technique is used to extract the ultrasound image features of breast cancer, and preprocessing eliminates outliers, null values, and noise. Redundant parameters are removed during feature selection to improve accuracy. Six machine learning algorithms are used to classify the breast cancer dataset: Logistic Regression, Support Vector Machine, K-Nearest Neighbor, Random Forest, Decision Tree, and Gaussian Naive Bayes. Support Vector Machine and K-Nearest Neighbor achieved the highest accuracy; Logistic Regression, Gaussian Naive Bayes, Random Forest, and Decision Tree scored lower. The proposed methodology could aid the timely detection and diagnosis of breast cancer and help doctors select the optimal clinical treatment plan for their patients. Future work will investigate whether additional preprocessing algorithms improve classification accuracy on the breast cancer dataset.
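The abstract does not say how redundant parameters are detected or how the classifiers are configured. As a rough illustration only, the following standard-library Python sketch shows one common way such a pipeline could look: a correlation filter that drops a feature nearly duplicating an earlier one, followed by a K-Nearest Neighbor vote on standardized features. The correlation threshold (0.95), k = 3, the function names, and every number in the toy table are assumptions made for this example; they are not taken from the paper or from the Wisconsin dataset.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def drop_redundant(rows, threshold=0.95):
    """Return the column indices to keep. A column is dropped when its absolute
    correlation with an already kept column reaches `threshold` (assumed value)."""
    columns = list(zip(*rows))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def standardize(train, test):
    """Z-score both sets using the mean and standard deviation of the training set."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Invented toy rows: (radius, perimeter, texture). Perimeter is roughly
# 2 * pi * radius, so it carries almost no information beyond radius.
train_rows = [
    [12.0, 75.4, 14.0], [11.5, 72.3, 18.0], [13.0, 81.7, 16.0], [12.5, 78.5, 20.0],
    [18.0, 113.1, 21.0], [19.5, 122.5, 25.0], [17.0, 106.8, 19.0], [20.0, 125.7, 23.0],
]
train_labels = ["B", "B", "B", "B", "M", "M", "M", "M"]  # B = benign, M = malignant
test_rows = [[12.2, 76.7, 17.0], [18.5, 116.2, 22.0]]

# Feature selection is fitted on the training rows only, then applied to both sets.
kept = drop_redundant(train_rows)
train_sel = [[row[j] for j in kept] for row in train_rows]
test_sel = [[row[j] for j in kept] for row in test_rows]

train_std, test_std = standardize(train_sel, test_sel)
predictions = [knn_predict(train_std, train_labels, x) for x in test_std]

print("kept feature indices:", kept)
print("predictions:", predictions)
```

Fitting both the filter and the scaling on the training rows alone keeps information from the held-out rows out of the model, which matters when accuracy figures for several classifiers are compared. In practice a library such as scikit-learn would usually replace these hand-written helpers.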


Source journal

i-manager's Journal on Computer Science

Self-citation rate

0.00%

Number of articles published