Breast Cancer Detection with Revamped Dataset Using Machine Learning Techniques

Sundarambal Balaraman, Ramesh Ramamoorthy, R. Krishnamoorthi
Journal: J. Medical Imaging Health Informatics
Publication date: 2021-12-01
DOI: 10.1166/jmihi.2021.3892 (https://doi.org/10.1166/jmihi.2021.3892)

Abstract

Machine learning is an active topic in research and industry, with novel strategies being introduced continually. The main purpose of this work is to determine how effective machine learning techniques are at detecting breast cancer. The incidence and mortality of breast cancer in women are increasing steadily. Researchers worldwide have worked to give clinicians the best possible model for detecting and diagnosing breast cancer. In this work, the Wisconsin breast cancer dataset from the UCI Machine Learning Repository is modeled, and the resulting performance is analyzed and compared with existing work on the same dataset. The dataset is analyzed, and a revamped dataset is constructed by eliminating redundant features and appending new features essential for prediction. Eight prediction models are built on the revamped dataset using machine learning algorithms: logistic regression, K-nearest neighbors (KNN), support vector machine (SVM), decision tree, random forest, XGBoost, AdaBoost, and an artificial neural network (ANN). The models are assessed using standard measures of accuracy. In the experiments, these classifiers detect breast cancer with greater than 97% accuracy. Logistic regression, XGBoost, and AdaBoost rank highest, with 99.28% accuracy. The experiments also show that removing outliers and balancing the dataset have a significant impact on the models' prediction performance.
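The abstract does not say how redundant features were identified or how accuracy was computed. As a minimal sketch of one common approach, and not the authors' method, the Python below drops any feature whose absolute Pearson correlation with an already-kept feature reaches a threshold, and computes accuracy as the fraction of correct predictions. The function names, the 0.95 threshold, and the toy values are all assumptions made for illustration; none of them come from the paper or from the Wisconsin dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant column as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_redundant_features(columns, threshold=0.95):
    """Greedily keep a feature only if |r| < threshold against every
    feature already kept.

    `columns` maps feature name -> list of values. Insertion order decides
    which member of a highly correlated pair survives. The threshold is a
    hypothetical choice, not a value reported in the paper.
    """
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy, made-up measurements (NOT taken from the Wisconsin dataset).
toy = {
    "radius": [1.0, 2.0, 3.0, 4.0, 5.0],
    "perimeter": [6.3, 12.6, 18.8, 25.1, 31.4],  # roughly 2 * pi * radius
    "texture": [3.0, 1.0, 4.0, 1.0, 5.0],
}
reduced = drop_redundant_features(toy)
print(list(reduced))                          # ['radius', 'texture']
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))   # 0.75
```

In this toy input, perimeter is almost a linear function of radius, so it is dropped as redundant, while texture is only weakly correlated with radius and is kept. A greedy filter like this depends on column order, which is one reason a real study might prefer a different selection rule.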