Breast cancer prediction using supervised machine learning techniques
P. Dadheech, Vijay H. Kalmani, S. R. Dogiwal, V. Sharma, Ankit Kumar, S. Pandey
Journal of Information & Optimization Sciences, published 2023-01-01
DOI: 10.47974/jios-1348 (https://doi.org/10.47974/jios-1348)
Abstract
Breast cancer is one of the most prevalent diseases in India’s urban regions and the second most common in the country’s rural parts. In India, a woman is diagnosed with breast cancer every four minutes, and a woman dies of it every thirteen minutes. Over half of breast cancer patients in India are diagnosed at stage 3 or 4, when survival rates are extremely low; hence, there is an urgent need for a rapid detection strategy. To forecast whether a patient is at risk of breast cancer, we utilise machine learning classification techniques, in which a model learns from previously collected data and makes predictions on new data. The dataset was collected from the UCI repository and used to build models with Logistic Regression, Support Vector Machines, and Random Forests. The primary goal is to improve the accuracy, precision, and sensitivity of the algorithms used to categorise the data, assessed in terms of the competency and viability of each algorithm. Random Forest proved the most accurate in classifying breast cancer, with a precision of 98.60 percent in tests. The study is implemented in the Python programming language using the Scientific Python Development Environment (Spyder).
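The three evaluation measures named in the abstract (accuracy, precision, and sensitivity) can be made concrete with a minimal sketch. It assumes a binary labelling with malignant as the positive class (1) and benign as 0, which the abstract does not specify. The function names and the toy labels are hypothetical illustrations, not the paper's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary classifier.

    Assumption: the positive class stands for 'malignant'.
    """
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all cases classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c):
    """Fraction of predicted positives that are truly positive."""
    predicted_pos = c.tp + c.fp
    return c.tp / predicted_pos if predicted_pos else 0.0


def sensitivity(c):
    """Fraction of true positives that the classifier detects (recall)."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


if __name__ == "__main__":
    # Toy labels invented for illustration only; not results from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    counts = confusion_counts(y_true, y_pred)
    print(counts)
    print("accuracy:", accuracy(counts))
    print("precision:", precision(counts))
    print("sensitivity:", sensitivity(counts))
```

Under these assumptions, each of the three classifiers would be scored by passing its test-set predictions to `confusion_counts` and comparing the resulting measures. Sensitivity matters most in a screening setting because a false negative corresponds to a missed malignant case.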