Determination of Relevant Risk Factors for Breast Cancer Using Feature Selection
Zazil Ibarra-Cuevas, Jose Nunez-Varela, Alberto Nunez-Varela, Francisco E. Martinez-Perez, Sandra E. Nava-Muñoz, Cesar A. Ramirez-Gamez, Hector G. Perez-Gonzalez
Programming and Computer Software. Published 2024-01-24. DOI: https://doi.org/10.1134/s0361768823080091
Abstract
Breast cancer is a serious threat to women’s health worldwide. Although the exact causes of this disease are still unknown, its incidence is known to be associated with risk factors. Risk factors in cancer are any genetic, reproductive, hormonal, physical, biological, or lifestyle-related conditions that increase the likelihood of developing breast cancer. This research aims to identify the most relevant risk factors among patients with breast cancer in a dataset by following the Knowledge Discovery in Databases process. To determine the relevance of the risk factors, it applies two feature selection methods, the Chi-Squared test and Mutual Information, and uses seven classifiers to validate the results. The risk factors identified as most relevant are the patient's age, her menopausal status, whether she had undergone hormonal therapy, and her type of menopause.
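As a rough illustration of the two feature selection methods named in the abstract, the sketch below scores categorical features against a binary label using the Pearson chi-squared statistic and mutual information, then ranks them. It is not the authors' implementation: the feature names, the toy data, and the helper names (`chi_squared`, `mutual_information`, `rank_features`) are all hypothetical, and the abstract does not say which library or variant of the tests was used.

```python
from collections import Counter
from math import log


def chi_squared(xs, ys):
    """Pearson chi-squared statistic for independence of two discrete variables."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)
    stat = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n          # count expected under independence
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete variables."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), nxy in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) )
        mi += (nxy / n) * log(nxy * n / (x_counts[x] * y_counts[y]))
    return mi


def rank_features(features, labels, score):
    """Return feature names sorted from most to least relevant under `score`."""
    return sorted(features, key=lambda name: score(features[name], labels), reverse=True)


# Hypothetical toy data: 1 = diagnosed, 0 = not diagnosed (not from the paper).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "postmenopausal": [1, 1, 1, 0, 0, 0, 0, 1],  # loosely associated with the label
    "noise":          [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

chi2_ranking = rank_features(features, labels, chi_squared)
mi_ranking = rank_features(features, labels, mutual_information)
```

Both scores are zero for a feature that is statistically independent of the label and grow as the association strengthens, which is why the two rankings often agree. In practice, a library routine would normally replace these hand-written helpers, and the selected features would then be checked by training classifiers on them, as the abstract describes.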
About the journal:
Programming and Computer Software is a peer-reviewed journal devoted to problems in all areas of computer science: operating systems, compiler technology, software engineering, artificial intelligence, etc.