Breast cancer diagnosis and management guided by data augmentation, utilizing an integrated framework of SHAP and random augmentation
Chukwuebuka Joseph Ejiyi, Zhen Qin, Happy Monday, Makuachukwu Bennedith Ejiyi, Chiagoziem Ukwuoma, Thomas Ugochukwu Ejiyi, Victor Kwaku Agbesi, Amarachi Agu, Chiduzie Orakwue
BioFactors (impact factor 5.0; JCR Q1, Biochemistry & Molecular Biology), journal article, published 2023-09-11
DOI: 10.1002/biof.1995
https://onlinelibrary.wiley.com/doi/10.1002/biof.1995
Abstract
Recent research indicates that early detection of breast cancer (BC) is critical to achieving favorable treatment outcomes and reducing the associated mortality rate. Because balanced datasets sourced primarily for diagnosing the disease are difficult to obtain, many researchers have relied on data augmentation techniques, producing datasets of varying quality and, in turn, varying results. The dataset used in this study was built by combining SHapley Additive exPlanations (SHAP)-based augmentation with random augmentation (RA) to address class imbalance. The approach was applied to the Wisconsin BC dataset, and its effectiveness for BC diagnosis was evaluated with six machine-learning algorithms. RA synthetically generated part of the dataset, while SHAP was used to assess the quality of the attributes, which were then selected and used to train the models. Our analysis shows that, for most of the models, performance improved by more than 3% when they were trained on the dataset obtained by integrating SHAP and RA. After diagnosis, it is also important to provide quality care to ensure the best possible outcomes for patients. Proper management of the disease is crucial to reducing recurrence and other associated complications. The interpretability provided by SHAP therefore informs the management strategies in this study, which focus on the quality and timeliness of the care given to the patient.
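The abstract does not say how RA generates samples or which SHAP explainer was used, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes RA means oversampling the minority class with small Gaussian jitter, and it ranks attributes by mean absolute SHAP value using the closed form for a linear model with independent features, where the SHAP value of feature j on row x is w_j * (x_j - mean_j). The function names, the `noise` parameter, and the toy data are all hypothetical.

```python
import random
from statistics import mean


def random_augment(rows, labels, noise=0.05, seed=0):
    """Balance classes by appending jittered copies of randomly chosen rows.

    Assumption: "random augmentation" is modelled here as oversampling each
    smaller class up to the size of the largest class.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = list(rows), list(labels)
    for label, group in by_class.items():
        for _ in range(target - len(group)):
            base = rng.choice(group)
            # Assumed perturbation: Gaussian noise scaled by the feature value.
            jittered = [v + rng.gauss(0.0, noise * (abs(v) or 1.0)) for v in base]
            out_rows.append(jittered)
            out_labels.append(label)
    return out_rows, out_labels


def linear_shap_importance(rows, weights):
    """Mean |SHAP| per feature for a linear model with independent features."""
    means = [mean(col) for col in zip(*rows)]
    return [
        mean(abs(w * (x[j] - means[j])) for x in rows)
        for j, w in enumerate(weights)
    ]


def top_k_features(importance, k):
    """Indices of the k features with the largest mean |SHAP| value."""
    order = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy data: four samples of class 0 and one of class 1, three features each.
    rows = [[1.0, 10.0, 0.5], [2.0, 11.0, 0.4], [3.0, 9.0, 0.6],
            [4.0, 10.5, 0.5], [8.0, 10.0, 0.5]]
    labels = [0, 0, 0, 0, 1]
    aug_rows, aug_labels = random_augment(rows, labels)
    # Hypothetical weights of an already fitted linear classifier.
    importance = linear_shap_importance(aug_rows, [1.0, 0.0, 0.1])
    print(top_k_features(importance, 2))
```

In practice the weights would come from a fitted model, and the selected attributes would then be used to train and compare the classifiers, as the abstract describes for its six algorithms.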
About the journal:
BioFactors, a journal of the International Union of Biochemistry and Molecular Biology, is devoted to the rapid publication of highly significant original research articles and reviews in experimental biology in health and disease.
The word “biofactors” refers to the many compounds that regulate biological functions. Biological factors comprise many molecules produced or modified by living organisms and present in many essential systems, such as the blood and the nervous and immune systems. A non-exhaustive list of biological factors includes neurotransmitters, cytokines, chemokines, hormones, coagulation factors, transcription factors, signaling molecules, receptor ligands and many more. The group of biofactors also accommodates several classical molecules not synthesized in the body, such as vitamins, micronutrients and essential trace elements.
In keeping with this unified view of biochemistry, BioFactors publishes research dealing with the identification of new substances and the elucidation of their functions at the biophysical, biochemical, cellular and human level as well as studies revealing novel functions of already known biofactors. The journal encourages the submission of studies that use biochemistry, biophysics, cell and molecular biology and/or cell signaling approaches.