Breast cancer diagnosis and management guided by data augmentation, utilizing an integrated framework of SHAP and random augmentation

IF: 5 | Zone 3 (Biology) | JCR Q1 (Biochemistry & Molecular Biology) | BioFactors | Pub Date: 2023-09-11 | DOI: 10.1002/biof.1995
Chukwuebuka Joseph Ejiyi, Zhen Qin, Happy Monday, Makuachukwu Bennedith Ejiyi, Chiagoziem Ukwuoma, Thomas Ugochukwu Ejiyi, Victor Kwaku Agbesi, Amarachi Agu, Chiduzie Orakwue
Citations: 0

Abstract

Recent research indicates that early detection of breast cancer (BC) is critical to achieving favorable treatment outcomes and reducing the associated mortality rate. Because balanced datasets collected primarily for diagnosing the disease are difficult to obtain, many researchers have relied on data augmentation techniques, producing datasets of varying quality and, consequently, varying results. The dataset used in this study was built by combining SHapley Additive exPlanations (SHAP)-based augmentation with random augmentation (RA) to address class imbalance. The approach was applied to the Wisconsin BC dataset, and its effectiveness for BC diagnosis was evaluated with six machine-learning algorithms. RA synthetically generated part of the dataset, while SHAP was used to assess the quality of the attributes, which were then selected and used to train the models. Our analysis shows that, with the dataset obtained by integrating SHAP and RA, performance improved by more than 3% for most of the models. After diagnosis, it is important to focus on providing quality care to ensure the best possible outcomes for patients. Proper management of the disease is crucial for reducing recurrence and other associated complications. The interpretability provided by SHAP therefore informs the management strategies in this study, which focus on the quality and timeliness of the care given to the patient.
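
The abstract does not say how RA generates synthetic samples or which SHAP explainer was used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes Gaussian-jitter oversampling of the minority class for RA, and exact Shapley values with a mean-value baseline for attribute ranking. The function names (`random_augment`, `shapley_values`, `rank_features`), the toy data, and the `toy_score` stand-in model are all hypothetical and chosen for illustration; a real pipeline would more likely use the `shap` package with a trained classifier on the Wisconsin BC data.

```python
import itertools
import math
import random


def random_augment(rows, labels, noise=0.05, seed=0):
    """Balance classes by appending jittered copies of minority-class rows (assumed RA scheme)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, group in by_class.items():
        for _ in range(target - len(group)):
            base = rng.choice(group)
            # Perturb each feature with Gaussian noise scaled to its magnitude.
            out_rows.append([v + rng.gauss(0.0, noise * (abs(v) or 1.0)) for v in base])
            out_labels.append(label)
    return out_rows, out_labels


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; features outside a coalition take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for subset in itertools.combinations(others, size):
                present = set(subset)
                without = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (predict(with_i) - predict(without))
    return phi


def rank_features(predict, rows, top_k):
    """Rank features by mean absolute Shapley value over rows; return the top_k indices."""
    n = len(rows[0])
    baseline = [sum(r[j] for r in rows) / len(rows) for j in range(n)]
    importance = [0.0] * n
    for row in rows:
        for j, value in enumerate(shapley_values(predict, row, baseline)):
            importance[j] += abs(value) / len(rows)
    order = sorted(range(n), key=lambda j: importance[j], reverse=True)
    return order[:top_k], importance


# Toy, made-up data: 3 features per sample; label 0 = majority, 1 = minority.
rows = [[1.0, 2.0, 5.0], [1.5, 1.0, 4.0], [2.0, 3.0, 6.0], [1.2, 2.5, 5.5],
        [1.8, 1.5, 4.5], [2.2, 2.0, 5.0], [4.0, 2.0, 5.0], [3.5, 3.0, 6.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]


def toy_score(row):
    # Stand-in for a trained model: feature 0 matters most, feature 2 not at all.
    return 2.0 * row[0] - 0.5 * row[1] + 0.0 * row[2]


aug_rows, aug_labels = random_augment(rows, labels)
kept, importance = rank_features(toy_score, aug_rows, top_k=2)
print("class counts:", aug_labels.count(0), aug_labels.count(1))
print("kept feature indices:", kept)
```

Exact Shapley enumeration is exponential in the number of features, so it is only practical for a handful of attributes as in this toy case. It is used here to keep the sketch dependency-free; approximate explainers are the usual choice for the 30-attribute Wisconsin dataset.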

Source journal
BioFactors (Biology: Endocrinology & Metabolism)
CiteScore: 11.50
Self-citation rate: 3.30%
Articles published: 96
Review time: 6-12 weeks
About the journal: BioFactors, a journal of the International Union of Biochemistry and Molecular Biology, is devoted to the rapid publication of highly significant original research articles and reviews in experimental biology in health and disease. The word “biofactors” refers to the many compounds that regulate biological functions. Biological factors comprise many molecules produced or modified by living organisms, and present in many essential systems like the blood, the nervous or immunological systems. A non-exhaustive list of biological factors includes neurotransmitters, cytokines, chemokines, hormones, coagulation factors, transcription factors, signaling molecules, receptor ligands and many more. In the group of biofactors we can accommodate several classical molecules not synthesized in the body such as vitamins, micronutrients or essential trace elements. In keeping with this unified view of biochemistry, BioFactors publishes research dealing with the identification of new substances and the elucidation of their functions at the biophysical, biochemical, cellular and human level as well as studies revealing novel functions of already known biofactors. The journal encourages the submission of studies that use biochemistry, biophysics, cell and molecular biology and/or cell signaling approaches.
Latest articles from this journal
Construction of lysosome-related prognostic signature to predict the survival outcomes and selecting suitable drugs for patients with HNSCC.
Navigating the immune landscape with plasma cells: A pan-cancer signature for precision immunotherapy.
Machine learning models reveal ARHGAP11A's impact on lymph node metastasis and stemness in NSCLC.
The carcinogenesis of esophageal squamous cell cancer is positively regulated by USP13 through WISP1 deubiquitination.
Piperine: an emerging biofactor with anticancer efficacy and therapeutic potential.