Addressing data imbalance challenges in oral cavity histopathological whole slide images with advanced deep learning techniques

International Journal of System Assurance Engineering and Management (IF 1.6; Q2; Engineering, Multidisciplinary) | Pub Date: 2024-07-26 | DOI: 10.1007/s13198-024-02440-6
Tabasum Majeed, Tariq Ahmad Masoodi, Muzafar Ahmad Macha, Muzafar Rasool Bhat, Khalid Muzaffar, Assif Assad
Citation count: 0

Abstract

Oral Cavity Squamous Cell Carcinoma (OCSCC) is a common head and neck cancer that originates in the mucosal lining of the oral cavity and is often detected at an advanced stage. Conventional detection relies on examining hematoxylin and eosin (H&E)-stained histopathological whole-slide images, a process that is time-consuming and requires expert pathology skills. Automated analysis is therefore urgently needed to expedite diagnosis and improve patient outcomes. Deep learning, through automated feature extraction, offers a promising avenue for capturing high-level abstract features with greater accuracy than traditional methods. However, class imbalance within datasets significantly affects the performance of deep learning models during training, necessitating specialized approaches. To address this issue, various methods have been proposed at both the data and algorithmic levels. This study investigates strategies for mitigating class imbalance using a publicly available, imbalanced OCSCC dataset. We evaluated undersampling methods (Near Miss, Edited Nearest Neighbors) and oversampling techniques (SMOTE, Deep SMOTE, ADASYN) integrated with transfer learning across imbalance ratios of 0.1, 0.15, 0.20, and 0.30. Our findings show that SMOTE improves test performance, highlighting the value of strategic oversampling combined with transfer learning for classifying imbalanced medical datasets. This enhances OCSCC diagnostic accuracy, streamlines clinical decisions, and reduces reliance on costly histopathological tests.
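
To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the abstract does not say how SMOTE was applied, and in practice it would operate on feature vectors (for example, ones produced by a pretrained network) rather than on 2-D points. The function names `smote` and `samples_needed` are hypothetical, and the sketch assumes that "imbalance ratio" means the minority-to-majority sample count, which the abstract does not define.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


def samples_needed(n_minority, n_majority, target_ratio):
    """Number of synthetic samples needed so that minority / majority
    reaches target_ratio (assumed definition of the imbalance ratio)."""
    return max(0, math.ceil(target_ratio * n_majority) - n_minority)


# Toy usage: four minority points, oversampled toward a ratio of 0.25
# against a hypothetical majority class of 40 samples.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
extra = samples_needed(len(minority_points), 40, 0.25)
new_points = smote(minority_points, extra, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, SMOTE fills in the minority region instead of duplicating samples, which is the usual argument for preferring it over naive random oversampling.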

Source journal
CiteScore: 4.30
Self-citation rate: 10.00%
Articles published: 252
Journal description: This Journal is established with a view to cater to increased awareness for high quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to the emergent complex engineering problems. Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.