Tabasum Majeed, Tariq Ahmad Masoodi, Muzafar Ahmad Macha, Muzafar Rasool Bhat, Khalid Muzaffar, Assif Assad
Title: Addressing data imbalance challenges in oral cavity histopathological whole slide images with advanced deep learning techniques
Journal: International Journal of System Assurance Engineering and Management (Impact Factor 1.6; JCR Q2, Engineering, Multidisciplinary)
Published: 2024-07-26
DOI: https://doi.org/10.1007/s13198-024-02440-6
Citation count: 0
Abstract
Oral Cavity Squamous Cell Carcinoma (OCSCC) is a common form of head and neck cancer that originates in the mucosal lining of the oral cavity and is often detected at an advanced stage. Traditional detection relies on analyzing hematoxylin and eosin (H&E)-stained histopathological whole-slide images, a process that is time-consuming and requires expert pathology skills. Automated analysis is therefore urgently needed to expedite diagnosis and improve patient outcomes. Deep learning, through automated feature extraction, offers a promising way to capture high-level abstract features with greater accuracy than traditional methods. However, imbalanced class distributions in datasets significantly affect the performance of deep learning models during training and call for specialized approaches. Various methods have been proposed to address this issue at both the data level and the algorithmic level. This study investigates strategies to mitigate class imbalance using a publicly available imbalanced OCSCC dataset. We evaluated undersampling methods (Near Miss, Edited Nearest Neighbors) and oversampling techniques (SMOTE, Deep SMOTE, ADASYN) integrated with transfer learning across different imbalance ratios (0.1, 0.15, 0.20, 0.30). Our findings demonstrate the effectiveness of SMOTE in improving test performance, highlighting the efficacy of strategic oversampling combined with transfer learning in classifying imbalanced medical datasets. This enhances OCSCC diagnostic accuracy, streamlines clinical decisions, and reduces reliance on costly histopathological tests.
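The abstract names SMOTE as the most effective oversampler but gives no implementation details. As a rough illustration only, the sketch below shows the core SMOTE step in plain Python: each synthetic sample is interpolated between a minority-class point and one of its nearest minority-class neighbours. The function names (`smote`, `samples_needed`), the toy 2-D points, and the reading of "imbalance ratio" as minority count divided by majority count are all assumptions made for this example, not details taken from the paper. In practice the inputs would presumably be feature vectors from a pretrained network rather than hand-written points.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE step)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def samples_needed(n_minority, n_majority, target_ratio):
    """Synthetic samples required so minority / majority reaches target_ratio
    (assumed definition of the imbalance ratio)."""
    return max(0, math.ceil(target_ratio * n_majority) - n_minority)


# Hypothetical toy data: four minority points against 100 majority samples.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
n_majority = 100
extra = samples_needed(len(minority), n_majority, target_ratio=0.30)
new_points = smote(minority, extra, k=3)
print(extra, len(minority) + len(new_points))
```

Because every synthetic point lies on a line segment between two existing minority samples, SMOTE fills in the minority region rather than duplicating points, which is the usual argument for preferring it over naive random oversampling.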
About the journal:
This journal was established to cater to the increased awareness of high-quality research in the seamless integration of heterogeneous technologies, with the aim of formulating bankable solutions to emerging complex engineering problems.
Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high-quality papers that have passed the rigorous peer review procedure of an archival scientific journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.