Saqib ul Sabha, Assif Assad, Sadaf Shafi, Nusrat Mohi Ud Din, Rayees Ahmad Dar, Muzafar Rasool Bhat
Title: Imbalcbl: addressing deep learning challenges with small and imbalanced datasets
Journal: International Journal of System Assurance Engineering and Management (JCR Q2, Engineering, Multidisciplinary; impact factor 1.6)
Publication date: 2024-05-01
DOI: 10.1007/s13198-024-02346-3 (https://doi.org/10.1007/s13198-024-02346-3)
Citations: 0
Abstract
Deep learning has transformed computer vision, but it frequently falters on small and imbalanced datasets, and prevailing models often underperform under these constraints. To address this, we introduce a contrast-based learning strategy for small and imbalanced data that improves the performance of deep learning architectures on such datasets. By concatenating training images, the effective training set grows from n to \(n^2\), providing richer data for model training even when n is very small. The approach does not depend on a specific loss function or network architecture, so it can be adapted to diverse classification scenarios. We evaluated it on four benchmark datasets against state-of-the-art oversampling methods. It outperformed them on balanced accuracy, F1 score, and geometric mean, with gains of 7–16% on the Covid-19 dataset, 4–20% on Honey bees, 1–6% on CIFAR-10, and 1–9% on FashionMNIST. The proposed method thus offers a remedy for the persistent problems caused by scarce and skewed data in deep learning.
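The growth from n to \(n^2\) described in the abstract can be illustrated with a minimal sketch. The abstract does not say how images are concatenated or how pairs are labeled, so everything here is an assumption for illustration only: images are joined side by side, every ordered pair (including an image paired with itself) is kept, and each pair simply carries both original labels. The function names `concat_side_by_side` and `build_pair_dataset` are hypothetical and are not taken from the paper.

```python
from itertools import product


def concat_side_by_side(img_a, img_b):
    """Join two equal-height images (lists of pixel rows) along the width.

    Assumption: side-by-side concatenation; the paper may use another layout.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same height")
    return [row_a + row_b for row_a, row_b in zip(img_a, img_b)]


def build_pair_dataset(images, labels):
    """Build every ordered pair (i, j), self-pairs included: n inputs give n * n pairs.

    Assumption: each pair keeps both source labels as a tuple; the actual
    labeling scheme of the method is not stated in the abstract.
    """
    samples = list(zip(images, labels))
    pairs = []
    for (img_a, lab_a), (img_b, lab_b) in product(samples, repeat=2):
        pairs.append((concat_side_by_side(img_a, img_b), (lab_a, lab_b)))
    return pairs


# Three tiny 2x2 "images" with an imbalanced label set (two of class 0, one of class 1).
images = [
    [[0, 0], [0, 0]],
    [[1, 1], [1, 1]],
    [[2, 2], [2, 2]],
]
labels = [0, 0, 1]

pairs = build_pair_dataset(images, labels)
print(len(images), "->", len(pairs))  # 3 -> 9
```

Under these assumptions, even the single minority-class image appears in 2n - 1 = 5 of the 9 pairs, which is one way to read the claim that pairing yields richer training data when n is very small.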
About the journal:
This Journal was established to serve the growing interest in high-quality research on the seamless integration of heterogeneous technologies to formulate bankable solutions to emerging complex engineering problems.
Assurance engineering can be thought of as providing higher confidence in the reliable and secure implementation of a system's critical characteristic features by taking a holistic approach that draws on a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. The Journal intends to provide a platform for the modeling and analysis of large engineering systems, among the other allied goals of systems assurance engineering mentioned above, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal publishes only high-quality papers that have passed the rigorous peer review procedure of an archival scientific journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.