HSALC: hard sample aware label correction for medical image classification

Multimedia Tools and Applications (IF 3.0; Zone 4, Computer Science; JCR Q2, Computer Science, Information Systems). Pub Date: 2024-09-02. DOI: 10.1007/s11042-024-20114-0

Yangtao Wang, Yicheng Ye, Yanzhao Xie, Maobin Tang, Lisheng Fan


Citations: 0

Abstract

Automatic medical image classification has long been an active research topic, but existing methods suffer from label noise: they either discard samples with noisy labels or correct labels incorrectly, which seriously limits classification performance. To address these problems, we propose a hard sample aware label correction method (HSALC) for medical image classification. HSALC consists of three main modules: a sample division module, a clean-hard-noisy (CHN) detection module, and a label noise correction module. First, in the sample division module, we design a sample division criterion based on training difficulty and training loss that divides all samples into three preliminary subsets: clean samples, hard samples, and noisy samples. Second, in the CHN detection module, we add noise to the preliminary clean samples and repeatedly apply the sample division criterion to all data, which yields highly reliable clean, hard, and noisy subsets. Finally, in the label noise correction module, to make full use of every available sample, we train a correction model that corrects as many of the wrong labels of noisy samples as possible, producing a highly purified dataset. We conduct extensive experiments on five image datasets: three medical image datasets and two natural image datasets. The results show that HSALC substantially improves classification performance on noisily labeled datasets, especially at high noise ratios. The source code is publicly available on GitHub: https://github.com/YYC117/HSALC.
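The abstract does not spell out the sample division criterion, so the following is only a minimal sketch of how a loss-and-difficulty split and the noise injection used in CHN detection might look. Everything here is an assumption made for illustration: the function names `divide_samples` and `inject_label_noise`, the use of mean per-epoch loss as the "training loss", the fraction of epochs in which the prediction disagrees with the given label as the "training difficulty", and both thresholds. It is not the authors' implementation; see the linked repository for that.

```python
import numpy as np

CLEAN, HARD, NOISY = 0, 1, 2


def divide_samples(losses, correct, loss_thresh, difficulty_thresh):
    """Split samples into clean / hard / noisy groups (illustrative criterion).

    losses:  float array of shape (epochs, n_samples), per-sample training loss.
    correct: bool array of shape (epochs, n_samples), True where the model's
             prediction matched the given (possibly noisy) label.
    """
    mean_loss = losses.mean(axis=0)
    # Assumed difficulty proxy: share of epochs where the prediction
    # disagreed with the given label.
    difficulty = 1.0 - correct.mean(axis=0)

    groups = np.full(mean_loss.shape, HARD)
    # Low loss and low difficulty: treat as clean.
    groups[(mean_loss <= loss_thresh) & (difficulty <= difficulty_thresh)] = CLEAN
    # High loss and high difficulty: treat as noisy.
    groups[(mean_loss > loss_thresh) & (difficulty > difficulty_thresh)] = NOISY
    # Everything else stays in the hard group.
    return groups


def inject_label_noise(y, candidate_idx, ratio, num_classes, rng):
    """Flip the labels of a fraction of the candidate (clean) samples.

    Returns the noisy label array and the indices that were flipped, so a
    later division pass can be checked against known injected noise.
    """
    y_noisy = y.copy()
    n_flip = int(round(ratio * len(candidate_idx)))
    flip_idx = rng.choice(candidate_idx, size=n_flip, replace=False)
    # An offset in [1, num_classes - 1] modulo num_classes always yields
    # a class different from the original one.
    offsets = rng.integers(1, num_classes, size=n_flip)
    y_noisy[flip_idx] = (y_noisy[flip_idx] + offsets) % num_classes
    return y_noisy, flip_idx


if __name__ == "__main__":
    losses = np.array([[0.1, 2.0, 0.2], [0.1, 2.5, 1.5]])
    correct = np.array([[True, False, True], [True, False, False]])
    print(divide_samples(losses, correct, loss_thresh=0.5, difficulty_thresh=0.6))
```

In this sketch, a sample with high loss but only moderate disagreement falls into the hard group rather than the noisy one, which mirrors the abstract's point that hard samples should not be discarded or relabeled as if they were noise. Because the injected flips are known, re-running the division on the perturbed data gives a way to check how reliably the criterion recovers them.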




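Source journal: Multimedia Tools and Applications (Engineering Technology: Engineering, Electrical and Electronic)
CiteScore: 7.20
Self-citation rate: 16.70%
Articles published: 2439
Review time: 9.2 months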

About the journal: Multimedia Tools and Applications publishes original research articles on multimedia development and system support tools, as well as case studies of multimedia applications. It also features experimental and survey articles. The journal is intended for academics, practitioners, scientists, and engineers involved in multimedia system research, design, and applications. All papers are peer reviewed. Specific areas of interest include multimedia tools, multimedia applications, and prototype multimedia systems and platforms.