MgCNL: A Sample Separation Approach via Multi-Granularity Balls for Fault Diagnosis With the Interference of Noisy Labels

IF 6.4 | Zone 2, Computer Science | Q1, AUTOMATION & CONTROL SYSTEMS | IEEE Transactions on Automation Science and Engineering | Pub Date: 2024-10-07 | DOI: 10.1109/TASE.2024.3469000
Fir Dunkin;Xinde Li;Heqing Li;Guoliang Wu;Chuanfei Hu;Shuzhi Sam Ge
Volume 22, pp. 7748-7761. Full text: https://ieeexplore.ieee.org/document/10706599/
Citations: 0

Abstract

Fault diagnosis based on supervised learning has achieved remarkable results in intelligent manufacturing, making it an important safeguard for long-term safe and stable operation in modern industry. However, its accuracy relies heavily on high-quality annotation labels, which are expensive to obtain, and this limits the applicability of diagnosis models in many scenarios. Although obtaining samples labeled by automatic annotators is a promising solution, the resulting dataset always contains some incorrect labels (noisy labels) because of the annotators' perceptual limitations, which lowers model accuracy or even renders the model invalid. To handle this challenge, a diagnostic approach based on multi-granularity information fusion to combat noisy labels, called MgCNL, is proposed; it trains a high-accuracy model without knowing the specific noise ratio. Specifically, inspired by granular-ball computing, a label confidence evaluation method is designed so that samples with high-confidence labels can be selected from a dataset with noisy labels for supervised learning, thus avoiding the negative impact of incorrect labels on model performance. Finally, the efficacy of the approach was demonstrated on three datasets using different backbones: MgCNL reduced the adverse impact of noisy labels and achieved significantly better results than other advanced methods in various noisy scenarios. It thus offers a competitive model training strategy for practitioners in intelligent manufacturing or industrial fault diagnosis who are hampered by the cost of sample labeling.
Note to Practitioners: In modern industry, manual or expert annotation of high-quality data is prohibitively expensive, and data labeled by automatic annotators often contains noisy labels that seriously damage model accuracy. As a result, many data-driven diagnosis models are constrained by their training data and difficult to put into practice, which poses an urgent challenge to the automation and intelligence of the manufacturing industry. To address this challenge, this article proposes a robust training strategy called MgCNL, which aims to offset the negative impact of noisy labels, in the hope that lower-cost automatic annotation can be applied more widely in model training for industrial practice. Based on multi-granularity information, MgCNL can effectively select high-confidence samples from a dataset for supervised learning, even when the proportion of noisy labels is unknown, thus reducing the misleading effect of noisy labels on diagnostic models. MgCNL can therefore robustly train high-accuracy diagnostic models on data with noisy labels, making automatic annotators a potentially more economical and efficient alternative to experts in dataset construction. MgCNL also brings value to datasets with uncertain labels, making them usable without significant human effort to verify label reliability.
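The abstract does not give the details of the confidence evaluation, so the following is only a minimal sketch of the general idea it describes, not the authors' MgCNL algorithm. It assumes a simple purity-driven granular-ball construction (recursive 2-means splitting) repeated at several purity thresholds, and scores each sample by how often its label agrees with the majority label of the ball that contains it. The function names `split_into_balls` and `label_confidence`, the thresholds, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def split_into_balls(X, y, idx, purity_min, min_size, rng):
    """Recursively split the samples in `idx` until each ball is pure enough or small."""
    labels, counts = np.unique(y[idx], return_counts=True)
    purity = counts.max() / len(idx)
    if purity >= purity_min or len(idx) <= min_size:
        return [(idx, labels[counts.argmax()], purity)]
    # 2-means style split, seeded with two distinct members of the ball.
    centers = X[rng.choice(idx, size=2, replace=False)]
    for _ in range(10):
        dist = np.linalg.norm(X[idx, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        if assign.min() == assign.max():  # degenerate split: keep the ball as it is
            return [(idx, labels[counts.argmax()], purity)]
        centers = np.stack([X[idx[assign == k]].mean(axis=0) for k in (0, 1)])
    balls = []
    for k in (0, 1):
        balls += split_into_balls(X, y, idx[assign == k], purity_min, min_size, rng)
    return balls

def label_confidence(X, y, purities=(0.65, 0.75, 0.9), min_size=4, seed=0):
    """Fraction of granularity levels at which a label matches its ball's majority label."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(len(y))
    for p in purities:  # one ball partition per granularity level (assumed scheme)
        for idx, majority, _ in split_into_balls(X, y, np.arange(len(y)), p, min_size, rng):
            votes[idx] += (y[idx] == majority)
    return votes / len(purities)

# Toy 1-D data: two tight groups; the sample at 0.4 carries a wrong (noisy) label.
X = np.array([[0.0], [0.1], [0.2], [0.3], [0.4],
              [10.0], [10.1], [10.2], [10.3], [10.4]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

conf = label_confidence(X, y)
keep = conf >= 1.0  # keep only samples whose label is supported at every granularity
print(conf)         # the mislabeled sample (index 4) scores 0, all others score 1
```

Only the samples in `keep` would then be passed to ordinary supervised training. Note that no noise ratio is supplied anywhere: the selection depends only on agreement between a label and its neighborhood, which is the property the abstract emphasizes.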
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.
Latest articles from this journal
Automated Action Generation based on Action Field for Robotic Garment Smoothing and Alignment
Reinforcement learning-based distributed secondary frequency control and active power sharing in islanded microgrids with bandwidth-conscious memory-event-triggered mechanism
Toward Reliable Imitation Learning with Limited Expert Demonstrations via Search-based Inverse Dynamic Learning
C-CBF: Communication-Aware Control Barrier Functions for Resilient Multi-Robot Connectivity
Extended State Observer-Based Predefined Time Composite Anti-Disturbance Control for Hydraulic Cutting Arm