Rethinking Discrepancy Analysis: Anomaly Detection via Meta-Learning Powered Dual-Source Representation Differentiation

IF 6.4 | Zone 2 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | IEEE Transactions on Automation Science and Engineering | Pub Date: 2024-11-05 | DOI: 10.1109/TASE.2024.3486688

Muyan Yao;Dan Tao;Peng Qi;Ruipeng Gao

IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8579-8592. Journal Article. https://ieeexplore.ieee.org/document/10744204/

Citations: 0

Abstract

Industrial environments pose distinctive challenges for anomaly detection, stemming primarily from high dimensionality and from data patterns that change over time. Under these conditions, a model is unlikely to converge properly on unlabeled data, which has made previous anomaly detection (AD) works less effective at discriminating anomalies. To address this problem, we present AnoDual, a novel meta-learning AD framework. From the perspective of data reconstruction, we introduce the multi-memory enhanced VAE reconstructor M2ER, which learns to extract the most salient patterns in unlabeled noisy data in a self-supervised manner. This design reduces the impact of potentially anomalous components during data reconstruction and enhances the discernibility of anomalies. To address the performance degradation caused by the numerical-deviation-based AD scheme used in most existing works, we design a dual-source self-supervised discriminator, DSD, which examines characteristics in the domain of representations. This model assesses discrepancies between data pairs and representation pairs in parallel, and conducts AD on a fine-grained scale. In this way, anomalies that previously went unnoticed because of a less prominent numerical deviation can be spotted. In addition, we propose a meta-learning powered training pipeline that enables model training even when no real labels are available, which is common in industry. Extensive experiments on five large-scale real-world industrial datasets suggest that AnoDual improves the average F1-score by 3.39%, outperforming the latest state-of-the-art baseline.

Note to Practitioners: Pairing a generative model with a numerical-threshold-based detection approach is currently widespread in both academia and industry. However, this workflow performs poorly in actual applications, for several reasons. Such generative models are difficult to converge properly when the training material contains noisy samples: an over-expressive generative model yields less significant reconstruction discrepancies for anomalies, which makes them hard to notice. In addition, selecting the numerical threshold used to spot anomalies requires multiple laborious attempts, and the chosen threshold can hardly adapt to ever-changing patterns in industrial environments. These circumstances make it challenging to apply prior works in practical production, which motivates an effective methodology for industrial anomaly detection. This manuscript presents AnoDual, a novel meta-learning powered framework tailored for industrial scenarios. Instead of the conventional design of comparing the reconstruction error numerically, the framework introduces a solution based on the differentiation of representations. Moreover, the multi-head attention enhanced variational autoencoder produces a much more pronounced discrepancy for anomalous samples, which benefits their detection. By providing a flexible and robust way to detect anomalies on deployed IoT assets, this work can be further adapted to serve applications in many other domains.
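For readers unfamiliar with the conventional workflow that the Note to Practitioners criticizes, the following is a minimal sketch of reconstruction-error thresholding. It is not AnoDual, M2ER, or DSD: a linear PCA reconstructor stands in for the generative model, and the synthetic data, the function names, and the 99th-percentile threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_pca_reconstructor(train, n_components):
    """Fit a linear reconstructor (PCA) as a stand-in for a generative model."""
    mean = train.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(x, mean, components):
    """Per-sample squared error between x and its low-rank reconstruction."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.sum((centered - recon) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical data: normal samples lie near a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
train = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 10))
normal_test = rng.normal(size=(50, 2)) @ basis + 0.05 * rng.normal(size=(50, 10))
anomalies = 3.0 * rng.normal(size=(5, 10))  # points far from the subspace

mean, components = fit_pca_reconstructor(train, n_components=2)
# The hand-picked numerical threshold (assumed here): the 99th percentile
# of the reconstruction error observed on the training data.
threshold = np.quantile(reconstruction_error(train, mean, components), 0.99)

flags_normal = reconstruction_error(normal_test, mean, components) > threshold
flags_anomalies = reconstruction_error(anomalies, mean, components) > threshold
```

The sketch works on this toy data because the anomalies sit far from the learned subspace. The difficulty described above arises when anomalies produce only a small numerical deviation, or when the data pattern drifts and the fixed `threshold` no longer fits.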

Source journal

IEEE Transactions on Automation Science and Engineering (Engineering and Technology: Automation and Control Systems)

CiteScore: 12.50

Self-citation rate: 14.30%

Articles published: 404

Review time: 3.0 months

Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.