Zixuan Chen, Xiaohua Xie, Lingxiao Yang, Jian-Huang Lai
{"title":"Hard-Normal Example-Aware Template Mutual Matching for Industrial Anomaly Detection","authors":"Zixuan Chen, Xiaohua Xie, Lingxiao Yang, Jian-Huang Lai","doi":"10.1007/s11263-024-02323-0","DOIUrl":null,"url":null,"abstract":"<p>Anomaly detectors are widely used in industrial manufacturing to detect and localize unknown defects in query images. These detectors are trained on anomaly-free samples and have successfully distinguished anomalies from most normal samples. However, hard-normal examples are scattered and far apart from most normal samples, and thus they are often mistaken for anomalies by existing methods. To address this issue, we propose <b>H</b>ard-normal <b>E</b>xample-aware <b>T</b>emplate <b>M</b>utual <b>M</b>atching (HETMM), an efficient framework to build a robust prototype-based decision boundary. Specifically, <i>HETMM</i> employs the proposed <b>A</b>ffine-invariant <b>T</b>emplate <b>M</b>utual <b>M</b>atching (ATMM) to mitigate the affection brought by the affine transformations and easy-normal examples. By mutually matching the pixel-level prototypes within the patch-level search spaces between query and template set, <i>ATMM</i> can accurately distinguish between hard-normal examples and anomalies, achieving low false-positive and missed-detection rates. In addition, we also propose <i>PTS</i> to compress the original template set for speed-up. <i>PTS</i> selects cluster centres and hard-normal examples to preserve the original decision boundary, allowing this tiny set to achieve comparable performance to the original one. Extensive experiments demonstrate that <i>HETMM</i> outperforms state-of-the-art methods, while using a 60-sheet tiny set can achieve competitive performance and real-time inference speed (around 26.1 FPS) on a Quadro 8000 RTX GPU. <i>HETMM</i> is training-free and can be hot-updated by directly inserting novel samples into the template set, which can promptly address some incremental learning issues in industrial manufacturing.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"26 1","pages":""},"PeriodicalIF":11.6000,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11263-024-02323-0","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Anomaly detectors are widely used in industrial manufacturing to detect and localize unknown defects in query images. These detectors are trained on anomaly-free samples and have successfully distinguished anomalies from most normal samples. However, hard-normal examples are scattered and far apart from most normal samples, and thus they are often mistaken for anomalies by existing methods. To address this issue, we propose Hard-normal Example-aware Template Mutual Matching (HETMM), an efficient framework to build a robust prototype-based decision boundary. Specifically, HETMM employs the proposed Affine-invariant Template Mutual Matching (ATMM) to mitigate the affection brought by the affine transformations and easy-normal examples. By mutually matching the pixel-level prototypes within the patch-level search spaces between query and template set, ATMM can accurately distinguish between hard-normal examples and anomalies, achieving low false-positive and missed-detection rates. In addition, we also propose PTS to compress the original template set for speed-up. PTS selects cluster centres and hard-normal examples to preserve the original decision boundary, allowing this tiny set to achieve comparable performance to the original one. Extensive experiments demonstrate that HETMM outperforms state-of-the-art methods, while using a 60-sheet tiny set can achieve competitive performance and real-time inference speed (around 26.1 FPS) on a Quadro 8000 RTX GPU. HETMM is training-free and can be hot-updated by directly inserting novel samples into the template set, which can promptly address some incremental learning issues in industrial manufacturing.
期刊介绍:
The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs.
Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision.
Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community.
Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or offer tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas.
In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives.
The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research.
Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.