A Generative AI approach to improve in-situ vision tool wear monitoring with scarce data

Journal of Intelligent Manufacturing | IF 5.9 | Zone 2 (Engineering & Technology) | JCR Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2024-05-10 | DOI: 10.1007/s10845-024-02379-2

Alberto Garcia-Perez, Maria Jose Gomez-Silva, Arturo de la Escalera-Hueso


Citations: 0

A Generative AI approach to improve in-situ vision tool wear monitoring with scarce data

Most aerospace turbine casings are machined using a vertical lathe. This paper presents a computer-vision tool wear monitoring system that analyses tool inserts once the machining process has been completed. By installing a camera in the robot magazine room, together with a tool cleaning device that removes chips and coolant residues, a clean tool image can be acquired. A subsequent Deep Learning (DL) model classifies the tool as acceptable or not, avoiding the drawbacks of alternative computer vision algorithms based on edges, dedicated features, etc. The model was trained with a significantly reduced number of images, in order to minimise the costly process of acquiring and classifying images during production. This was achieved by introducing a special lens and generative Artificial Intelligence (AI) models. This paper proposes two novel architectures to artificially increase the number of images available to train the DL model: SCWGAN-GP, a Scalable Condition Wasserstein Generative Adversarial Network (WGAN) with Gradient Penalty, and the Focal Stable Diffusion (FSD) model, with class injection and a dedicated loss function. In addition, special K|Lens optics were used to capture multiple views of the vertical lathe inserts, providing hardware-based data augmentation from a reduced number of production samples. Given an initial dataset, the classification accuracy was increased from 80.0 % to 96.0 % using the FSD model. We also found that with as few as 100 original images for each insert type and wear condition, our methodology achieves up to 93.3 % accuracy, rising to 94.6 % if 200 images are used. This accuracy is considered to be within human inspector uncertainty for this use case. Fine-tuning the FSD model, with nearly 1 billion trainable parameters, showed superior performance compared to the SCWGAN-GP model, with only 80 million parameters, while also requiring fewer training samples to produce higher-quality output images. Furthermore, visualization of the output activation mapping confirms that the model bases its decision on the right image features. The time to create the dataset was reduced from 3 months to 2 days using generative AI. Our approach therefore makes it possible to create industrial datasets with minimal effort and a significant speed-up compared with the conventional approach of acquiring the large number of images that DL models usually require to avoid over-fitting. Despite the good results, this methodology is only applicable to relatively simple cases, such as our inserts, where the images are not complex.
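
For orientation, the gradient-penalty objective that WGAN-GP-style models build on can be written as follows. This is the standard critic loss from the general WGAN-GP literature with a class label added as a conditioning input; it is not the paper's SCWGAN-GP formulation, which the abstract does not give. The symbols D (critic), G (generator), z (noise), y (class label, for example insert type and wear condition), \lambda (penalty weight) and \hat{x} (interpolated sample) are notation assumed here for illustration.

```latex
L_D = \mathbb{E}_{z \sim p_z}\big[ D(G(z, y), y) \big]
    - \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[ D(x, y) \big]
    + \lambda \, \mathbb{E}_{\hat{x}}\Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}, y) \rVert_2 - 1 \big)^2 \Big],
\qquad
\hat{x} = \epsilon \, x + (1 - \epsilon) \, G(z, y), \quad \epsilon \sim U[0, 1]
```

The first two terms estimate the Wasserstein distance between generated and real images, and the third term pushes the critic's gradient norm towards 1 on points interpolated between real and generated samples, which is the usual way this family of models stabilises training.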

Source journal: Journal of Intelligent Manufacturing (Engineering & Technology, Engineering: Manufacturing)

CiteScore: 19.30

Self-citation rate: 9.60%

Articles published: 171

Review time: 5.2 months
