Shenghong Yan , Bo Chen , Han Gao , Caiwang Tan , Xiaoguo Song , Guodong Wang
Publication: Mechanical Systems and Signal Processing, volume 229, Article 112531
Publication date: 2025-03-03
DOI: 10.1016/j.ymssp.2025.112531
URL: https://www.sciencedirect.com/science/article/pii/S0888327025002328
Impact factor: 7.9 (JCR Q1, Engineering, Mechanical)
Cross-attention time-series multi-feature fusion vision transformer for joint formation monitoring in laser scanning welding
As laser scanning welding technology matures in engineering applications, developing diagnostics capable of monitoring weld joint formation is a crucial step toward meeting the demands of increasingly structurally complex products. In this work, a unique multivariate time-series dataset comprising keyhole and molten pool image streams was extracted from the collected visual signals. The keyhole and molten pool streams were fed separately into the two branches of a proposed Transformer-based model, which incorporates multi-head self-attention and cross-attention mechanisms. The optimal architecture achieved an accuracy of 99.3%, outperforming previous state-of-the-art image-based models. Optimization and ablation experiments further verified that the temporal characteristics of the signals are a significant determining factor in the accuracy of laser scanning welding state recognition. The attention score maps from the decision-making process show that the proposed model accurately learns the time-series characteristics of the keyhole and molten pool visual signals and captures fine-grained details of highly dynamic objects under varying welding states. In summary, its performance and the visualization of its attention mechanism make it a promising diagnostic module and a novel strategy for joint formation monitoring in laser scanning welding.
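The two-branch fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single attention head, no learned projection matrices, and mean pooling over time before concatenation, none of which the abstract specifies. The names `cross_attention`, `fuse_branches`, `keyhole`, and `pool` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (assumed simplification).

    queries: T_q x d, keys: T_k x d, values: T_k x d_v (lists of lists).
    Returns (output of shape T_q x d_v, weights of shape T_q x T_k).
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    weights = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights.append(softmax(scores))
    d_v = len(values[0])
    output = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(d_v)]
        for row in weights
    ]
    return output, weights


def mean_pool(tokens):
    """Average a token sequence over its time axis."""
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]


def fuse_branches(keyhole_tokens, pool_tokens):
    """Let each stream attend to the other, then concatenate pooled results."""
    # Keyhole tokens query the molten pool stream.
    k2p, w_k2p = cross_attention(keyhole_tokens, pool_tokens, pool_tokens)
    # Molten pool tokens query the keyhole stream.
    p2k, w_p2k = cross_attention(pool_tokens, keyhole_tokens, keyhole_tokens)
    fused = mean_pool(k2p) + mean_pool(p2k)  # list concatenation
    return fused, w_k2p, w_p2k


# Toy per-frame feature vectors (made-up values, not data from the paper).
keyhole = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pool = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]
fused, w_k2p, w_p2k = fuse_branches(keyhole, pool)
```

The weight matrices `w_k2p` and `w_p2k` play the role of the attention score maps mentioned in the abstract: each row shows how strongly one time step of a stream attends to every time step of the other stream.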
About the journal:
Journal Name: Mechanical Systems and Signal Processing (MSSP)
Interdisciplinary Focus:
Mechanical, Aerospace, and Civil Engineering
Purpose: Reporting scientific advancements of the highest quality arising from new techniques in sensing, instrumentation, signal processing, modelling, and control of dynamic systems