Pub Date: 2026-01-28 DOI: 10.1109/tase.2026.3658586
Wen-Tao Zhang, Ming Chi, Zhi-Wei Liu, Jing-Zhe Xu
{"title":"Distributed Execution of Signal Temporal Logic Tasks in Open Networked Robotic System via Predefined-Time Control","authors":"Wen-Tao Zhang, Ming Chi, Zhi-Wei Liu, Jing-Zhe Xu","doi":"10.1109/tase.2026.3658586","DOIUrl":"https://doi.org/10.1109/tase.2026.3658586","url":null,"abstract":"","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"44 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146070247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-28 DOI: 10.1109/TASE.2026.3658828
Zhikai Yao;Xianglong Liang;Fengchi Li;Jianyong Yao
Dynamic model identification for high-dimension hydraulic manipulators remains a significant challenge, with system nonlinearities and dynamic coupling being the primary obstacles to achieving high-accuracy control. To this end, this article introduces the Collaborative-Optimization-Independent-Control (COIC) framework. Within the COIC framework, an extended-state-observer-based joint-independent controller is adopted to handle unmodeled dynamics individually at each joint. Given the coupling among the unmodeled dynamics across different joints, the adjustment of observer gains for all joint-independent controllers is formulated as a collaborative game problem. Reinforcement learning is thereby introduced to solve this game problem and determine the collaborative Nash equilibrium, thereby enabling optimal observer gain configuration and enhancing overall control performance. Theoretical analysis confirms the Lyapunov stability of the joint-independent control system. Furthermore, it is demonstrated that updating the observer gains within the stable region yields (sub)optimal solutions corresponding to the collaborative Nash equilibrium. The effectiveness and advantages of the proposed COIC framework are validated through comparative experiments on a well-established six-degree-of-freedom (6-DOF) hydraulic manipulator platform. Note to Practitioners—The COIC framework is both conceptually intuitive and practically implementable, offering strong potential for deployment in complex industrial systems. Specifically, instead of relying on dynamic model identification or virtual decomposition of high-dimension hydraulic manipulators, the framework employs extended-state-observer-based joint-independent control to directly address unmodeled dynamics. By avoiding the need for precise model information or intricate decomposition procedures, the proposed method significantly simplifies implementation in real-world applications. 
Furthermore, the configuration of observer gains across all joint-independent controllers is formulated as a collaborative game, with reinforcement learning introduced to identify the collaborative Nash equilibrium. This strategy enables incremental optimization of observer gains, allowing each joint controller to effectively compensate for unmodeled dynamics. Simultaneously, the learning-based formulation enhances the transparency and interpretability of the control design, facilitating broader adoption in practice.
{"title":"Extended-State-Observer-Based Optimized Control of Hydraulic Manipulators","authors":"Zhikai Yao;Xianglong Liang;Fengchi Li;Jianyong Yao","doi":"10.1109/TASE.2026.3658828","DOIUrl":"10.1109/TASE.2026.3658828","url":null,"abstract":"Dynamic model identification for high-dimension hydraulic manipulators remains a significant challenge, with system nonlinearities and dynamic coupling being the primary obstacles to achieving high-accuracy control. To this end, this article introduces the Collaborative-Optimization-Independent-Control (COIC) framework. Within the COIC framework, an extended-state-observer-based joint-independent controller is adopted to handle unmodeled dynamics individually at each joint. Given the coupling among the unmodeled dynamics across different joints, the adjustment of observer gains for all joint-independent controllers is formulated as a collaborative game problem. Reinforcement learning is thereby introduced to solve this game problem and determine the collaborative Nash equilibrium, thereby enabling optimal observer gain configuration and enhancing overall control performance. Theoretical analysis confirms the Lyapunov stability of the joint-independent control system. Furthermore, it is demonstrated that updating the observer gains within the stable region yields (sub)optimal solutions corresponding to the collaborative Nash equilibrium. The effectiveness and advantages of the proposed COIC framework are validated through comparative experiments on a well-established six-degree-of-freedom (6-DOF) hydraulic manipulator platform. Note to Practitioners—The COIC framework is both conceptually intuitive and practically implementable, offering strong potential for deployment in complex industrial systems. 
Specifically, instead of relying on dynamic model identification or virtual decomposition of high-dimension hydraulic manipulators, the framework employs extended-state-observer-based joint-independent control to directly address unmodeled dynamics. By avoiding the need for precise model information or intricate decomposition procedures, the proposed method significantly simplifies implementation in real-world applications. Furthermore, the configuration of observer gains across all joint-independent controllers is formulated as a collaborative game, with reinforcement learning introduced to identify the collaborative Nash equilibrium. This strategy enables incremental optimization of observer gains, allowing each joint controller to effectively compensate for unmodeled dynamics. Simultaneously, the learning-based formulation enhances the transparency and interpretability of the control design, facilitating broader adoption in practice.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"23 ","pages":"4272-4284"},"PeriodicalIF":6.4,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146070248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
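The extended-state-observer idea in the abstract above can be illustrated with a minimal single-joint sketch. This is not the COIC controller from the paper: it assumes a generic second-order joint model, a textbook linear extended state observer with bandwidth-parameterized gains, and a hypothetical disturbance `f(t)`. The observer gains (`omega_o`) are fixed by hand here, whereas the paper tunes them with reinforcement learning. All names and numbers are illustrative assumptions.

```python
import math

def simulate_eso_joint(omega_o=50.0, omega_c=10.0, b0=1.0, r=1.0,
                       dt=1e-3, steps=5000):
    """Simulate one joint modeled as x1' = x2, x2' = f(t) + b0*u.

    f(t) lumps the unmodeled dynamics; the observer state z3 estimates it.
    Returns (final position x1, final true f, final estimate z3).
    """
    # Observer gains from a single bandwidth omega_o (all poles at -omega_o).
    beta1, beta2, beta3 = 3.0 * omega_o, 3.0 * omega_o**2, omega_o**3
    # PD gains from a controller bandwidth omega_c.
    kp, kd = omega_c**2, 2.0 * omega_c

    x1 = x2 = 0.0          # plant state (position, velocity)
    z1 = z2 = z3 = 0.0     # observer state (position, velocity, disturbance)
    f = 0.0
    for k in range(steps):
        t = k * dt
        f = 2.0 + 0.5 * math.sin(t)   # hypothetical lumped disturbance
        # Control: PD on the estimates, minus the estimated disturbance.
        u = (kp * (r - z1) - kd * z2 - z3) / b0
        e = x1 - z1                   # output estimation error
        dz1 = z2 + beta1 * e
        dz2 = z3 + b0 * u + beta2 * e
        dz3 = beta3 * e
        dx1, dx2 = x2, f + b0 * u
        # Forward-Euler update of plant and observer.
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        z1, z2, z3 = z1 + dt * dz1, z2 + dt * dz2, z3 + dt * dz3
    return x1, f, z3

if __name__ == "__main__":
    pos, f_true, f_hat = simulate_eso_joint()
    print(f"position={pos:.4f}, disturbance={f_true:.4f}, estimate={f_hat:.4f}")
```

Because each joint only needs its own position measurement and a nominal input gain `b0`, the observer can run independently per joint; the coupling between joints shows up only inside the lumped term that `z3` tracks.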
Pub Date: 2026-01-28 DOI: 10.1109/TASE.2026.3659028
Hongxiang Xue;Wen Zheng;Shiwei Lin;Yi Wang;Lei Bao;Xiumei Wang
Most self-driving laboratories (SDLs) rely on the assumption of a static and fully known experimental environment prior to execution, neglecting potential pre-experimental deviations and human errors that may trigger cascading failures during operation. To address these limitations, this paper presents a multi-agent SDL framework incorporating pre-execution environment perception. The system is designed to emulate human-like manipulation intelligence, integrating natural language interaction, synthesis planning, dexterous environment perception, cognitive task planning, and adaptive robot execution into a closed-loop workflow. The system is coordinated through six specialized agents (task clarification, synthesis planning, environment perception, robot planning, robot execution, and feedback), collectively enabling end-to-end automation from user intent to experimental realization. Validation was conducted on a customized automation platform using hydrogel and silicone preparation tasks, supported by a dedicated dataset encompassing these processes and multiple types of potential anomalies inconsistent with predefined conditions. Experimental results demonstrate high accuracy in both synthesis planning and environment perception (with perception accuracy reaching 99.86%), along with significant improvements in task success rates: hydrogel preparation increased from 0.64 to 0.90, and silicone preparation from 0.60 to 0.92. These findings confirm that integrating environment perception with multi-agent collaboration effectively enhances system robustness, safety, and adaptability under unexpected anomalies, underscoring the framework’s potential for scalable deployment in real-world laboratory environments. Note to Practitioners—This work aims to develop an SDL capable of operating robustly under the imperfect conditions of real laboratory environments. 
We introduce a multi-agent framework that enables the SDL to perceive its surroundings before executing an experimental protocol. By integrating perception and planning agents, the system verifies whether the physical environment aligns with the operational requirements. It can also interact with users in natural language to clarify tasks and automatically identify anomalies that violate predefined conditions. We validated the effectiveness of this framework on a custom automated platform for hydrogel and silicone synthesis. The results show a substantial improvement in task success rate, translating directly into higher efficiency and reduced material consumption. Practically, embedding environmental perception and multi-agent verification into the experimental workflow enables the creation of autonomous systems that are not only more reliable and safer but also suitable for everyday research environments. The core architecture is platform-agnostic and can be applied to other laboratory automation and robotic manipulation tasks, particularly those where pre-execution validation is critical. In terms of
{"title":"Environment-Aware Multi-Agent Framework for Self-Driving Laboratories","authors":"Hongxiang Xue;Wen Zheng;Shiwei Lin;Yi Wang;Lei Bao;Xiumei Wang","doi":"10.1109/TASE.2026.3659028","DOIUrl":"10.1109/TASE.2026.3659028","url":null,"abstract":"Most self-driving laboratories (SDLs) rely on the assumption of a static and fully known experimental environment prior to execution, neglecting potential pre-experimental deviations and human errors that may trigger cascading failures during operation. To address these limitations, this paper presents a multi-agent SDL framework incorporating pre-execution environment perception. The system is designed to emulate human-like manipulation intelligence, integrating natural language interaction, synthesis planning, dexterous environment perception, cognitive task planning, and adaptive robot execution into a closed-loop workflow. The system is coordinated through six specialized agents (task clarification, synthesis planning, environment perception, robot planning, robot execution, and feedback), collectively enabling end-to-end automation from user intent to experimental realization. Validation was conducted on a customized automation platform using hydrogel and silicone preparation tasks, supported by a dedicated dataset encompassing these processes and multiple types of potential anomalies inconsistent with predefined conditions. Experimental results demonstrate high accuracy in both synthesis planning and environment perception (with perception accuracy reaching 99.86%), along with significant improvements in task success rates: hydrogel preparation increased from 0.64 to 0.90, and silicone preparation from 0.60 to 0.92. These findings confirm that integrating environment perception with multi-agent collaboration effectively enhances system robustness, safety, and adaptability under unexpected anomalies, underscoring the framework’s potential for scalable deployment in real-world laboratory environments. 
Note to Practitioners—This work aims to develop an SDL capable of operating robustly under the imperfect conditions of real laboratory environments. We introduce a multi-agent framework that enables the SDL to perceive its surroundings before executing an experimental protocol. By integrating perception and planning agents, the system verifies whether the physical environment aligns with the operational requirements. It can also interact with users in natural language to clarify tasks and automatically identify anomalies that violate predefined conditions. We validated the effectiveness of this framework on a custom automated platform for hydrogel and silicone synthesis. The results show a substantial improvement in task success rate, translating directly into higher efficiency and reduced material consumption. Practically, embedding environmental perception and multi-agent verification into the experimental workflow enables the creation of autonomous systems that are not only more reliable and safer but also suitable for everyday research environments. The core architecture is platform-agnostic and can be applied to other laboratory automation and robotic manipulation tasks, particularly those where pre-execution validation is critical. In terms of","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"23 ","pages":"4579-4589"},"PeriodicalIF":6.4,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146070244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-28 DOI: 10.1109/TASE.2026.3658882
Shanrong Lin;Xiwei Liu
This article addresses the general decay synchronization problem for directed multiplex networks via delayed feedback control. Decay synchronization is regarded as a class of $\psi$-type synchronization, which derives from generalizations of the $\psi$-type function and $\psi$-type stability. By applying an appropriate nonlinear control to a portion of the multiplex network, synchronization of the network system is achieved with a decay rate. In comparison with previous multiplex network models, the present model allows outer matrices that are asymmetric, contain non-cooperative factors, and are not connected, together with inner matrices that have negative elements, which improves on existing results. We propose a synchronization method for multiplex networks under these conditions from the viewpoint of the inner matrices. It is proved that if the weighted union of the new matrices for each dimension is strongly connected, then decay synchronization and anti-synchronization can be realized under the delayed feedback controller. Moreover, some specific synchronization modes are illustrated in more detail. In addition, reaction-diffusion systems are treated as an application. Simulations are given to verify the validity of the obtained results. Note to Practitioners—The motivation of this paper is to generalize decay (anti-)synchronization of multiplex networks based on a nonlinear control with a feedback mechanism subject to time delay. Previous literature on multi-weighted networks assumed that the outer matrices of the network model were undirected, cooperative, and strongly connected. In contrast, each matrix in the present research can be directed, competitive, and even disconnected, so that more practical networks can be described. Unfortunately, previously used strategies do not work well in this general situation. 
Thus, a novel approach is proposed to resolve the difficulty of how to handle multiple matrices so as to guarantee decay synchronization. By virtue of a delayed feedback protocol designed from the viewpoint of the inner matrices, relevant decay synchronization criteria are obtained. With illustrations of different decay rates, a multiplex diffusion system is further developed to obtain additional rules, and the effectiveness of the results is demonstrated by simulations.
{"title":"General Decay Synchronization of Multiplex Networks via Delayed Feedback Control","authors":"Shanrong Lin;Xiwei Liu","doi":"10.1109/TASE.2026.3658882","DOIUrl":"10.1109/TASE.2026.3658882","url":null,"abstract":"This article addresses the general decay synchronization problem for directed multiplex networks via delayed feedback control. Decay synchronization is regarded as a class of $\\psi$-type synchronization, which derives from generalizations of the $\\psi$-type function and $\\psi$-type stability. By applying an appropriate nonlinear control to a portion of the multiplex network, synchronization of the network system is achieved with a decay rate. In comparison with previous multiplex network models, the present model allows outer matrices that are asymmetric, contain non-cooperative factors, and are not connected, together with inner matrices that have negative elements, which improves on existing results. We propose a synchronization method for multiplex networks under these conditions from the viewpoint of the inner matrices. It is proved that if the weighted union of the new matrices for each dimension is strongly connected, then decay synchronization and anti-synchronization can be realized under the delayed feedback controller. Moreover, some specific synchronization modes are illustrated in more detail. In addition, reaction-diffusion systems are treated as an application. Simulations are given to verify the validity of the obtained results. Note to Practitioners—The motivation of this paper is to generalize decay (anti-)synchronization of multiplex networks based on a nonlinear control with a feedback mechanism subject to time delay. Previous literature on multi-weighted networks assumed that the outer matrices of the network model were undirected, cooperative, and strongly connected. 
In contrast, each matrix in the present research can be directed, competitive, and even disconnected, so that more practical networks can be described. Unfortunately, previously used strategies do not work well in this general situation. Thus, a novel approach is proposed to resolve the difficulty of how to handle multiple matrices so as to guarantee decay synchronization. By virtue of a delayed feedback protocol designed from the viewpoint of the inner matrices, relevant decay synchronization criteria are obtained. With illustrations of different decay rates, a multiplex diffusion system is further developed to obtain additional rules, and the effectiveness of the results is demonstrated by simulations.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"23 ","pages":"4285-4300"},"PeriodicalIF":6.4,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146070249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-27 DOI: 10.1109/tase.2026.3658173
Huixin Jiang, Yana Yang, Changchun Hua, Junpeng Li
{"title":"A Novel ADP-based Neurooptimal Control Methodology for Teleoperation Systems under Interactive Shared-control Framework","authors":"Huixin Jiang, Yana Yang, Changchun Hua, Junpeng Li","doi":"10.1109/tase.2026.3658173","DOIUrl":"https://doi.org/10.1109/tase.2026.3658173","url":null,"abstract":"","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"102 1","pages":"1-1"},"PeriodicalIF":5.6,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146056058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-26 DOI: 10.1109/TASE.2026.3657889
Wei Yang;Yumei Ma;Qing-Guo Wang;Jinpeng Yu
This paper presents a learning-enhanced predefined-time adaptive optimal control strategy for a quadrotor uncrewed aerial vehicle subject to disturbances. First, a predefined-time disturbance observer with a tunable convergence time bound is developed to ensure rapid and accurate estimation. Within a command filtered backstepping architecture, actor-critic neural networks are incorporated to achieve adaptive optimal control with learning capability for both position and attitude subsystems. Specifically, novel learning laws facilitate the rapid online updating of network weights, where the critic network approximates the value function while the actor network optimizes the control policy to minimize control cost. The proposed framework effectively compensates for disturbances and filtered error effects, ensuring that all tracking errors converge within a predefined time. Rigorous analysis establishes the predefined-time stability of the closed-loop system. Finally, comparative simulation results are provided to demonstrate the effectiveness of the proposed strategy. Note to Practitioners—This work addresses the challenge for quadrotors to execute precise, reliable tasks such as inspection or delivery under real-world disturbances (e.g., wind). Advanced controllers often require expert tuning to balance rapid response with energy efficiency. The learning mechanism enables real-time adjustment of control policies, where neural networks continuously adapt the policy to minimize energy consumption while achieving accurate and rapid trajectory tracking. This reduces reliance on a perfect quadrotor model and automates tuning. A key feature is the guaranteed convergence of tracking errors within a user-defined time, critical for time-sensitive operations. A developed disturbance observer estimates and compensates for disturbances like wind in real-time. This approach is suited for automation scenarios requiring high precision, rapid response, and disturbance rejection. 
Implementation requires adequate onboard computation and sensor accuracy. Future work will simplify tuning and extend to multi-quadrotor coordination.
{"title":"Learning-Enhanced Predefined-Time Adaptive Optimal Control for Quadrotors With Disturbances","authors":"Wei Yang;Yumei Ma;Qing-Guo Wang;Jinpeng Yu","doi":"10.1109/TASE.2026.3657889","DOIUrl":"10.1109/TASE.2026.3657889","url":null,"abstract":"This paper presents a learning-enhanced predefined-time adaptive optimal control strategy for a quadrotor uncrewed aerial vehicle subject to disturbances. First, a predefined-time disturbance observer with a tunable convergence time bound is developed to ensure rapid and accurate estimation. Within a command filtered backstepping architecture, actor-critic neural networks are incorporated to achieve adaptive optimal control with learning capability for both position and attitude subsystems. Specifically, novel learning laws facilitate the rapid online updating of network weights, where the critic network approximates the value function while the actor network optimizes the control policy to minimize control cost. The proposed framework effectively compensates for disturbances and filtered error effects, ensuring that all tracking errors converge within a predefined time. Rigorous analysis establishes the predefined-time stability of the closed-loop system. Finally, comparative simulation results are provided to demonstrate the effectiveness of the proposed strategy. Note to Practitioners—This work addresses the challenge for quadrotors to execute precise, reliable tasks such as inspection or delivery under real-world disturbances (e.g., wind). Advanced controllers often require expert tuning to balance rapid response with energy efficiency. The learning mechanism enables real-time adjustment of control policies, where neural networks continuously adapt the policy to minimize energy consumption while achieving accurate and rapid trajectory tracking. This reduces reliance on a perfect quadrotor model and automates tuning. 
A key feature is the guaranteed convergence of tracking errors within a user-defined time, critical for time-sensitive operations. A developed disturbance observer estimates and compensates for disturbances like wind in real-time. This approach is suited for automation scenarios requiring high precision, rapid response, and disturbance rejection. Implementation requires adequate onboard computation and sensor accuracy. Future work will simplify tuning and extend to multi-quadrotor coordination.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"23 ","pages":"4261-4271"},"PeriodicalIF":6.4,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146056057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
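The notion of convergence "within a user-defined time" in the abstract above can be illustrated on a scalar error. The sketch below is not the quadrotor controller from the paper; it assumes a scalar law of a form commonly used in the predefined-time literature, x' = -(π/(p·T_c))·(|x|^(1-p/2) + |x|^(1+p/2))·sign(x) with 0 < p < 1. With the substitution w = |x|^(p/2), the settling time works out to (2·T_c/π)·arctan(|x0|^(p/2)), which never exceeds T_c whatever the initial error. The function name, step size, and tolerance are illustrative assumptions.

```python
import math

def settle_time(x0, t_c=1.0, p=0.5, dt=1e-4, tol=1e-6):
    """Integrate x' = -(pi/(p*t_c)) * (|x|^(1-p/2) + |x|^(1+p/2)) * sign(x).

    Returns the first time at which |x| < tol, or None if that does not
    happen within 2*t_c.
    """
    gain = math.pi / (p * t_c)
    x, t = float(x0), 0.0
    while t <= 2.0 * t_c:
        a = abs(x)
        if a < tol:
            return t
        # Sub-linear term dominates near zero, super-linear term far from zero.
        rate = gain * (a ** (1.0 - p / 2.0) + a ** (1.0 + p / 2.0))
        x -= dt * rate * math.copysign(1.0, x)   # forward-Euler step
        t += dt
    return None

if __name__ == "__main__":
    for x0 in (0.1, 10.0, -1000.0):
        print(f"x0={x0:>8}: settled at t={settle_time(x0):.3f} (bound t_c=1.0)")
```

The point of the example is that `t_c` is a design parameter: the bound holds uniformly over initial conditions, unlike finite-time laws whose settling time grows with the initial error.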
Pub Date: 2026-01-23 DOI: 10.1109/tase.2026.3657700
Hyungjin Kim, Aerim Hwang, Shing Chih Tsai, Chuljin Park
{"title":"Stochastic Kriging-assisted Controlled Random Search for Simulation Optimization and Its Application to Critical Dimension Measurement","authors":"Hyungjin Kim, Aerim Hwang, Shing Chih Tsai, Chuljin Park","doi":"10.1109/tase.2026.3657700","DOIUrl":"https://doi.org/10.1109/tase.2026.3657700","url":null,"abstract":"","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"117 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146042686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-23 DOI: 10.1109/TASE.2026.3657596
Shengwang An;Xinghui Dong
Large Vision-Language Models (LVLMs) mainly rely on template-generated textual descriptions to understand defects. This reliance impairs the performance of these models for Industrial Defect Detection (IDD) because they typically lack specialized knowledge. On the other hand, the majority of existing IDD methods only utilize the contrastive loss function for image-to-text feature alignment, which limits their ability to focus on defective regions. In addition, these methods usually use cosine similarity for contextual learning, which also restricts their ability to understand and adapt to complex contexts. To address these issues, we first collect a large-scale defect data set with textual descriptions, namely, the Text-Augmented Defect Data Set (TADD), to fine-tune an LVLM for defect description. We also propose a Self-prompted Generic Defect Diagnosis (including Defect Detection and Defect Description) LVLM, i.e., the SPGDD-GPT. This method can effectively utilize contextual information through a Multi-scale Self-prompted Memory Module (MSSPMM) and a Text-Driven Defect Focuser (TDDF), which we designed specifically to adapt to unseen defect categories and focus on abnormal regions. Experimental results show that our method generally achieves better performance than its counterparts across the 21 subsets of TADD under the 1-shot, 2-shot and 4-shot defect detection settings, demonstrating strong detection and generalization capabilities. The source code, model, and data set are available at https://github.com/INDTLab/SPGDD-GPT. The proposed method can also generate a textual description of the defects contained in each test image. These promising results are attributed to the proposed MSSPMM and TDDF and to the large-scale TADD. Note to Practitioners—The proposed SPGDD-GPT is developed on top of an LVLM. 
It is specifically designed for the few-shot defect diagnosis task, including defect detection and defect description, which requires only a small number of training images. In real-world scenarios, the TADD effectively addresses the lack of detailed textual descriptions in training data, significantly alleviating the challenge of scarce textual data commonly encountered by practitioners in the field of defect diagnosis. By integrating a Text-Driven Defect Focuser (TDDF) and a Multi-scale Self-prompted Memory Module (MSSPMM), the SPGDD-GPT improves the alignment between visual and textual information, thereby improving the adaptability and robustness of the model in various scenarios. The TDDF explicitly adjusts the distance between normal and abnormal text embeddings through boundary hyperparameters, and achieves precise defect detection by reducing the Euclidean distance between abnormal image features and abnormal text representations, while the MSSPMM uses multi-scale normal samples as self-prompts which allow the model to rapidly adapt to novel object categories with limited samples and effectively attend to defective regions. Furthe
{"title":"SPGDD-GPT: Image-Text-Driven Generic Defect Diagnosis Using a Self-Prompted Large Vision-Language Model","authors":"Shengwang An;Xinghui Dong","doi":"10.1109/TASE.2026.3657596","DOIUrl":"10.1109/TASE.2026.3657596","url":null,"abstract":"Large Vision-Language Models (LVLMs) mainly rely on template-generated textual descriptions to understand defects. This reliance impairs the performance of these models for Industrial Defect Detection (IDD) because they typically lack specialized knowledge. On the other hand, the majority of existing IDD methods only utilize the contrastive loss function for image-to-text feature alignment, which limits their ability to focus on defective regions. In addition, these methods usually use cosine similarity for contextual learning, which also restricts their ability to understand and adapt to complex contexts. To address these issues, we first collect a large-scale defect data set with textual descriptions, namely, the Text-Augmented Defect Data Set (TADD), to fine-tune an LVLM for defect description. We also propose a Self-prompted Generic Defect Diagnosis (including Defect Detection and Defect Description) LVLM, i.e., the SPGDD-GPT. This method can effectively utilize contextual information through a Multi-scale Self-prompted Memory Module (MSSPMM) and a Text-Driven Defect Focuser (TDDF), which we designed specifically to adapt to unseen defect categories and focus on abnormal regions. Experimental results show that our method generally achieves better performance than its counterparts across the 21 subsets of TADD under the 1-shot, 2-shot and 4-shot defect detection settings, demonstrating strong detection and generalization capabilities. The source code, model, and data set are available at https://github.com/INDTLab/SPGDD-GPT. The proposed method can also generate a textual description of the defects contained in each test image. 
These promising results are attributed to the proposed MSSPMM and TDDF and to the large-scale TADD. Note to Practitioners—The proposed SPGDD-GPT is developed on top of an LVLM. It is specifically designed for the few-shot defect diagnosis task, including defect detection and defect description, which requires only a small number of training images. In real-world scenarios, the TADD effectively addresses the lack of detailed textual descriptions in training data, significantly alleviating the challenge of scarce textual data commonly encountered by practitioners in the field of defect diagnosis. By integrating a Text-Driven Defect Focuser (TDDF) and a Multi-scale Self-prompted Memory Module (MSSPMM), the SPGDD-GPT improves the alignment between visual and textual information, thereby improving the adaptability and robustness of the model in various scenarios. The TDDF explicitly adjusts the distance between normal and abnormal text embeddings through boundary hyperparameters, and achieves precise defect detection by reducing the Euclidean distance between abnormal image features and abnormal text representations, while the MSSPMM uses multi-scale normal samples as self-prompts which allow the model to rapidly adapt to novel object categories with limited samples and effectively attend to defective regions. Furthe","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"23 ","pages":"4247-4260"},"PeriodicalIF":6.4,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146042683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
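The abstract above describes the TDDF as separating normal and abnormal text embeddings by a boundary hyperparameter while reducing the Euclidean distance between abnormal image features and abnormal text representations. One possible reading of that description is a pull term plus a hinge-style push term, sketched below. The exact loss used in SPGDD-GPT is not given in the abstract, so the functional form, the function names, and the `margin` parameter are all assumptions made for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def text_driven_margin_loss(img_abnormal, txt_abnormal, txt_normal, margin=1.0):
    """Hypothetical loss combining the two effects named in the abstract.

    pull: squared distance between the abnormal image feature and the
          abnormal text embedding (smaller is better).
    push: hinge penalty, active only while the normal and abnormal text
          embeddings are closer than `margin` (the assumed boundary
          hyperparameter).
    """
    pull = euclidean(img_abnormal, txt_abnormal) ** 2
    gap = euclidean(txt_normal, txt_abnormal)
    push = max(0.0, margin - gap) ** 2
    return pull + push

if __name__ == "__main__":
    # Aligned image/text pair, text embeddings already separated: no penalty.
    print(text_driven_margin_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]))
    # Misaligned image feature and collapsed text embeddings: both terms fire.
    print(text_driven_margin_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]))
```

Using a distance with an explicit margin, rather than cosine similarity alone, makes the separation between the normal and abnormal text prototypes a directly tunable quantity, which is the role the abstract assigns to the boundary hyperparameters.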
Pub Date: 2026-01-23 DOI: 10.1109/tase.2026.3657678
Aliakbar Davoodi, Ahmad W. Al-Dabbagh
{"title":"An Adaptive Graph-based Approach for Similarity Analysis and Early Classification of Alarm Floods","authors":"Aliakbar Davoodi, Ahmad W. Al-Dabbagh","doi":"10.1109/tase.2026.3657678","DOIUrl":"https://doi.org/10.1109/tase.2026.3657678","url":null,"abstract":"","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"61 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146042688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}