
Robotics and Computer-integrated Manufacturing: Latest Articles

Intelligent tool wear monitoring approach in milling of titanium alloys
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-11 DOI: 10.1016/j.rcim.2025.103181
Shucai Yang, Runjie Jiang, Zekun Song, Dongqi Yu
Tool wear exerts a critical influence on machining stability and workpiece quality, making accurate, intelligent monitoring indispensable for preventing tool failure and ensuring product consistency. Although direct assessment via wear imagery is possible, it requires interrupting the machining process and is thus impractical for real-time production. A more viable solution is to leverage in-process signals, such as vibration, to enable continuous monitoring. Here, we present a signal processing method that applies Beluga whale optimization-successive variational mode decomposition (BWO-SVMD) for noise suppression, followed by the S-transform to produce high-resolution time–frequency representations. Based on these denoised spectrograms, we develop an intelligent monitoring model that integrates a multi-scale convolutional neural network (MSCNN), long short-term memory (LSTM) units, and a channel–spatial attention mechanism. Experimental results demonstrate that our model achieves 96.25% classification accuracy, a Kappa coefficient of 0.9686, and a total computation time of 320.64 s. Compared with the CNN-LSTM-Attention, MSCNN-Attention, and MSCNN-LSTM baselines, it improves average accuracy by 1.89%, 8.02%, and 6.67% and Kappa by 0.0732, 0.1374, and 0.2009, respectively. Although training time increases by 10.2%–14.2%, the substantial gains in predictive performance justify the additional computational cost.
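The abstract reports both classification accuracy and a Kappa coefficient. As a reminder of how the two metrics are derived from the same predictions, the sketch below computes accuracy and Cohen's kappa from a confusion matrix. The function name and the three-class matrix are illustrative assumptions with made-up counts, not data from the paper.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    # p_o: observed agreement, i.e. the fraction of correctly classified samples.
    observed = np.trace(cm) / total
    # p_e: agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical 3-class wear-state confusion matrix (made-up counts).
example = [[48, 2, 0],
           [3, 45, 2],
           [0, 1, 49]]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.4f}, kappa={kappa:.4f}")
```

Kappa discounts the agreement that a classifier would reach by chance, which is why it is often reported alongside accuracy for multi-class wear-state problems.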
Volume 98, Article 103181
Citations: 0
Topology matrix-based interlocking path planning method for robotic additive manufacturing of thin-walled multi-rib structures
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-11 DOI: 10.1016/j.rcim.2025.103182
Tao Zhao, Zhaoyang Yan, Xiaoyong Zhang, Runsheng Li, Kehong Wang, Shujun Chen
Thin-walled multi-rib structures are widely used in high-end manufacturing sectors such as aerospace and defense equipment due to their high strength-to-weight ratio. However, traditional manufacturing methods face challenges including prolonged processing cycles and low material utilization. Arc-based directed energy deposition (DED-Arc) technology, characterized by its high efficiency and flexibility, offers a novel approach for the rapid fabrication of thin-walled multi-rib structures. This study focuses on high-ribbed panels in thin-walled multi-rib structures, analyzing their common structural characteristics and proposing a unified path planning method based on an interlocking topology matrix. A standardized topological matrix data structure was developed to describe the medial-axis nodes and topological relationships of high-ribbed panels. A unified path search algorithm was designed based on the topological matrix, employing an alternating search strategy (X-direction for odd layers and Y-direction for even layers) to generate continuous deposition paths. By strategically offsetting the printing contours between adjacent layers, the method achieves topological dispersion and mutual interlocking of weak points across sliced layers. The cross-regions were specifically optimized to ensure overlap-free deposition paths and a rational distribution of arc ignition/extinction positions, effectively reducing the number of arc ignition/extinction events and improving forming quality. Deposition experiments on four typical thin-walled multi-rib structures demonstrated that the interlocking path planning method significantly enhances surface quality by mitigating height differences at arc ignition/extinction points and improving overlap at intersections, while maintaining overall height errors within 3 mm. The results demonstrate that the proposed method improves manufacturing efficiency and forming quality, supporting DED-Arc applications in lightweight structures.
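To make the alternating search strategy concrete, the toy sketch below scans a node-occupancy matrix along X for odd layers and along Y for even layers, and groups adjacent nodes into straight runs. The matrix layout, the function name, and the rule of discarding single-node runs are assumptions for illustration only; the abstract does not specify the actual topology matrix format or search algorithm.

```python
def layer_segments(node_grid, layer):
    """Group occupied grid nodes into straight runs for one layer.

    node_grid[r][c] is truthy where a medial-axis node exists (assumed layout).
    Odd layers scan along X (row by row), even layers along Y (column by column).
    Returns a list of runs, each a list of (row, col) nodes; isolated nodes
    are dropped because they do not form a segment in this toy model.
    """
    rows, cols = len(node_grid), len(node_grid[0])
    scan_x = layer % 2 == 1
    outer, inner = (rows, cols) if scan_x else (cols, rows)
    runs = []
    for i in range(outer):
        current = []
        for j in range(inner):
            r, c = (i, j) if scan_x else (j, i)
            if node_grid[r][c]:
                current.append((r, c))
            else:
                if len(current) > 1:
                    runs.append(current)
                current = []
        if len(current) > 1:
            runs.append(current)
    return runs

# Hypothetical 3x3 rib layout with no node in the centre.
grid = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
print(layer_segments(grid, layer=1))  # runs along X
print(layer_segments(grid, layer=2))  # runs along Y
```

In this simplified picture, the run endpoints fall at different nodes on odd and even layers, which is the intuition behind dispersing weak points across layers.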
Volume 98, Article 103182
Citations: 0
AI-empowered die design framework for giga-casting
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-10 DOI: 10.1016/j.rcim.2025.103179
Quanzhi Sun, Weipeng Liu, Tao Peng, Peng Zhao
Giga-casting has been developing rapidly in the automotive industry since 2019, showing great advantages for lightweighting and production efficiency. Among many influential factors, die design is particularly critical for the quality of giga-casting components, as it governs the molten metal filling and solidification process. However, die design for giga-casting components faces significant challenges due to their large size, complex structure, and stringent performance requirements. The corresponding filling and solidification processes have become increasingly complex to control, rendering traditional experience-based methods inadequate and leading to time-consuming yet insufficient designs. Recent engineering applications of Artificial Intelligence (AI) demonstrate great potential in complex product design, but how to effectively realize AI-empowered die design has received little attention. This paper conducts a comprehensive review of die design, identifies the key challenges and enabling factors of AI in this context, and elaborates on the proposed technical framework. The two major contributions are: 1) A four-stage evolution of casting die design is systematically analyzed to highlight existing research gaps. 2) A three-component technical framework of AI-empowered die design for giga-casting is proposed. The key enabling technologies and challenges in this framework are carefully discussed. It is envisioned that this study will establish a new procedure to improve die design efficiency.
Volume 98, Article 103179
Citations: 0
Sparse-VMICP: A weak feature point cloud registration algorithm for robotic vision measurement of large complex components
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-06 DOI: 10.1016/j.rcim.2025.103175
Xiaozhi Feng, Tao Ding, Hao Wu, Di Li, Ning Jiang, Dahu Zhu
High-precision three-dimensional (3D) measurement of large complex components (LCCs) such as vehicle bodies provides a data benchmark for subsequent robotized manufacturing processes. A major challenge in LCC measurement is registering adjacent point clouds with partial overlap, especially when the point cloud geometric features are weak. Although the existing sparse iterative closest point (Sparse-ICP) registration algorithm uses the l_p norm to reduce the influence of non-overlapping points during registration, sparse point pairs are prone to falling into local optima, which makes the registration accuracy highly sensitive to the initial pose. To overcome this problem, we retain the advantage of the Sparse-ICP algorithm that the point-to-point distance can suppress tangential slip in smooth areas. On this basis, we introduce a constraint of point-to-plane distance variance minimization under the sparse condition, which suppresses the incorrect registration inclination caused by uneven point cloud density, and then present a hybrid algorithm termed Sparse-VMICP for weak-feature point cloud registration. The proposed algorithm aims to enhance robotic vision measurement accuracy by suppressing registration inclination to adjust the local optimal solution. Robotic vision measurement experiments on two typical LCCs, a high-speed rail body and a car body, are conducted to verify the superiority of the proposed algorithm. The results demonstrate that, compared with other state-of-the-art algorithms, the proposed algorithm effectively reduces the accumulated registration errors in large-scale metrology, and the stitching measurement accuracy of LCCs can reach 0.012 mm.
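The two residual terms named in the abstract can be illustrated with a short sketch: a sparsity-inducing l_p cost on point-to-point distances, and the variance of signed point-to-plane distances. How Sparse-VMICP actually weights and combines these terms is not stated in the abstract, so the function name, the default p = 0.5, and the toy data are assumptions, and the pairs are taken as already matched.

```python
import numpy as np

def registration_residuals(src, tgt, normals, p=0.5):
    """Evaluate two residual terms for already-matched point pairs.

    src, tgt: (N, 3) arrays of corresponding points.
    normals:  (N, 3) unit normals at the target points.
    p:        exponent of the sparsity-inducing l_p cost (0 < p <= 1).
    Returns (lp_cost, plane_variance).
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    normals = np.asarray(normals, dtype=float)
    diff = src - tgt
    # Point-to-point term: sum of Euclidean distances raised to the power p.
    lp_cost = np.sum(np.linalg.norm(diff, axis=1) ** p)
    # Signed distance of each source point to the tangent plane at its match.
    point_to_plane = np.einsum("ij,ij->i", diff, normals)
    # Variance is zero for a uniform offset and grows when the scan is tilted.
    plane_variance = np.var(point_to_plane)
    return lp_cost, plane_variance

# Toy data: a flat target patch and a source patch tilted about the Y axis.
tgt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
normals = np.tile([0.0, 0.0, 1.0], (3, 1))
tilted = tgt + np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.1], [0.0, 0.0, 0.2]])
print(registration_residuals(tilted, tgt, normals))
```

In this toy setup a constant offset along the normal leaves the variance at zero while a tilt does not, which matches the abstract's description of the variance constraint as a way to suppress registration inclination.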
Volume 98, Article 103175
Citations: 0
From insight to autonomous execution: VLM-enhanced embodied agents towards digital twin-assisted human-robot collaborative assembly
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-05 DOI: 10.1016/j.rcim.2025.103176
Changchun Liu, JiaYe Song, Dunbing Tang, Liping Wang, Haihua Zhu, Qixiang Cai
In recent years, embodied intelligence has emerged as a practicable strategy for achieving human-level cognitive abilities, reasoning capacities, and execution capabilities within human-robot collaborative (HRC) assembly scenarios. As the physical instantiation of embodied intelligence, embodied agents remain largely in the exploratory phase; their practical application has yet to mature into a standardized paradigm. A key bottleneck lies in the lack of universally applicable enabling technologies, coupled with a disconnection from physical robot control systems. This deficiency necessitates repeated training of a variety of functional models when operating in dynamic HRC environments, significantly hindering the ability of embodied agents to adapt to complicated, dynamically changing collaborative settings. To address this challenge, this study proposes VLM-enhanced embodied agents, specifically tailored to support multimodal cognition, task reasoning, and autonomous execution in digital twin-assisted HRC assembly contexts. The framework is structured in several core steps to realize a full-process closed loop, from insight to autonomous execution, for robots supported by embodied intelligent agents. First, a precise epsilon map relation between the embodied agent and the physical cobot is constructed, thereby enabling the digital characterization and functional encapsulation of embodied agents. Building on this agent-based framework, a VLM is developed that integrates domain-specific knowledge with real-time scenario information. This dual-driven design endows the VLM with enhanced perceptual capabilities, allowing it to rapidly recognize and respond to dynamic changes in HRC scenarios. To provide a simulation and deduction engine for embodied reasoning about the assembly task, a digital twin model of the HRC scenario is built to serve as the “embodied brain”. Subsequently, the reasoning results are fed into the VLM as invoking parameters for the corresponding sub-function code module. This process facilitates the generation of complete robot motion code, enabling seamless physical execution and thus functioning as the “embodied neuron”. Finally, comparative experiments are conducted in an actual HRC assembly environment. The experimental results demonstrate that the proposed VLM-enhanced embodied agents have competitive advantages in multimodal cognition, task reasoning, and autonomous execution.
Volume 98, Article 103176
Citations: 0
Autonomous robotic screwdriving for high-mix manufacturing
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-04 DOI: 10.1016/j.rcim.2025.103172
Omey M. Manyar, Rutvik Patel, Satyandra K. Gupta
Screwdriving is a crucial task routinely performed during assembly, yet most current automation techniques focus on mass manufacturing environments, where part variability is typically low. However, a substantial portion of manufacturing falls under high-mix production, which entails significant uncertainties due to limited fixtures and cost constraints on tooling, making it predominantly manual. In this paper, we present an autonomous mobile robotic screwdriving system suitable for high-mix, low-volume manufacturing applications and designed to operate under semi-structured conditions, handling hole pose uncertainties of up to 4 mm/3°. To enhance decision-making and operational efficiency, we develop a physics-informed machine-learning model that predicts nonlinear screw-tip dynamics in Cartesian space. Additionally, we propose a decision tree-based failure detection framework that identifies four distinct failure modes using force signals from the robot’s end effector. We further introduce a novel fifth failure mode, a time-based threshold for unsuccessful insertions, where our dynamics model is used to determine when to reattempt screwdriving. This integration of predictive modeling, real-time failure detection, and alert generation for human-in-the-loop decision-making improves system resilience. Our failure detection method achieves an F1-score of 0.94 on validation data and a perfect recall of 1.0 on testing. We validate our approach through screwdriving experiments on 10 real-world industrial parts using three different screw types, demonstrating the system’s robustness and adaptability in a high-mix setting.
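The failure detection results are stated as an F1-score and a recall. For readers less familiar with these metrics, the sketch below derives precision, recall, and F1 from true-positive, false-positive, and false-negative counts. The function name and the counts are hypothetical and are not taken from the paper's experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw detection counts.

    tp: failures correctly detected, fp: false alarms, fn: missed failures.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: no missed failures, a few false alarms.
print(precision_recall_f1(tp=40, fp=5, fn=0))
```

A recall of 1.0 means that no failure was missed (fn = 0); it says nothing about false alarms, which is why recall is usually read together with precision or F1.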
Volume 98, Article 103172
Citations: 0
Identification and three-dimensional absorption of time-varying potential chatter during robotic milling
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-01 DOI: 10.1016/j.rcim.2025.103173
Jiawei Wu, Rui Fu, Xiaowei Tang, Shihao Xin, Fangyu Peng, Chenyang Wang
Robotic milling constitutes an important component of robotized intelligent manufacturing, gaining increasing popularity for subtractive manufacturing of large components. Extensive efforts have been devoted to the analysis and suppression of robot chatter to enhance milling efficiency and quality. However, the dynamic characteristics of robots are highly pose-dependent, leading to time-varying low-frequency chatter. Meanwhile, the low-frequency chatter is continuously influenced by the action of vibration suppression devices, making it challenging to consistently track and suppress time-varying chatter. To address this, this paper proposes a new concept, the potential chatter mode, to more accurately describe the target mode that requires attention in online chatter suppression. Inspired by the modulation mechanism between modal vibrations and spindle rotation during robotic milling, a potential chatter mode identification framework is developed. By investigating the distribution pattern of vibration spectra under the modulation mechanism, and integrating filtering, demodulation, signal decomposition, and vibration energy evaluation, the framework achieves online identification of the time-varying frequency of potential chatter. Furthermore, the potential chatter exhibits a three-dimensional time-varying direction, whereas existing suppression devices are generally designed to operate in one or two directions. This paper develops a novel three-dimensional orthogonal adaptive vibration absorber (TO-AVA) based on magnetorheological elastomers (MRE). By incorporating a parallel negative stiffness mechanism and parameter design, the TO-AVA can handle the three-dimensional time-varying direction of potential chatter. Validation experiments on robotic milling are conducted, involving various process parameters and time-varying potential chatter across different directions, frequencies, and states. The results demonstrate that the developed framework can accurately identify time-varying potential chatter and effectively suppress it using the TO-AVA.
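As a rough illustration of the spectrum-based identification idea described in this abstract, the sketch below masks spectral bins near multiples of the spindle rotation frequency and reports the strongest remaining peak as a candidate chatter frequency. It is a simplified stand-in, not the authors' framework: the function name `candidate_chatter_frequency`, the `guard_hz` parameter, and the synthetic test signal are all assumptions made for this example, and the demodulation and signal decomposition steps mentioned in the abstract are omitted.

```python
import numpy as np

def candidate_chatter_frequency(signal, fs, spindle_hz, guard_hz=2.0):
    """Return the frequency (Hz) of the strongest spectral peak that lies
    more than guard_hz away from every multiple of the spindle frequency.

    Hypothetical helper for illustration only.
    """
    n = len(signal)
    # Power spectrum of the Hann-windowed signal.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(n))) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Nearest spindle harmonic (including 0 Hz) for each frequency bin.
    nearest = np.round(freqs / spindle_hz) * spindle_hz
    keep = np.abs(freqs - nearest) > guard_hz
    # Zero out bins close to spindle harmonics, then take the largest peak.
    masked = np.where(keep, spectrum, 0.0)
    return float(freqs[np.argmax(masked)])

# Synthetic vibration signal (assumed values): two spindle harmonics plus a
# weaker low-frequency component standing in for a chatter mode at 23 Hz.
fs = 2000.0
t = np.arange(0, 2.0, 1.0 / fs)
spindle_hz = 50.0
signal = (
    1.0 * np.sin(2 * np.pi * spindle_hz * t)
    + 0.5 * np.sin(2 * np.pi * 2 * spindle_hz * t)
    + 0.3 * np.sin(2 * np.pi * 23.0 * t)
)
estimate = candidate_chatter_frequency(signal, fs, spindle_hz)
print(f"candidate chatter frequency: {estimate:.1f} Hz")
```

Running the function on successive short windows would give a crude track of how the candidate frequency drifts over time; a real system would also need the energy evaluation step to decide whether the candidate actually matters.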
Citations: 0
You are my eyes: Integrating human intelligence and LLMs in AR-assisted motion planning for industrial mobile robots
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-10-30 DOI: 10.1016/j.rcim.2025.103174
Shuguang Liu, Jiacheng Xie, Xuewen Wang, Xiaojun Qiao
Robot operation follows a perception–decision–execution loop, where motion planning is a critical stage of decision-making that occurs after task planning to ensure precise and efficient execution. Under the demands of smart manufacturing and flexible production, motion planning for industrial robots in dynamic and unstructured environments is particularly important. Large Language Models (LLMs), with strong capabilities in language understanding and logical reasoning, have shown potential in robot motion planning, particularly when combined with Vision-Language Models (VLMs). However, existing approaches rely on the models’ intrinsic understanding, which is constrained by insufficient domain knowledge in industrial scenarios and often requires customized training and fine-tuning, resulting in high cost and poor generalizability. Industry 5.0 emphasizes a human-centric value orientation and a production model of human–robot collaboration. Against this backdrop, an Augmented Reality (AR)-assisted motion planning method for industrial mobile robots is proposed. The method transforms human perceptual results into the geometric and semantic information of key task elements through AR manual annotation, which is then input into LLMs as known conditions to enable motion planning in complex scenarios. It fully leverages human advantages in spatial perception and fundamentally avoids the limitations of LLMs in understanding industrial environments. Furthermore, a two-level motion planning architecture for industrial mobile robots is proposed to serve as planning constraints for LLMs, improving planning efficiency. A proof of concept (PoC) on mechanical equipment maintenance demonstrates the method’s feasibility and effectiveness in industrial tasks, while additional experiments substantiate its contributions of low cost, high reliability, and zero-shot transferability.
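To make the idea of feeding annotations to an LLM "as known conditions" more concrete, the sketch below serialises a few operator-marked elements into a text prompt. Everything here is a hypothetical illustration rather than the paper's implementation: the `AnnotatedElement` schema, the field names, the coordinate convention, and the prompt wording are assumptions, and no LLM is actually called.

```python
from dataclasses import dataclass
import json

@dataclass
class AnnotatedElement:
    """One task element marked by the operator in AR (hypothetical schema)."""
    label: str        # semantic role, e.g. "obstacle" or "target"
    center_m: tuple   # (x, y, z) position in the robot's map frame, metres
    size_m: tuple     # axis-aligned bounding-box extents, metres

def build_planning_prompt(task, elements):
    """Place the annotated geometry and semantics ahead of the task request."""
    scene = [
        {"label": e.label, "center_m": list(e.center_m), "size_m": list(e.size_m)}
        for e in elements
    ]
    return (
        "Known scene elements (operator-annotated, map frame, metres):\n"
        + json.dumps(scene, indent=2)
        + "\nTask: " + task
        + "\nReturn an ordered list of waypoints that avoids every obstacle."
    )

# Example annotations with made-up coordinates.
elements = [
    AnnotatedElement("target", (2.0, 1.5, 0.0), (0.6, 0.6, 1.2)),
    AnnotatedElement("obstacle", (1.0, 0.8, 0.0), (0.4, 0.4, 0.9)),
]
prompt = build_planning_prompt("Drive to the target for maintenance.", elements)
print(prompt)
```

Because the geometry arrives as explicit numbers, the model in this sketch would not need to infer the scene from images, which is the advantage the abstract attributes to human annotation.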
Citations: 0
Stable grasp generation enabled by part segmentation for real-world robotic applications
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-10-28 DOI: 10.1016/j.rcim.2025.103170
Zirui Guo, Xieyuanli Chen, Junkai Ren, Zhiqiang Zheng, Huimin Lu, Ruibin Guo
Robotic manipulation requires advanced perception and grasp generation. Previous approaches to object perception in manipulation mainly rely on raw point clouds captured from vision sensors, which are inherently limited in viewing perspective and receive no further analysis. This research introduces implicit representation to facilitate part segmentation from imaging sensors, generating 3D models with structural information that give grasp generation algorithms more useful input. Regarding robotic grasping, prior methods mostly rely on deep learning, which performs satisfactorily on particular datasets yet raises concerns about generalization. Instead, this article proposes a novel grasp generation method based on 3D part segmentation, which circumvents the reliance on deep learning techniques. Extensive experimental results show that our approach can proficiently generate approximate part segmentation and high-success-rate grasps for various objects. By integrating part segmentation with grasp generation, the robot achieves accurate autonomous manipulation, as shown in the supplementary video.
Citations: 0
A two-stage framework for learning human-to-robot object handover policy from 4D spatiotemporal flow
IF 11.4 Zone 1 Computer Science Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-10-24 DOI: 10.1016/j.rcim.2025.103171
Ruirui Zhong, Bingtao Hu, Zhihao Liu, Qiang Qin, Yixiong Feng, Xi Vincent Wang, Lihui Wang, Jianrong Tan
Natural and safe Human-to-Robot (H2R) object handover is a critical capability for effective Human–Robot Collaboration (HRC). However, learning a robust handover policy for this task is often hindered by the prohibitive cost of collecting physical robot demonstrations and the limitations of simplistic state representations that inadequately capture the complex dynamics of the interaction. To address these challenges, a two-stage learning framework is proposed that synthesizes substantially augmented, synthetically diverse handover demonstrations without requiring a physical robot and subsequently learns a handover policy from a rich 4D spatiotemporal flow. First, an offline, physical robot-free data-generation pipeline is introduced that produces augmented and diverse handover demonstrations, thereby eliminating the need for costly physical data collection. Second, a novel 4D spatiotemporal flow is defined as a comprehensive representation consisting of a skeletal kinematic flow that captures high-level motion dynamics and a geometric motion flow that characterizes fine-grained surface interactions. Finally, a diffusion-based policy conditioned on this spatiotemporal representation is developed to generate coherent and anticipatory robot actions. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art baselines in task success, efficiency, and motion quality, thereby paving the way for safer and more intuitive collaborative robots.
Citations: 0