Associated Learning Architecture Equipped With Parallel Impedance Sensing Strategy to Enhance Cross-Modal Object Perception

IEEE Transactions on Instrumentation and Measurement | IF 5.9 | Zone 2 (Engineering & Technology) | JCR Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-02-10 | DOI: 10.1109/TIM.2025.3534219

Zhibin Li;Yuxuan Zhang;Weirong Dong;Jing Yang;Jiansong Feng;Chengbi Zhang;Xiaolong Chen;Taihong Wang

Volume 74, pages 1-11 | Journal Article | https://ieeexplore.ieee.org/document/10879100/

Citations: 0

Abstract



Associated Learning Architecture Equipped With Parallel Impedance Sensing Strategy to Enhance Cross-Modal Object Perception

Humans can infer the shape and material characteristics of objects they grasp from multisensory information, which remains a technical challenge for modern robots. Cross-modal object perception holds promise for helping robots execute operations and interactive tasks in complex applications, particularly in harsh visual scenes. Here, we present an associated learning architecture equipped with a parallel impedance sensing strategy. It enhances the perception of captured objects by integrating visual data with somatosensory data, namely frequency division multiplexing (FDM) parallel impedance measurements and the finger bending angles of the robotic hand. Within this architecture, we design a cross-modal generative adversarial network (CGAN) to achieve cross-modal feature learning for the two types of sensory data, mimicking the psychological cognition of human senses. Additionally, a dynamic attention fusion mechanism is employed for feature transfer and fusion learning, enabling the network to adaptively adjust its weights based on the input cross-modal features. The architecture was trained and tested on ten categories of objects and achieved both cross-modal feature learning and fusion recognition of the two types of sensory data. Under low-quality image conditions, the recognition accuracy of attention fusion reaches up to 94.0%, significantly surpassing the accuracy of vision alone. This highlights the potential of the architecture to help robots perceive the outside world accurately by integrating visual and somatosensory data, especially in challenging visual environments.
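The abstract does not spell out how the dynamic attention fusion is computed, so the following is only a minimal sketch of the general idea: two modality feature vectors are combined with weights that depend on the inputs themselves. The function names, the scoring vectors `w_visual` and `w_somato`, and the toy feature values are all assumptions for illustration, not the authors' network; in a trained model the scoring parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(visual, somato, w_visual, w_somato):
    """Fuse two equal-length feature vectors with input-dependent weights.

    Each modality gets a scalar score (dot product with a hypothetical
    scoring vector); a softmax over the two scores yields the fusion
    weights, so the weights change whenever the input features change.
    """
    if not (len(visual) == len(somato) == len(w_visual) == len(w_somato)):
        raise ValueError("all vectors must have the same length")
    score_v = sum(w * x for w, x in zip(w_visual, visual))
    score_s = sum(w * x for w, x in zip(w_somato, somato))
    alpha_v, alpha_s = softmax([score_v, score_s])
    fused = [alpha_v * v + alpha_s * s for v, s in zip(visual, somato)]
    return fused, (alpha_v, alpha_s)


# Toy inputs (made up): a clear image gives strong visual features,
# a degraded image gives weak ones; the touch features stay the same.
clear_image = [0.9, 0.8, 0.7]
degraded_image = [0.1, 0.1, 0.1]
touch = [0.5, 0.6, 0.4]
w = [1.0, 1.0, 1.0]  # hypothetical scoring vector, learned in practice

fused_clear, weights_clear = attention_fuse(clear_image, touch, w, w)
fused_degraded, weights_degraded = attention_fuse(degraded_image, touch, w, w)
```

With these toy values the visual weight drops when the image features weaken, which illustrates why an input-dependent fusion could lean on somatosensory data in poor visual conditions. Whether the paper's mechanism behaves this way in detail is not stated in the abstract.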


Source journal

IEEE Transactions on Instrumentation and Measurement (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.00

Self-citation rate: 23.20%

Articles published: 1294

Review time: 3.9 months

Journal description: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support for the establishment and maintenance of technical standards in the field of Instrumentation and Measurement.