A visual SLAM-based lightweight multi-modal semantic framework for an intelligent substation robot

Robotica | IF 1.9 | Zone 4, Computer Science | JCR Q3, Robotics | Pub Date: 2024-04-19 | DOI: 10.1017/s0263574724000511

Shaohu Li, Jason Gu, Zhijun Li, Shaofeng Li, Bixiang Guo, Shangbing Gao, Feng Zhao, Yuwei Yang, Guoxin Li, Lanfang Dong


Citations: 0

Abstract

Visual simultaneous localisation and mapping (vSLAM) has shown considerable promise in positioning and navigating across a variety of indoor and outdoor settings, significantly enhancing the mobility of robots employed in industrial and everyday services. Nonetheless, the prevalent reliance of vSLAM technology on the assumption of static environments has led to suboptimal performance in practical implementations, particularly in unstructured and dynamically noisy environments such as substations. Despite advancements in mitigating the influence of dynamic objects through the integration of geometric and semantic information, existing approaches have struggled to strike an equilibrium between performance and real-time responsiveness. This study introduces a lightweight, multi-modal semantic framework predicated on vSLAM, designed to enable intelligent robots to adeptly navigate the dynamic environments characteristic of substations. The framework notably enhances vSLAM performance by mitigating the impact of dynamic objects through a synergistic combination of object detection and instance segmentation techniques. Initially, an enhanced lightweight instance segmentation network is deployed to ensure both the real-time responsiveness and accuracy of the algorithm. Subsequently, the algorithm’s performance is further refined by amalgamating the outcomes of detection and segmentation processes. With a commitment to maximising performance, the framework also ensures the algorithm’s real-time capability. Assessments conducted on public datasets and through empirical experiments have demonstrated that the proposed method markedly improves both the accuracy and real-time performance of vSLAM in dynamic environments.
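The abstract does not say how detection and segmentation results are combined, so the following is only a minimal sketch of one plausible fusion rule, not the authors' method. It assumes a binary instance mask of dynamic pixels, a list of detection boxes for dynamic objects, and 2D keypoints from the vSLAM front end. Where a box contains segmented pixels, the mask is trusted; where segmentation missed the object, the whole box is treated as dynamic. The function names `fuse_dynamic_mask` and `filter_keypoints` are hypothetical.

```python
import numpy as np

def fuse_dynamic_mask(seg_mask, boxes):
    """Combine an instance mask with detection boxes (assumed fusion rule).

    seg_mask: (H, W) boolean array, True where segmentation marks a dynamic object.
    boxes: iterable of (x0, y0, x1, y1) integer boxes, x1/y1 exclusive.
    """
    seg_mask = np.asarray(seg_mask, dtype=bool)
    h, w = seg_mask.shape
    fused = seg_mask.copy()
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image bounds.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(w, x1), min(h, y1)
        # If segmentation found nothing inside the box, fall back to the box.
        if not seg_mask[y0:y1, x0:x1].any():
            fused[y0:y1, x0:x1] = True
    return fused

def filter_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints (N, 2 array of x, y) that lie on static pixels."""
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    h, w = dynamic_mask.shape
    xs = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, h - 1)
    return keypoints[~dynamic_mask[ys, xs]]

if __name__ == "__main__":
    mask = np.zeros((6, 8), dtype=bool)
    mask[1:3, 1:3] = True                     # segmented dynamic object
    boxes = [(0, 0, 4, 4), (5, 3, 8, 6)]      # second box has no segmentation
    fused = fuse_dynamic_mask(mask, boxes)
    pts = np.array([[1, 1], [0, 0], [6, 4], [4, 5]])
    print(filter_keypoints(pts, fused))
```

Only the surviving keypoints would then be passed to pose estimation, which is the general idea behind rejecting dynamic objects in a static-world SLAM pipeline.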


Source journal: Robotica (Engineering & Technology: Robotics)

CiteScore: 4.50

Self-citation rate: 22.20%

Articles published: 181

Review time: 9.9 months

About the journal: Robotica is a forum for the multidisciplinary subject of robotics and encourages developments, applications and research in this important field of automation and robotics with regard to industry, health, education and economic and social aspects of relevance. Coverage includes activities in hostile environments, applications in the service and manufacturing industries, biological robotics, dynamics and kinematics involved in robot design and uses, on-line robots, robot task planning, rehabilitation robotics, sensory perception, software in the widest sense, particularly in respect of programming languages and links with CAD/CAM systems, telerobotics and various other areas. In addition, interest is focused on various Artificial Intelligence topics of theoretical and practical interest.