TICMapNet: A Tightly Coupled Temporal Fusion Pipeline for Vectorized HD Map Learning
Authors: Wenzhao Qiu; Shanmin Pang; Hao Zhang; Jianwu Fang; Jianru Xue
Journal: IEEE Robotics and Automation Letters, vol. 9, no. 12, pp. 11289-11296 (Impact Factor 4.6; JCR Q2, Robotics)
Publication date: 2024-11-01
DOI: 10.1109/LRA.2024.3490384
URL: https://ieeexplore.ieee.org/document/10740793/
High-Definition (HD) map construction is essential for autonomous driving to accurately understand the surrounding environment. Most existing methods rely on single-frame inputs to predict local maps and therefore often fail to capture the temporal correlations between frames. This limitation results in discontinuities and instability in the generated map. To tackle this limitation, we propose a Tightly Coupled temporal fusion Map Network (TICMapNet). TICMapNet breaks down the fusion process into three sub-problems: PV feature alignment, BEV feature adjustment, and Query feature fusion. By doing so, we effectively integrate temporal information at different stages through three plug-and-play modules, using the proposed tightly coupled strategy. Unlike traditional methods, our approach does not rely on camera extrinsic parameters, offering a new perspective for addressing the visual fusion challenge in the field of object detection. Experimental results show that TICMapNet significantly improves upon its single-frame baseline model, achieving at least a 7.0% increase in mAP using just two consecutive frames on the nuScenes dataset, while also showing generalizability across other tasks.
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.