{"title":"AVCPNet: An AAV-Vehicle Collaborative Perception Network for 3-D Object Detection","authors":"Yuchao Wang;Zhirui Wang;Peirui Cheng;Pengju Tian;Ziyang Yuan;Jing Tian;Wensheng Wang;Liangjin Zhao","doi":"10.1109/TGRS.2025.3546669","DOIUrl":null,"url":null,"abstract":"With the advancement of collaborative perception, the role of autonomous aerial vehicle (AAV)–vehicle collaborative perception has become increasingly significant. The demand for collaborative perception from various perspectives to construct comprehensive perceptual information is rising. However, challenges emerge due to differences in the field of view (FOV) between cross-domain agents and their varying sensitivities to image information. Furthermore, accurate depth information is essential for collaboration to transform image features into bird’s eye view (BEV) features. To address these challenges, we propose a framework specifically designed for aerial-ground collaboration. First, to address the deficiency of datasets for aerial-ground collaboration, we have developed a virtual dataset named V2U-COO for our research. Second, we design a cross-domain cross-adaptation (CDCA) module to align the target information obtained from different domains, thereby achieving more accurate perception results. Finally, we introduce a collaborative depth optimization (CDO) module to obtain more precise depth estimation results, leading to more accurate perception results. We conduct extensive experiments on both our virtual dataset and a public dataset to validate the effectiveness of our framework. Our method resolves the feature fusion issue under significant height differences, a challenge that previous BEV generation methods struggled to address effectively. Our experiments on the V2U-COO and DAIR-V2X datasets demonstrate improvements in detection accuracy of 6.1% and 2.7%, respectively. Our code will be released at <uri>https://github.com/wyccoo/uvcp</uri>.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"63 ","pages":"1-16"},"PeriodicalIF":8.6000,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10909254/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Abstract
With the advancement of collaborative perception, autonomous aerial vehicle (AAV)–vehicle collaborative perception has become increasingly significant, and the demand for perception from multiple viewpoints to construct comprehensive perceptual information is rising. However, challenges arise from differences in the field of view (FOV) between cross-domain agents and from their varying sensitivities to image information. Furthermore, accurate depth information is essential for transforming image features into bird's eye view (BEV) features during collaboration. To address these challenges, we propose a framework specifically designed for aerial-ground collaboration. First, to address the lack of datasets for aerial-ground collaboration, we develop a virtual dataset named V2U-COO for our research. Second, we design a cross-domain cross-adaptation (CDCA) module that aligns target information obtained from different domains, yielding more accurate perception results. Finally, we introduce a collaborative depth optimization (CDO) module that produces more precise depth estimates and, in turn, more accurate detections. We conduct extensive experiments on both our virtual dataset and a public dataset to validate the effectiveness of our framework. Our method resolves the feature-fusion problem under significant height differences, a challenge that previous BEV generation methods struggled to address effectively. Experiments on the V2U-COO and DAIR-V2X datasets demonstrate improvements in detection accuracy of 6.1% and 2.7%, respectively. Our code will be released at https://github.com/wyccoo/uvcp.
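To illustrate why per-pixel depth quality matters when projecting image features into a shared BEV grid (the step the CDO module is described as refining), the sketch below shows a generic depth-weighted image-to-BEV lifting in the style of Lift-Splat-Shoot. The module name, tensor shapes, and heads are illustrative assumptions, not the paper's actual CDO or CDCA implementation.

```python
# Hypothetical sketch: depth-weighted lifting of image features toward BEV.
# Sharper (more accurate) per-pixel depth distributions concentrate each
# pixel's feature at the correct distance, which is what makes the later
# BEV splatting and cross-agent fusion reliable. Not the authors' code.
import torch
import torch.nn as nn


class DepthLiftToBEV(nn.Module):
    def __init__(self, in_channels: int, bev_channels: int, num_depth_bins: int):
        super().__init__()
        # One head predicts a categorical depth distribution per pixel,
        # another produces the context features that get "lifted".
        self.depth_head = nn.Conv2d(in_channels, num_depth_bins, kernel_size=1)
        self.feat_head = nn.Conv2d(in_channels, bev_channels, kernel_size=1)

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        """image_feats: (B, C, H, W) -> frustum features (B, D, C_bev, H, W)."""
        depth_probs = self.depth_head(image_feats).softmax(dim=1)  # (B, D, H, W)
        context = self.feat_head(image_feats)                      # (B, C_bev, H, W)
        # Outer product: each pixel's feature is spread across depth bins,
        # weighted by how likely that depth is.
        frustum = depth_probs.unsqueeze(2) * context.unsqueeze(1)  # (B, D, C_bev, H, W)
        return frustum  # would then be splatted into a BEV grid via camera geometry


if __name__ == "__main__":
    # Usage sketch: a 256-channel feature map lifted over 48 hypothetical depth bins.
    lift = DepthLiftToBEV(in_channels=256, bev_channels=64, num_depth_bins=48)
    feats = torch.randn(2, 256, 32, 88)
    print(lift(feats).shape)  # torch.Size([2, 48, 64, 32, 88])
```

Under this assumption, a collaborative depth-refinement step (such as the paper's CDO) would sharpen `depth_probs` using cues exchanged between the aerial and ground agents before the splatting step.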
Journal Introduction:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.