{"title":"TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection","authors":"Lei Cheng;Siyang Cao","doi":"10.1109/TRS.2025.3537604","DOIUrl":null,"url":null,"abstract":"Despite significant advancements in environment perception capabilities for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar serves as a reliable and low-cost sensor that can effectively complement these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this article, we present TransRAD, a novel 3-D radar object detection model designed to address these challenges by leveraging the retentive vision transformer (RMT) to more effectively learn features from information-dense radar range-Azimuth–Doppler (RAD) data. Our approach leverages the retentive Manhattan self-attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, thereby enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3-D radar detection across RAD dimensions. Furthermore, we propose location-aware nonmaximum suppression (LA-NMS) to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. The experimental results demonstrate that TransRAD outperforms state-of-the-art (SOTA) methods in both 2-D and 3-D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at <uri>https://github.com/radar-lab/TransRAD</uri>.","PeriodicalId":100645,"journal":{"name":"IEEE Transactions on Radar Systems","volume":"3 ","pages":"303-317"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Radar Systems","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10869508/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Despite significant advancements in environment perception for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar is a reliable, low-cost sensor that can effectively compensate for these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this article, we present TransRAD, a novel 3-D radar object detection model designed to address these challenges by leveraging the Retentive Vision Transformer (RMT) to more effectively learn features from information-dense radar range-azimuth-Doppler (RAD) data. Our approach employs the retentive Manhattan self-attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, thereby enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3-D radar detection across the RAD dimensions. Furthermore, we propose location-aware non-maximum suppression (LA-NMS) to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. Experimental results demonstrate that TransRAD outperforms state-of-the-art (SOTA) methods in both 2-D and 3-D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at https://github.com/radar-lab/TransRAD.
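For context, the retentive Manhattan self-attention (MaSA) mechanism referenced above comes from the RMT architecture: attention between two spatial positions is damped by a decay factor raised to their Manhattan distance, which injects an explicit locality prior into the attention map. Below is a minimal, self-contained PyTorch sketch of that decay prior; it illustrates the standard MaSA formulation and is not the TransRAD implementation, and the names manhattan_decay_mask and masa_attention are hypothetical.

```python
# Minimal sketch of Manhattan self-attention (MaSA) with an explicit spatial
# decay prior, as introduced by RMT. Illustrative only -- not the TransRAD code.
import torch
import torch.nn.functional as F


def manhattan_decay_mask(h: int, w: int, gamma: float = 0.9) -> torch.Tensor:
    """D[n, m] = gamma ** (|x_n - x_m| + |y_n - y_m|) over a flattened h x w grid."""
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float()  # (h*w, 2)
    dist = torch.cdist(coords, coords, p=1)  # pairwise Manhattan distances
    return gamma ** dist                     # (h*w, h*w) decay prior


def masa_attention(q, k, v, h, w, gamma=0.9):
    """Single-head attention over flattened 2-D tokens, modulated by the decay prior.

    q, k, v: (batch, h*w, dim). Returns (batch, h*w, dim).
    """
    d = q.shape[-1]
    logits = q @ k.transpose(-2, -1) / d ** 0.5      # (batch, h*w, h*w)
    decay = manhattan_decay_mask(h, w, gamma).to(q)  # explicit spatial prior
    weights = F.softmax(logits, dim=-1) * decay      # nearby tokens keep more weight
    return weights @ v


if __name__ == "__main__":
    b, h, w, dim = 2, 8, 8, 32
    x = torch.randn(b, h * w, dim)
    print(masa_attention(x, x, x, h, w).shape)  # torch.Size([2, 64, 32])
```

The decay rate gamma controls how quickly attention falls off with spatial distance; RMT also provides a decomposed, axis-wise variant of this attention for efficiency, which the sketch above omits.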