Qing-Huang Song, Boyuan Wang, Yuandong Ma, Mengjie Hu, Chun Liu
Title: DL-YOLOX: Real-time object detection via adjustable dilated enhancement for autonomous driving scene
DOI: 10.1177/01423312241239020 (https://doi.org/10.1177/01423312241239020)
Published: 2024-04-22
Citations: 0
Abstract
In autonomous driving, object detection poses several challenges, particularly the accurate identification of small and salient objects. This paper introduces DL-YOLOX (Dilated Enhancement YOLOX), which uses dilated convolution flexibly to enhance features and thereby improve the detection of small and salient objects. A large receptive field covers a wider area and carries more contextual information, which favors the detection of large targets, whereas a small receptive field captures local details and is better suited to small targets. To strengthen the representation of objects across scales, we propose Dilated Adaptive Feature Fusion (DAFF), which adaptively fuses features with different receptive fields and thereby improves detection accuracy for objects of varying sizes. In addition, we address the loss of small-object information during feature propagation by introducing the Stack Dilated Module (SDM), which mitigates this loss and contributes to better detection performance. To further improve small object detection, we replace the conventional Intersection over Union (IoU) metric with the Normalized Gaussian Wasserstein Distance (NWD), a distance metric that measures small objects more effectively and thus raises the precision of our algorithm. To evaluate the robustness and generalization of the proposed method, we conduct extensive experiments on two benchmark datasets, MS COCO 2017 and BDD100K. The results confirm significant improvements in multi-scale object detection and demonstrate the real-time capability of our approach, indicating the potential of DL-YOLOX for object detection in autonomous driving.
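The abstract gives no formulas, so the following is only a minimal sketch of two ideas it relies on: how dilation widens a convolution's receptive field, and why a Wasserstein-based similarity can behave better than IoU on small boxes. The NWD form used here (boxes modeled as 2D Gaussians, similarity `exp(-W2 / C)`) follows the definition commonly given in the tiny-object-detection literature and is an assumption, not necessarily the exact variant used in DL-YOLOX. The function names, the `(cx, cy, w, h)` box layout, and the constant `c=12.8` are likewise illustrative choices.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel.

    A k x k kernel with dilation d spans k + (k - 1) * (d - 1) pixels per side,
    so larger dilation widens the receptive field without adding weights.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def iou(box_a, box_b) -> float:
    """Intersection over Union for boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c: float = 12.8) -> float:
    """Normalized Wasserstein distance similarity for (cx, cy, w, h) boxes.

    Assumption: each box is a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). The 2-Wasserstein distance between two such
    Gaussians is the Euclidean distance between (cx, cy, w/2, h/2) vectors.
    `c` is a dataset-dependent normalizer; 12.8 is an illustrative value.
    """
    w2 = math.sqrt(
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


if __name__ == "__main__":
    # A 3x3 kernel covers 3, 5, and 7 pixels per side at dilation 1, 2, and 3.
    print([effective_kernel_size(3, d) for d in (1, 2, 3)])

    # Two 4x4 boxes that touch but do not overlap: IoU gives no signal,
    # while NWD still reflects how close they are.
    a, b = (10.0, 10.0, 4.0, 4.0), (14.0, 10.0, 4.0, 4.0)
    print(iou(a, b), round(nwd(a, b), 4))
```

For small boxes, a shift of a few pixels can drive IoU to zero, whereas the NWD value decays smoothly with distance; this is the usual motivation for using it in place of IoU on small targets.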