Dual-Stream Multiscale Attention Monocular Depth Estimation Network
Authors: Ying Zou; Zhe Chen; Fuliang Yin
Journal: IEEE Internet of Things Journal, vol. 12, no. 13, pp. 23073-23084 (JCR Q1, Computer Science, Information Systems; impact factor 8.9)
Published: 2025-03-12
DOI: 10.1109/JIOT.2025.3550580
URL: https://ieeexplore.ieee.org/document/10924248/
Citations: 0
Abstract
Deep-learning-based methods have shown superior performance in monocular depth estimation tasks. However, existing methods often overlook small-scale objects and vertical information, and they suffer from edge blurring and loss of low-texture information. To remedy these issues, a dual-stream multiscale attention network (DMA-Net) for monocular depth estimation is proposed, built on an encoder-decoder pattern. Specifically, inputs at two scales are each fed into a pretrained ResNeXt-101 to extract diverse image features. Then, a multiscale attention feature fusion model is constructed, in which self-attention dilated convolution blocks capture multiscale global features with long-distance dependencies, and feature fusion blocks promote information exchange between the two streams, further reinforcing the features. Next, a guiding decoder is designed to refine the restored depth map by integrating the outputs of each encoder layer, and it uses an efficient channel attention network to recalibrate the meaningful information. Finally, a vertical information extractor captures vertical features to improve the restoration of longitudinal depth details. Extensive experiments on the KITTI and NYU Depth V2 datasets show that the proposed DMA-Net achieves competitive results, outperforming previous methods on the majority of the metrics.
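The abstract names efficient channel attention as the decoder's recalibration step but gives no implementation details. The sketch below illustrates the general idea as commonly described for ECA-style attention (global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, then channel-wise rescaling). It is not the authors' code: the function names `eca_kernel_size` and `eca_recalibrate` are hypothetical, the kernel weights are fixed placeholders rather than learned parameters, and plain Python lists stand in for tensors so the example stays self-contained.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Pick an odd 1D kernel size that grows with the channel count.

    Assumption: this follows the adaptive heuristic commonly used with
    ECA-style attention; the paper's actual setting is not given in the abstract.
    """
    k = int(abs((math.log2(channels) + b) / gamma))
    return k if k % 2 == 1 else k + 1


def eca_recalibrate(features, weights):
    """Rescale each channel by a gate computed from its neighbours.

    features: list of C channels, each an H x W list of lists of floats.
    weights:  odd-length 1D kernel shared by all channels
              (a learned parameter in practice, a placeholder here).
    Returns (rescaled_features, gates).
    """
    # 1) Squeeze: global average pooling gives one scalar per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]

    # 2) Local cross-channel interaction: zero-padded 1D convolution.
    pad = len(weights) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(len(pooled))
    ]

    # 3) Sigmoid turns each logit into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]

    # 4) Excite: multiply every value in a channel by that channel's gate.
    rescaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(features, gates)
    ]
    return rescaled, gates


if __name__ == "__main__":
    # Toy feature map: 4 channels of size 2 x 2.
    toy = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
    k = eca_kernel_size(len(toy))
    placeholder_weights = [1.0 / k] * k  # uniform kernel, illustration only
    _, gates = eca_recalibrate(toy, placeholder_weights)
    print("kernel size:", k)
    print("gates:", [round(g, 3) for g in gates])
```

Because the gate for each channel depends only on a few neighbouring channels, this form of attention adds very few parameters, which is presumably why it suits a decoder that already merges features from every encoder layer.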
About the journal:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interface (API), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standard development organizations (SDO) such as IEEE, IETF, ITU, 3GPP, and ETSI.