Meftahul Ferdaus, Mahdi Abdelguerfi, Kendall N. Niles, Ken Pathak, Joe Tom
Title: Widened Attention‐Enhanced Atrous Convolutional Network for Efficient Embedded Vision Applications under Resource Constraints
Journal: Advanced Intelligent Systems
Volume/issue (as listed): 8 3
Publication date: 2024-07-03
Publication type: Journal Article
DOI: 10.1002/aisy.202300480
URL: https://doi.org/10.1002/aisy.202300480
Citations: 0
Abstract
Onboard image analysis enables real‐time autonomous capabilities for unmanned platforms including aerial, ground, and aquatic drones. Performing classification on embedded systems, rather than transmitting data, allows rapid perception and decision‐making critical for time‐sensitive applications such as search and rescue, hazardous environment exploration, and military operations. To fully capitalize on these systems’ potential, specialized deep learning solutions are needed that balance accuracy and computational efficiency for time‐sensitive inference. This article introduces the widened attention‐enhanced atrous convolution‐based efficient network (WACEfNet), a new convolutional neural network designed specifically for real‐time visual classification on resource‐constrained embedded devices. WACEfNet builds on EfficientNet and integrates width‐wise feature processing, atrous convolutions, and attention modules to improve representational power without excessive overhead. Extensive benchmarking confirms that WACEfNet achieves state‐of‐the‐art performance in aerial imaging applications while remaining suitable for embedded deployment. The improvements in accuracy and speed demonstrate the potential of customized deep learning advancements to unlock new capabilities for unmanned aerial vehicles and related embedded systems with tight size, weight, and power constraints. This research offers an optimized framework, combining widened residual learning and attention mechanisms, to meet the unique demands of high‐fidelity real‐time analytics across a variety of embedded perception paradigms.
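
The abstract names atrous convolutions as one of WACEfNet's building blocks but does not describe how they are configured. As a general illustration only, and not the authors' implementation, the sketch below shows what an atrous (dilated) convolution does in one dimension: the kernel taps are spaced `dilation` samples apart, so the receptive field grows without adding weights. The function names `effective_kernel_size` and `atrous_conv1d`, the example signal, and the kernel are all hypothetical and chosen for this illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation=1):
    """1-D atrous cross-correlation with stride 1 and no padding.

    Illustrative only: real networks use 2-D kernels, many channels,
    and learned weights.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []
    return [
        # Tap j reads the input at offset j * dilation from position i.
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(out_len)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]   # hypothetical input
kernel = [1, 0, -1]              # hypothetical 3-tap difference filter

dense = atrous_conv1d(signal, kernel, dilation=1)   # taps cover 3 samples
atrous = atrous_conv1d(signal, kernel, dilation=2)  # same 3 taps cover 5 samples
```

With the same three weights, the dilated kernel compares samples four positions apart instead of two. That larger context at unchanged parameter count is the usual reason to consider atrous convolutions on size-, weight-, and power-constrained hardware.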