Qinghui Chen , Lunqian Wang , Zekai Zhang , Xinghua Wang , Weilin Liu , Bo Xia , Hao Ding , Jinglin Zhang , Sen Xu , Xin Wang
{"title":"用于图像遮挡和可变性超分辨率的双路径聚合变换器网络","authors":"Qinghui Chen , Lunqian Wang , Zekai Zhang , Xinghua Wang , Weilin Liu , Bo Xia , Hao Ding , Jinglin Zhang , Sen Xu , Xin Wang","doi":"10.1016/j.engappai.2024.109535","DOIUrl":null,"url":null,"abstract":"<div><div>While Transformer-based approaches have recently achieved notable success in super-resolution, their extensive computational requirements impede widespread practical adoption. High-resolution meteorological satellite cloud imagery is essential for weather analysis and forecasting. Enhancing image resolution through super-resolution techniques facilitates the accurate identification and localization of geographic features by meteorological systems. However, current super-resolution methods fail to restore the intricacies of cloud formations and complex regions fully. This research introduces a novel dual-path aggregation Transformer network (DPAT) tailored to enhance the super-resolution of meteorological satellite cloud images. The DPAT network adeptly captures cloud imagery's subtle details and textures, effectively addressing occlusions and the variability inherent in satellite imagery. It bolsters the model's ability to manage the complex attributes of cloud images through the introduction of the Dual-path Aggregation Self-Attention (DASA) mechanism and the Multi-scale Feature Aggregation Block (MFAB), thereby enhancing performance in processing intricate cloud features. The DASA mechanism synthesizes features across spatial, depth, and channel dimensions via a dual-path approach, thoroughly exploiting feature correlations. The MFAB, designed to supplant the multilayer perceptron, incorporates shift convolution and a multi-scale interaction block to augment feature information, compensating for the deficiency in local information absorption due to fixed receptive fields. Experimental outcomes indicate that DPAT delivers superior super-resolution outcomes. With a parameter count of only 32% of the Enhanced Deep Residual Network (EDSR) or 77% of the Image Restoration using Shift Window Transformer (SwinIR), DPAT matches SwinIR's performance on the satellite cloud dataset. Moreover, DPAT balances accuracy and parameter economy across various datasets. This technology is expected to improve image super-resolution capabilities in multiple fields such as human action recognition and industrial recognition, and indirectly improve the accuracy of image perception tasks.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109535"},"PeriodicalIF":7.5000,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Dual-path aggregation transformer network for super-resolution with images occlusions and variability\",\"authors\":\"Qinghui Chen , Lunqian Wang , Zekai Zhang , Xinghua Wang , Weilin Liu , Bo Xia , Hao Ding , Jinglin Zhang , Sen Xu , Xin Wang\",\"doi\":\"10.1016/j.engappai.2024.109535\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>While Transformer-based approaches have recently achieved notable success in super-resolution, their extensive computational requirements impede widespread practical adoption. High-resolution meteorological satellite cloud imagery is essential for weather analysis and forecasting. Enhancing image resolution through super-resolution techniques facilitates the accurate identification and localization of geographic features by meteorological systems. However, current super-resolution methods fail to restore the intricacies of cloud formations and complex regions fully. This research introduces a novel dual-path aggregation Transformer network (DPAT) tailored to enhance the super-resolution of meteorological satellite cloud images. The DPAT network adeptly captures cloud imagery's subtle details and textures, effectively addressing occlusions and the variability inherent in satellite imagery. It bolsters the model's ability to manage the complex attributes of cloud images through the introduction of the Dual-path Aggregation Self-Attention (DASA) mechanism and the Multi-scale Feature Aggregation Block (MFAB), thereby enhancing performance in processing intricate cloud features. The DASA mechanism synthesizes features across spatial, depth, and channel dimensions via a dual-path approach, thoroughly exploiting feature correlations. The MFAB, designed to supplant the multilayer perceptron, incorporates shift convolution and a multi-scale interaction block to augment feature information, compensating for the deficiency in local information absorption due to fixed receptive fields. Experimental outcomes indicate that DPAT delivers superior super-resolution outcomes. With a parameter count of only 32% of the Enhanced Deep Residual Network (EDSR) or 77% of the Image Restoration using Shift Window Transformer (SwinIR), DPAT matches SwinIR's performance on the satellite cloud dataset. Moreover, DPAT balances accuracy and parameter economy across various datasets. This technology is expected to improve image super-resolution capabilities in multiple fields such as human action recognition and industrial recognition, and indirectly improve the accuracy of image perception tasks.</div></div>\",\"PeriodicalId\":50523,\"journal\":{\"name\":\"Engineering Applications of Artificial Intelligence\",\"volume\":\"139 \",\"pages\":\"Article 109535\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-10-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering Applications of Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0952197624016932\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197624016932","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Dual-path aggregation transformer network for super-resolution with images occlusions and variability
While Transformer-based approaches have recently achieved notable success in super-resolution, their extensive computational requirements impede widespread practical adoption. High-resolution meteorological satellite cloud imagery is essential for weather analysis and forecasting. Enhancing image resolution through super-resolution techniques facilitates the accurate identification and localization of geographic features by meteorological systems. However, current super-resolution methods fail to restore the intricacies of cloud formations and complex regions fully. This research introduces a novel dual-path aggregation Transformer network (DPAT) tailored to enhance the super-resolution of meteorological satellite cloud images. The DPAT network adeptly captures cloud imagery's subtle details and textures, effectively addressing occlusions and the variability inherent in satellite imagery. It bolsters the model's ability to manage the complex attributes of cloud images through the introduction of the Dual-path Aggregation Self-Attention (DASA) mechanism and the Multi-scale Feature Aggregation Block (MFAB), thereby enhancing performance in processing intricate cloud features. The DASA mechanism synthesizes features across spatial, depth, and channel dimensions via a dual-path approach, thoroughly exploiting feature correlations. The MFAB, designed to supplant the multilayer perceptron, incorporates shift convolution and a multi-scale interaction block to augment feature information, compensating for the deficiency in local information absorption due to fixed receptive fields. Experimental outcomes indicate that DPAT delivers superior super-resolution outcomes. With a parameter count of only 32% of the Enhanced Deep Residual Network (EDSR) or 77% of the Image Restoration using Shift Window Transformer (SwinIR), DPAT matches SwinIR's performance on the satellite cloud dataset. Moreover, DPAT balances accuracy and parameter economy across various datasets. This technology is expected to improve image super-resolution capabilities in multiple fields such as human action recognition and industrial recognition, and indirectly improve the accuracy of image perception tasks.
期刊介绍:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.