Multidimensional knowledge distillation for multimodal scene classification of remote sensing images
Authors: Xiaomin Fan, Wujie Zhou
Journal: Digital Signal Processing, vol. 157, Article 104876 (Impact Factor 2.9; JCR Q2, Engineering, Electrical & Electronic)
Published: 2024-11-15
DOI: 10.1016/j.dsp.2024.104876
URL: https://www.sciencedirect.com/science/article/pii/S1051200424005001
Deep learning has significantly improved the performance of remote sensing image (RSI) scene classification. However, most RSI scene classification models depend heavily on complex structures, resulting in high computational requirements and substantial costs. This study addresses the issue with knowledge distillation (KD), a state-of-the-art model compression technique that transfers knowledge from a high-performing teacher model to a lightweight student model. Existing KD methods guide the student network to learn features from only a specific stage or scale of the teacher network, so they lack comprehensiveness. To enhance feature representation in complex scenarios, this study proposes a multidimensional KD approach (MKD), in which the student network (MKD-S) learns the feature representation capability of the teacher network (MKD-T) at each stage through a hybrid KD method. Specifically, the encoder incorporates a local-global KD mechanism that captures both low-level local information and high-level global information based on feature differences. The fusion stage introduces inter-layer relationship KD and intra-layer feature KD to account for the dependencies between intermediate features within MKD-S and MKD-T. In addition, the discrete wavelet transform, known for its ability to capture frequency-domain and time-domain features, is applied in the decoding stage of MKD-T; integrating decoding features across layers in this way completes the knowledge response in MKD-S. Experimental results on two benchmark datasets, Vaihingen and Potsdam, demonstrate the effectiveness of MKD.
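As a rough illustration of the two generic ingredients the abstract refers to, the sketch below shows a response-level KD term (temperature-softened KL divergence between teacher and student outputs) and a feature-level KD term (mean squared error between intermediate features). These are the common textbook forms and are only an assumption here: the abstract does not give the actual MKD loss definitions, and the function names `response_kd_loss` and `feature_kd_loss` and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Generic response-based KD term; not taken from the MKD paper.
    """
    eps = 1e-12  # clamp to avoid log(0) if a probability underflows
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(max(pt, eps)) - math.log(max(ps, eps)))
        for pt, ps in zip(p_t, p_s)
    )
    return kl * temperature ** 2


def feature_kd_loss(student_feat, teacher_feat):
    """Mean squared error between two same-length feature vectors.

    Generic feature-based KD term; assumes the features are already
    projected to the same dimensionality.
    """
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.5, 0.7, -0.8]
    print("response KD:", response_kd_loss(student, teacher))
    print("feature KD:", feature_kd_loss([0.1, 0.4], [0.2, 0.1]))
```

In a multi-stage scheme such as the one described, terms of this kind would typically be computed at several stages and summed with weights; the weighting and the stage-specific forms used by MKD are not stated in the abstract.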
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Its objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy