Low-cost Multispectral Scene Analysis with Modality Distillation

2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2022-01-01 | DOI: 10.1109/WACV51458.2022.00339

Heng Zhang, É. Fromont, S. Lefèvre, Bruno Avignon


Citations: 1


Low-cost Multispectral Scene Analysis with Modality Distillation

Despite its robust performance under various illumination conditions, multispectral scene analysis has not been widely deployed due to two strong practical limitations: 1) thermal cameras, especially high-resolution ones, are much more expensive than conventional visible cameras; 2) the most commonly adopted multispectral architectures, two-stream neural networks, nearly double the inference time of a regular mono-spectral model, which makes them impractical in embedded environments. In this work, we aim to tackle these two limitations by proposing a novel knowledge distillation framework named Modality Distillation (MD). The proposed framework distils the knowledge from a high thermal resolution two-stream network with feature-level fusion to a low thermal resolution one-stream network with image-level fusion. We show on different multispectral scene analysis benchmarks that our method effectively enables the use of low-resolution thermal sensors with more compact one-stream networks.
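The two ingredients named in the abstract, image-level fusion for the one-stream student and distillation from a teacher's features, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss, the resolutions, or the networks, so the average-pooling sensor model, the nearest-neighbour upsampling, the mean-squared-error loss, and every function name below are assumptions made purely for illustration.

```python
import numpy as np


def downsample_thermal(thermal, factor):
    """Simulate a low-resolution thermal sensor by average pooling (assumption)."""
    h, w = thermal.shape
    assert h % factor == 0 and w % factor == 0
    return thermal.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def image_level_fusion(rgb, thermal_lr):
    """Upsample the low-res thermal map (nearest neighbour) and append it
    to the RGB image as a fourth channel, giving a single-stream input."""
    h, w, _ = rgb.shape
    fy, fx = h // thermal_lr.shape[0], w // thermal_lr.shape[1]
    thermal_up = np.repeat(np.repeat(thermal_lr, fy, axis=0), fx, axis=1)
    return np.concatenate([rgb, thermal_up[..., None]], axis=-1)


def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between student and teacher feature maps
    (a common distillation loss, assumed here; the teacher is treated as fixed)."""
    return float(np.mean((student_feat - teacher_feat) ** 2))


rng = np.random.default_rng(0)
rgb = rng.random((8, 8, 3))            # visible image
thermal_hr = rng.random((8, 8))        # high-resolution thermal image (teacher input)
thermal_lr = downsample_thermal(thermal_hr, factor=4)   # shape (2, 2)
fused = image_level_fusion(rgb, thermal_lr)             # shape (8, 8, 4)

# Stand-ins for feature maps produced by the two networks (hypothetical values).
teacher_feat = rng.random((4, 4, 16))
student_feat = teacher_feat + 0.1 * rng.standard_normal((4, 4, 16))
loss = feature_distillation_loss(student_feat, teacher_feat)
```

In a real training loop the student would be a single network consuming the four-channel `fused` input, and a loss of this kind would be added to the usual task loss so that the student's features approach those of the two-stream teacher.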

