Deep Space Separable Distillation for Lightweight Acoustic Scene Classification
ShuQi Ye, Yuan Tian
arXiv - CS - Sound, 2024-05-06
https://doi.org/arxiv-2405.03567
Acoustic scene classification (ASC) is highly important in the real world.
Recently, deep learning-based methods have been widely employed for acoustic
scene classification. However, these methods are still not lightweight
enough, and their performance is not satisfactory. To solve these
problems, we propose a deep space separable distillation network. Firstly, the
network performs high-low frequency decomposition on the log-mel spectrogram,
significantly reducing computational complexity while maintaining model
performance. Secondly, we design three lightweight operators specifically for ASC,
including Separable Convolution (SC), Orthonormal Separable Convolution (OSC),
and Separable Partial Convolution (SPC). These operators exhibit highly
efficient feature extraction capabilities in acoustic scene classification
tasks. The experimental results demonstrate that the proposed method achieves a
performance gain of 9.8% over currently popular deep learning
methods, while also having a smaller parameter count and lower computational
complexity.
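
The two ideas named in the abstract, splitting the log-mel spectrogram into frequency bands and replacing standard convolutions with separable ones, can be illustrated with a minimal sketch. The abstract does not describe how the decomposition or the SC, OSC, and SPC operators are actually built, so everything below is an assumption: the split is taken to be a plain cut along the mel axis, the separable convolution is taken to be the common depthwise-plus-pointwise form, and the function names and channel sizes are hypothetical.

```python
# Illustrative sketch only: the abstract does not specify these details.
# All names and sizes here are hypothetical, not taken from the paper.


def split_mel_bands(spec, split_bin):
    """Split an (n_mels x n_frames) spectrogram into low and high bands.

    Assumption: the decomposition is a plain cut along the mel axis.
    """
    if not 0 < split_bin < len(spec):
        raise ValueError("split_bin must fall strictly inside the mel axis")
    return spec[:split_bin], spec[split_bin:]


def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise k x k plus pointwise 1 x 1 convolution.

    Assumption: "separable" means the usual depthwise-separable form.
    """
    return c_in * k * k + c_in * c_out


if __name__ == "__main__":
    # Toy 8-mel x 4-frame "spectrogram" where each row holds its mel index.
    spec = [[float(m)] * 4 for m in range(8)]
    low, high = split_mel_bands(spec, split_bin=4)
    print("low band rows:", len(low), "high band rows:", len(high))

    # Compare weight counts for a hypothetical 64 -> 128 channel, 3 x 3 layer.
    std = standard_conv_params(64, 128, 3)
    sep = separable_conv_params(64, 128, 3)
    print("standard:", std, "separable:", sep)
```

Under these assumptions the separable layer needs roughly an eighth of the weights of the standard layer for this layer shape, which is the general kind of saving the abstract refers to; the paper's own operators may differ in structure and in the savings they achieve.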