Rongyan Wang, Yan Leng, Jian Zhuang, Chengli Sun
A sound event detection support system for smart home based on “two-to-one” teacher–student learning
Applied Soft Computing, published 2024-09-10. DOI: 10.1016/j.asoc.2024.112224. https://www.sciencedirect.com/science/article/pii/S1568494624009980
Citation count: 0
Abstract
Sound event detection (SED) is a core technology in smart home projects that rely on detected sound events to trigger specific actions. SED systems face two major challenges: high labeling costs and complex acoustic environments. To reduce labeling costs, some semi-supervised systems extract both global and local features for classification. However, these methods treat global and local features equally and do not account for their varying importance when recognizing different types of sound events. To address complex acoustic environments, some studies use multitask learning frameworks that introduce SED-related tasks as auxiliaries to improve detection performance. However, these methods fail to align the tasks within the framework, leading to conflicting outputs that may limit system performance. To address these issues, we propose a semi-supervised SED system based on “two-to-one” teacher–student learning. The system employs a gating mechanism to selectively enhance global and local features, improving adaptability to different types of sound events, and incorporates a cross-task alignment module that lets SED interact with related tasks, reducing the risk of performance degradation caused by conflicting outputs. Experimental results on two datasets demonstrate that our system achieves the best performance on all metrics, with EB-F1 scores of 48.1% and 64.7%, representing improvements of 15.3% and 10.6% over the baseline ConformerSED system, respectively. Our work offers an effective SED solution for smart home projects by providing a semi-supervised SED system that performs well while reducing labeling costs.
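The abstract does not specify how the gating mechanism is built, so the following is only an illustrative sketch of one common form of gated feature fusion, not the authors' implementation. It assumes a per-dimension sigmoid gate computed from the concatenated global and local feature vectors; the function names `sigmoid` and `gated_fusion`, the weight layout, and the convex-combination rule are all hypothetical choices made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(global_feat, local_feat, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (hypothetical, not taken from the paper):
        gate_i  = sigmoid(weights[i] . [global; local] + bias[i])
        fused_i = gate_i * global_i + (1 - gate_i) * local_i
    A gate near 1 favors the global feature, near 0 the local one.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("global and local features must have equal length")
    concat = list(global_feat) + list(local_feat)
    fused, gates = [], []
    for row, b, g, l in zip(weights, bias, global_feat, local_feat):
        z = sum(w * x for w, x in zip(row, concat)) + b
        gate = sigmoid(z)
        gates.append(gate)
        fused.append(gate * g + (1.0 - gate) * l)
    return fused, gates


# With all-zero weights and bias the gate is exactly 0.5,
# so the fused vector is the plain average of the two inputs.
fused, gates = gated_fusion(
    [1.0, 2.0], [3.0, 4.0],
    weights=[[0.0] * 4, [0.0] * 4],
    bias=[0.0, 0.0],
)
print(fused, gates)  # [2.0, 3.0] [0.5, 0.5]
```

In a trained model the weights and bias would be learned, which would let the gate weight global and local features differently per input, in contrast to the equal treatment the abstract criticizes.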
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest-quality research in the application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.