A sound event detection support system for smart home based on “two-to-one” teacher–student learning

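Impact Factor: 7.2 | CAS Zone 1 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Applied Soft Computing | Pub Date: 2024-09-10 | DOI: 10.1016/j.asoc.2024.112224 | URL: https://www.sciencedirect.com/science/article/pii/S1568494624009980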
Rongyan Wang, Yan Leng, Jian Zhuang, Chengli Sun
Citations: 0

Abstract

Sound event detection (SED) is a core technology in smart home projects, which rely on detected sound events to trigger specific actions. SED systems face two major challenges: high labeling costs and complex acoustic environments. To reduce labeling costs, some semi-supervised systems extract both global and local features for classification. However, these methods weight global and local features equally and ignore that their relative importance varies across types of sound events. To cope with complex acoustic environments, some studies use multitask learning frameworks that add SED-related auxiliary tasks to improve detection performance. However, these methods do not align the tasks within the framework, which leads to conflicting outputs that may limit system performance. To address these issues, we propose a semi-supervised SED system based on “two-to-one” teacher–student learning. The system employs a gating mechanism to selectively enhance global and local features, improving adaptability to different types of sound events, and incorporates a cross-task alignment module that lets SED interact with related tasks, reducing the risk of performance degradation caused by conflicting outputs. Experimental results on two datasets show that our system achieves the best performance on all metrics, with EB-F1 scores of 48.1% and 64.7%, improvements of 15.3% and 10.6%, respectively, over the baseline ConformerSED system. Our work offers an effective SED solution for smart home projects: a semi-supervised system that performs well while reducing labeling costs.
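
The abstract does not specify how the gating mechanism is built. As an illustration only, the sketch below shows one common form of gated fusion, in which a sigmoid gate computed from both feature vectors decides, per dimension, how much of the global versus the local feature to keep. The function names, the parameter shapes, and the convex-combination form are assumptions for this example and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(global_feat, local_feat, weights, bias):
    """Fuse two feature vectors with a per-dimension gate (hypothetical design).

    global_feat, local_feat: lists of floats, each of length dim.
    weights: dim rows, each of length 2 * dim (assumed learned parameters).
    bias: list of floats of length dim (assumed learned parameters).
    Returns (fused, gates), both lists of length dim.
    """
    concat = global_feat + local_feat  # list concatenation, length 2 * dim
    fused, gates = [], []
    for row, b, g_val, l_val in zip(weights, bias, global_feat, local_feat):
        # Gate in (0, 1): near 1 favors the global feature, near 0 the local one.
        gate = sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        gates.append(gate)
        fused.append(gate * g_val + (1.0 - gate) * l_val)
    return fused, gates


# Toy input: with all-zero parameters every gate is 0.5,
# so the fused vector is the plain average of the two inputs.
global_feat = [1.0, 2.0, 3.0]
local_feat = [3.0, 0.0, -1.0]
dim = len(global_feat)
zero_weights = [[0.0] * (2 * dim) for _ in range(dim)]
zero_bias = [0.0] * dim
fused, gates = gated_fusion(global_feat, local_feat, zero_weights, zero_bias)
```

In a trained system the gate parameters would be learned, so that the gate could favor global context for some event types and local detail for others, which is the kind of adaptability the abstract attributes to the gating mechanism.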

Source journal
Applied Soft Computing
Category: Engineering & Technology, Computer Science: Interdisciplinary Applications
CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review time: 10.9 months
About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The web site is therefore updated continuously with new articles, and the publication time is short.