Joint Analysis of Acoustic Scenes and Sound Events in Multitask Learning Based on Cross_MMoE Model and Class-Balanced Loss

IF 4.3 | Zone 2 (multidisciplinary journals) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Sensors Journal | Pub Date: 2024-04-29 | DOI: 10.1109/JSEN.2024.3390231
Lin Zhang;Menglong Wu;Xichang Cai;Yundong Li;Wenkai Liu
IEEE Xplore: https://ieeexplore.ieee.org/document/10510225/
Citations: 0

Abstract

Acoustic scene classification (ASC) and sound event detection (SED) are two closely related research directions in acoustics. Previous work has analyzed acoustic scenes and events jointly using multitask learning (MTL). However, traditional MTL models are often sensitive to how the dataset is partitioned, and multitask analysis is not as effective as single-task analysis. In addition, the performance of traditional MTL models depends heavily on the weights of the loss function, and tuning those weights manually is costly. To address these issues, we propose improvements to both the model and the loss function formulation, using additional sound event information to improve ASC performance. First, we introduce the multigate mixture-of-experts (MMoE) model into the field of acoustics. Experiments on the TUT Sound Events 2016/2017 and TUT Acoustic Scenes 2016 datasets show that the MMoE model achieves a best $F1$-score of 98.74%, which is 1.43% higher than traditional MTL models. Second, we improve the MMoE model and propose the Cross_MMoE model, which increases the information interaction between task branches and raises the $F1$-score to 99.04%. Finally, to address class imbalance in the dataset, we evaluate the class-balanced loss formulation as a replacement for the traditional multitask loss function. With this loss, the performance of the traditional multitask model, the MMoE model, and the Cross_MMoE model all improve; in particular, the $F1$-score of the Cross_MMoE model increases to 99.31%.
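As a rough illustration of the two ingredients named in the abstract, the following Python sketch shows (1) a standard multigate mixture-of-experts forward pass, in which each task mixes the outputs of shared experts through its own softmax gate, and (2) class weights based on the effective number of samples, one common form of class-balanced loss. It is not the authors' implementation: the abstract does not describe the Cross_MMoE cross-branch connections or the exact loss, so the layer sizes, the random weights, the `beta` value, the example class counts, and all function names here are assumptions made for illustration.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def rand_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def mmoe_forward(x, experts, gates):
    """Standard MMoE mixing step (assumed form, not Cross_MMoE).

    experts: weight matrices shared by all tasks.
    gates:   one gate matrix per task, producing one logit per expert.
    Returns the per-task mixed features and the per-task gate weights.
    """
    expert_out = [relu(matvec(W, x)) for W in experts]
    hidden = len(expert_out[0])
    task_inputs, gate_weights = [], []
    for G in gates:
        g = softmax(matvec(G, x))  # task-specific weight for each expert
        mixed = [
            sum(g[i] * expert_out[i][d] for i in range(len(experts)))
            for d in range(hidden)
        ]
        task_inputs.append(mixed)  # would feed this task's own tower
        gate_weights.append(g)
    return task_inputs, gate_weights


def class_balanced_weights(counts, beta=0.999):
    """Effective-number weights: w_c proportional to (1 - beta) / (1 - beta ** n_c).

    Assumes every count is positive. Weights are rescaled to sum to the
    number of classes, so rare classes receive weights above 1.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Toy usage: 8-dim input, 3 shared experts, 2 tasks (task 0: ASC, task 1: SED).
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]
experts = [rand_matrix(4, 8, rng) for _ in range(3)]
gates = [rand_matrix(3, 8, rng) for _ in range(2)]
task_inputs, gate_weights = mmoe_forward(x, experts, gates)

# Hypothetical per-class sample counts for an imbalanced dataset.
weights = class_balanced_weights([5000, 500, 50])
```

In a full model, each entry of `task_inputs` would pass through a task-specific tower, and `weights` would multiply the per-class loss terms so that rare classes contribute more to the gradient.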
Source journal
IEEE Sensors Journal (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 7.70
Self-citation rate: 14.00%
Articles published: 2058
Review time: 5.2 months
Journal description: The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:
- Sensor Phenomenology, Modelling, and Evaluation
- Sensor Materials, Processing, and Fabrication
- Chemical and Gas Sensors
- Microfluidics and Biosensors
- Optical Sensors
- Physical Sensors: Temperature, Mechanical, Magnetic, and others
- Acoustic and Ultrasonic Sensors
- Sensor Packaging
- Sensor Networks
- Sensor Applications
- Sensor Systems: Signals, Processing, and Interfaces
- Actuators and Sensor Power Systems
- Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
- Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave, e.g., electromagnetic and acoustic, and non-wave, e.g., chemical, gravity, particle, thermal, radiative and non-radiative, sensor data; detection, estimation, and classification based on sensor data)
- Sensors in Industrial Practice