ORCA-PARTY: An Automatic Killer Whale Sound Type Separation Toolkit Using Deep Learning

Christian Bergler, M. Schmitt, A. Maier, R. Cheng, Volker Barth, E. Nöth
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), published 2022-05-23
DOI: 10.1109/icassp43922.2022.9746623 (https://doi.org/10.1109/icassp43922.2022.9746623)

Abstract

Data-driven and machine-based analysis of massive bioacoustic data collections, in particular acoustic regions containing a substantial number of vocalization events, is essential and extremely valuable for identifying recurring vocal paradigms. However, these acoustic sections are usually characterized by a high incidence of overlapping vocalization events, a major problem that severely affects subsequent human- and machine-based analysis and interpretation. Robust machine-driven signal separation of species-specific call types is extremely challenging due to missing ground truth data and speaker/source-relevant information, limited knowledge about inter- and intra-call type variations, and diverse recording conditions. The current study is the first to introduce a fully automated deep signal separation approach for overlapping orca vocalizations, addressing all of the previously mentioned challenges, together with one of the largest bioacoustic data archives recorded on killer whales (Orcinus orca). Incorporating ORCA-PARTY as an additional data enhancement step for downstream call type classification proved to be extremely valuable. Besides the proof of cross-domain applicability and consistently promising results on non-overlapping signals, significant improvements were achieved when processing acoustic orca segments comprising a multitude of vocal activities. Apart from promising visual inspections, a final numerical evaluation on an unseen dataset showed that about 30% more known sound patterns could be identified.