SALT: Standardized Audio event Label Taxonomy

arXiv - EE - Audio and Speech Processing | Pub Date: 2024-09-18 | DOI: arxiv-2409.11746

Paraskevas Stamatiadis (IDS, S2A, LTCI), Michel Olvera (IDS, S2A, LTCI), Slim Essid (IDS, S2A, LTCI)


Abstract

Machine listening systems often rely on fixed taxonomies to organize and label audio data, which is key for training and evaluating deep neural networks (DNNs) and other supervised algorithms. However, such taxonomies face significant constraints: they are composed of application-dependent predefined categories, which hinders the integration of new or varied sounds, and they exhibit limited cross-dataset compatibility due to inconsistent labeling standards. To overcome these limitations, we introduce SALT: Standardized Audio event Label Taxonomy. Building upon the hierarchical structure of AudioSet's ontology, our taxonomy extends and standardizes labels across 24 publicly available environmental sound datasets, allowing class labels from diverse datasets to be mapped to a unified system. Our proposal comes with a new Python package designed for navigating and using this taxonomy, easing cross-dataset label searching and hierarchical exploration. Notably, our package allows effortless data aggregation from diverse sources and hence easy experimentation with combined datasets.
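As a rough illustration of the idea described in the abstract (mapping dataset-specific labels onto one hierarchical taxonomy, then aggregating data across datasets), here is a minimal Python sketch. It is not the SALT package's API: the hierarchy, the dataset names, the label maps, and the functions `ancestors`, `to_unified`, and `aggregate` are all hypothetical and chosen only for this example.

```python
# Hypothetical toy hierarchy: child label -> parent label (None marks a root).
# This is NOT the real SALT taxonomy; it only mimics its tree-like shape.
PARENT = {
    "animal": None,
    "dog": "animal",
    "dog_bark": "dog",
    "vehicle": None,
    "car": "vehicle",
    "car_horn": "car",
}

# Hypothetical label maps: (dataset, original label) -> unified label.
LABEL_MAP = {
    ("dataset_a", "dog_bark"): "dog_bark",
    ("dataset_b", "Dog"): "dog",
    ("dataset_a", "car_horn"): "car_horn",
    ("dataset_b", "Honk"): "car_horn",
}


def ancestors(label):
    """Return the chain of parents from `label` up to its root."""
    chain = []
    parent = PARENT[label]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def to_unified(dataset, label):
    """Map a dataset-specific label to its unified taxonomy label."""
    return LABEL_MAP[(dataset, label)]


def aggregate(samples, target):
    """Keep samples whose unified label is `target` or one of its descendants."""
    kept = []
    for dataset, label, clip in samples:
        unified = to_unified(dataset, label)
        if unified == target or target in ancestors(unified):
            kept.append((dataset, clip, unified))
    return kept


# Made-up samples: (dataset, original label, clip file name).
samples = [
    ("dataset_a", "dog_bark", "a_001.wav"),
    ("dataset_b", "Dog", "b_017.wav"),
    ("dataset_b", "Honk", "b_042.wav"),
]

# Collect every clip under the "dog" node, regardless of source dataset.
dog_clips = aggregate(samples, "dog")
```

Querying at a higher node such as `"animal"` or `"vehicle"` would gather clips from every dataset whose labels fall anywhere under that node, which is the kind of cross-dataset aggregation the abstract refers to.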
