Multipitch tracking in music signals using Echo State Networks

2020 28th European Signal Processing Conference (EUSIPCO)
Pub Date: 2021-01-24
DOI: 10.23919/Eusipco47968.2020.9287638

P. Steiner, Simon Stone, P. Birkholz, A. Jalalvand

Pages: 126-130

Citations: 13

Abstract

Currently, convolutional neural networks (CNNs) define the state of the art for multipitch tracking in music signals. Echo State Networks (ESNs), a recently introduced recurrent neural network architecture, have achieved results similar to those of CNNs on various tasks, such as phoneme or digit recognition. However, they have not yet received much attention in the Music Information Retrieval community. The core of an ESN is the reservoir, a group of unordered, randomly connected neurons that non-linearly transforms the low-dimensional input space into a high-dimensional feature space. Because only the weights of the connections between the reservoir and the output are trained, using linear regression, ESNs are easier to train than deep neural networks. This paper presents a first exploration of ESNs for the challenging task of multipitch tracking in music signals. The best results presented in this paper were achieved with a bidirectional two-layer ESN with 20 000 neurons in each layer. Although the final F-score of 0.7198 still falls below the state of the art (0.7370), the proposed ESN-based approach serves as a baseline for further investigations of ESNs in audio signal processing.
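The reservoir idea described in the abstract can be illustrated with a minimal sketch. It assumes a single unidirectional layer of 50 neurons, a tanh activation, and a closed-form ridge regression readout; the function names, the hyperparameters, and the toy sine task are all hypothetical choices for illustration and are not taken from the paper, whose best model was a bidirectional two-layer ESN with 20 000 neurons per layer.

```python
import numpy as np

def make_reservoir(n_inputs, n_neurons, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-1.0, 1.0, size=(n_neurons, n_inputs))
    w_res = rng.uniform(-1.0, 1.0, size=(n_neurons, n_neurons))
    # Rescale so the largest absolute eigenvalue equals spectral_radius
    # (an assumed, commonly used way to control the reservoir dynamics).
    w_res *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w_res)))
    return w_in, w_res

def run_reservoir(inputs, w_in, w_res):
    """Map an input sequence (T, n_inputs) to reservoir states (T, n_neurons)."""
    states = np.zeros((inputs.shape[0], w_res.shape[0]))
    state = np.zeros(w_res.shape[0])
    for t, u in enumerate(inputs):
        # Non-linear expansion of the current input and the previous state
        state = np.tanh(w_in @ u + w_res @ state)
        states[t] = state
    return states

def train_readout(states, targets, ridge=1e-6):
    """Fit only the output weights with ridge-regularized linear regression."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

# Toy task (hypothetical): predict the next sample of a sine wave.
t = np.arange(300)
signal = np.sin(0.2 * t)[:, None]           # shape (300, 1)
w_in, w_res = make_reservoir(n_inputs=1, n_neurons=50)
states = run_reservoir(signal[:-1], w_in, w_res)
w_out = train_readout(states, signal[1:])   # the only trained weights
prediction = states @ w_out
mse = float(np.mean((prediction - signal[1:]) ** 2))
```

Only `w_out` is learned here; `w_in` and `w_res` stay at their random initial values, which is why training reduces to solving one linear system. A multipitch tracker along these lines would feed spectral frames as inputs and use one output per pitch, but those details are beyond what the abstract specifies.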


