Physically Architected Recurrent Neural Networks for Nonlinear Dynamical Loudspeaker Modeling
Authors: Christian Gruber, Gerald Enzner, Rainer Martin
Journal: IEEE Transactions on Signal Processing, vol. 72, pp. 5371-5387
Published: 2024-10-14
DOI: 10.1109/TSP.2024.3480321
URL: https://ieeexplore.ieee.org/document/10716735/
Impact factor: 5.8 (JCR Q1, Engineering, Electrical & Electronic)
Citation count: 0
Abstract
The nonlinear behavior of loudspeakers is of great interest in a number of audio processing algorithms, as it may degrade their performance. These algorithms may be further enhanced when an accurate model of the loudspeaker's input-output behavior is available. A variety of approaches have been investigated in the past to solve this task via nonlinear adaptive system identification, but their modeling capabilities are often limited by a mismatch with the electroacoustic principles of real loudspeakers. This paper therefore presents a novel approach using recurrent neural networks (RNNs) specifically designed to match the behavior of the loudspeaker's dynamical physical model. By means of the physical model and its corresponding state-space representation, we derive three conceptually different RNN architectures, which are initially trained on synthetic audio data in order to gain insights into the required training procedure and its limitations. Further training and evaluation of the preferred architecture on real loudspeaker recordings show consistent improvements of the mean-squared modeling error compared to a linear system model and to nonlinear baseline algorithms.
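As a rough illustration of why a state-space loudspeaker model lends itself to a recurrent structure, the sketch below discretizes a textbook lumped-parameter electrodynamic loudspeaker model with forward Euler, so that each sample updates a small state vector (voice-coil current, diaphragm displacement, velocity) from the previous state and the input voltage. This is not one of the three RNN architectures derived in the paper. The function name `simulate_loudspeaker`, all parameter values, the polynomial forms chosen for the force factor Bl(x) and stiffness K(x), and the use of velocity as the output are assumptions made purely for illustration.

```python
import numpy as np

def simulate_loudspeaker(u, fs, k2=1.0e5, b2=1.0e4):
    """Forward-Euler recurrence for a lumped loudspeaker model (illustrative).

    u  : input voltage samples in volts
    fs : sampling rate in Hz
    k2 : assumed stiffness nonlinearity coefficient in 1/m^2
    b2 : assumed force-factor nonlinearity coefficient in 1/m^2
    Returns the diaphragm velocity in m/s for each sample.
    """
    # Assumed, hypothetical parameter values (not taken from the paper).
    Re, Le = 4.0, 0.5e-3            # coil resistance (ohm), inductance (H)
    Bl0 = 5.0                       # force factor at rest (N/A)
    m, Rm, K0 = 0.01, 1.5, 2000.0   # mass (kg), damping (N*s/m), stiffness (N/m)
    dt = 1.0 / fs

    i = x = v = 0.0                 # state: current, displacement, velocity
    out = np.empty(len(u))
    for n, u_n in enumerate(u):
        # Displacement-dependent parameters are the source of nonlinearity.
        Bl = Bl0 * (1.0 - b2 * x * x)
        K = K0 * (1.0 + k2 * x * x)
        # Electrical equation: u = Re*i + Le*di/dt + Bl*v
        di = (u_n - Re * i - Bl * v) / Le
        # Mechanical equation: Bl*i = m*dv/dt + Rm*v + K*x
        dv = (Bl * i - Rm * v - K * x) / m
        # Recurrent state update: new state depends on old state and input.
        i, x, v = i + dt * di, x + dt * v, v + dt * dv
        out[n] = v
    return out

# Usage: drive the model with a 50 Hz tone, with and without nonlinearity.
fs = 48000
t = np.arange(fs // 10) / fs
u = 5.0 * np.sin(2.0 * np.pi * 50.0 * t)
v_nl = simulate_loudspeaker(u, fs)
v_lin = simulate_loudspeaker(u, fs, k2=0.0, b2=0.0)
print("peak |v| nonlinear:", np.max(np.abs(v_nl)))
print("peak |v| linear:   ", np.max(np.abs(v_lin)))
```

Setting `k2` and `b2` to zero reduces the recurrence to a linear system, which corresponds to the kind of linear baseline the abstract mentions. In a learned variant, the fixed coefficients and the functions Bl(x) and K(x) would be replaced by trainable parameters, but how the paper does this is not specified in the abstract.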
About the journal:
The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.