Speed-Up of Machine Learning for Sound Localization via High-Performance Computing

2022 26th International Conference on Information Technology (IT). Pub Date: 2022-02-16. DOI: 10.1109/IT54280.2022.9743519

E. M. Sumner, Marcel Aach, A. Lintermann, Runar Unnthorsson, M. Riedel


Citations: 2

Abstract

Sound localization is the human ability to determine the direction from which a sound originates. Emulating this capability in virtual environments enables more realistic virtual acoustics and has various societally relevant applications. We use a variety of artificial intelligence methods, such as machine learning via an Artificial Neural Network (ANN) model, to emulate human sound localization abilities. This paper addresses the particular challenge that training and optimizing these models is computationally intensive when working with audio signal datasets. It describes the successful porting of our novel ANN model code for sound localization from limiting serial CPU-based systems to High-Performance Computing (HPC) resources to obtain significant speed-ups of the training and optimization process. Selected details of the code refactoring and HPC porting are described, such as adapting hyperparameter optimization algorithms to use the available HPC resources efficiently and replacing third-party libraries responsible for audio signal analysis and linear algebra. This study demonstrates that using HPC systems at the Jülich Supercomputing Centre, equipped with Graphics Processing Unit (GPU) resources and based on the Modular Supercomputing Architecture, reduces the time-to-solution for sound localization from three days to three hours per ANN model.
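The reported reduction from three days to three hours corresponds to a speed-up factor of 72 h / 3 h = 24 per ANN model. The abstract does not say which hyperparameter optimization algorithm was adapted or how it was distributed, so the Python sketch below is only an illustration of the general idea: independent hyperparameter configurations can be evaluated concurrently. The names `toy_objective` and `parallel_grid_search`, the grid values, and the loss function are all hypothetical stand-ins, not the authors' code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def speedup(baseline_hours: float, accelerated_hours: float) -> float:
    """Ratio of baseline runtime to accelerated runtime."""
    return baseline_hours / accelerated_hours


def toy_objective(config):
    """Hypothetical stand-in for training an ANN and returning a validation loss."""
    learning_rate, hidden_units = config
    return (learning_rate - 0.01) ** 2 + (hidden_units - 64) ** 2 * 1e-6


def parallel_grid_search(learning_rates, hidden_sizes, workers=4):
    """Evaluate every configuration of a small grid concurrently.

    Threads are used here only to keep the sketch self-contained; on an
    HPC system each evaluation would more plausibly run as a separate
    process or job on its own GPU or node (an assumption, not a detail
    taken from the paper).
    """
    configs = list(product(learning_rates, hidden_sizes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        losses = list(pool.map(toy_objective, configs))
    best_loss, best_config = min(zip(losses, configs))
    return best_config, best_loss


# Three days (72 h) down to three hours, as stated in the abstract.
reported_speedup = speedup(3 * 24, 3)

best_config, best_loss = parallel_grid_search(
    learning_rates=[0.1, 0.01, 0.001],
    hidden_sizes=[32, 64, 128],
)
print(f"speed-up: {reported_speedup:.0f}x, best config: {best_config}")
```

Because the configurations are independent of one another, the wall-clock time of such a search shrinks roughly with the number of workers, up to the point where the grid is exhausted or resources are saturated.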
