Speed-Up of Machine Learning for Sound Localization via High-Performance Computing

E. M. Sumner, Marcel Aach, A. Lintermann, Runar Unnthorsson, M. Riedel
Published in: 2022 26th International Conference on Information Technology (IT)
Publication date: 2022-02-16
DOI: 10.1109/IT54280.2022.9743519 (https://doi.org/10.1109/IT54280.2022.9743519)
Citations: 2

Abstract

Sound localization is the human ability to determine the direction from which a sound originates. Emulating this capability in virtual environments enables more realistic virtual acoustics and has various societally relevant applications. We use a variety of artificial intelligence methods, such as machine learning via an Artificial Neural Network (ANN) model, to emulate human sound localization abilities. This paper addresses the particular challenge that training and optimizing these models is very computationally intensive when working with audio signal datasets. It describes the successful porting of our novel ANN model code for sound localization from limiting serial CPU-based systems to powerful, cutting-edge High-Performance Computing (HPC) resources to obtain significant speed-ups of the training and optimization process. Selected details of the code refactoring and HPC porting are described, such as adapting hyperparameter optimization algorithms to use the available HPC resources efficiently and replacing third-party libraries responsible for audio signal analysis and linear algebra. This study demonstrates that using innovative HPC systems at the Jülich Supercomputing Centre, equipped with high-tech Graphics Processing Unit (GPU) resources and based on the Modular Supercomputing Architecture, enables significant speed-ups and reduces the time-to-solution for sound localization from three days to three hours per ANN model.
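The abstract mentions adapting hyperparameter optimization to make efficient use of parallel resources but gives no implementation details. As a minimal illustrative sketch only, and not the authors' method, the following Python example shows one common pattern: independent random-search trials evaluated concurrently. Every name here (`toy_validation_loss`, `sample_configs`, `parallel_random_search`), the search ranges, and the toy objective are hypothetical stand-ins; in a real setting each trial would train an ANN, and the trials would be dispatched to separate GPUs or nodes rather than to local threads.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_validation_loss(config):
    """Hypothetical stand-in for training one ANN and returning its validation loss."""
    lr, hidden = config["learning_rate"], config["hidden_units"]
    # Smooth bowl with its minimum at lr=0.01, hidden=64 (an arbitrary toy choice).
    return (lr - 0.01) ** 2 * 1e4 + ((hidden - 64) / 64) ** 2


def sample_configs(n_trials, seed=0):
    """Draw independent random hyperparameter configurations (assumed ranges)."""
    rng = random.Random(seed)
    return [
        {
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "hidden_units": rng.choice([16, 32, 64, 128, 256]),
        }
        for _ in range(n_trials)
    ]


def parallel_random_search(n_trials=32, n_workers=4, seed=0):
    """Evaluate all trials concurrently and return (best_loss, best_config)."""
    configs = sample_configs(n_trials, seed)
    # Threads stand in for HPC workers here; trials are independent,
    # so they can run in any order. pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        losses = list(pool.map(toy_validation_loss, configs))
    best = min(range(n_trials), key=losses.__getitem__)
    return losses[best], configs[best]


if __name__ == "__main__":
    loss, config = parallel_random_search()
    print(f"best loss {loss:.4f} with {config}")
```

Because the trials do not depend on one another, the wall-clock time of such a search shrinks roughly in proportion to the number of workers, as long as each worker stays busy. For scale, the reduction reported in the abstract, from three days (72 hours) to three hours per ANN model, corresponds to a speed-up factor of about 24.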