arXiv: Audio and Speech Processing (Latest Publications)

ToyADMOS2 dataset: Another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions
Pub Date: 2021-06-03 DOI: 10.5281/ZENODO.4580270
N. Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, Shoichiro Saito
This paper proposes a new large-scale dataset called "ToyADMOS2" for anomaly detection in machine operating sounds (ADMOS). As with our previous ToyADMOS dataset, we collected a large number of operating sounds of miniature machines (toys) under normal and anomalous conditions, the latter produced by deliberately damaging the machines; this release extends the earlier work by providing controlled depths of damage in the anomalous samples. Since typical application scenarios of ADMOS often require robust performance under domain-shift conditions, the ToyADMOS2 dataset is designed for evaluating systems under such conditions. The released dataset consists of two sub-datasets for machine-condition inspection: fault diagnosis of machines with geometrically fixed tasks and fault diagnosis of machines with moving tasks. Domain shifts are represented by introducing several differences in operating conditions, such as the same machine type with different machine models and parts configurations, different operating speeds, microphone arrangements, etc. Each sub-dataset contains over 27k samples of normal machine-operating sounds and over 8k samples of anomalous sounds, recorded with five to eight microphones. The dataset is freely available for download at this https URL and this https URL.
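The evaluation setting the dataset targets (fit on normal sounds only, score test clips, rank by AUC) can be illustrated with a minimal sketch. This is not from the paper: the diagonal-Gaussian scoring model, the feature dimensions, and the synthetic stand-in data are all illustrative assumptions.

```python
import numpy as np

def anomaly_scores(train_normal, test, eps=1e-9):
    """Score each test clip by its standardized distance to the normal
    training features (diagonal-covariance model). Higher = more anomalous."""
    mu = train_normal.mean(axis=0)
    sd = train_normal.std(axis=0) + eps
    return np.sqrt((((test - mu) / sd) ** 2).mean(axis=1))

def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney U) statistic."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic stand-in for per-clip features (e.g. time-averaged log-mel spectra):
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(200, 64))      # normal operating sounds
test_normal = rng.normal(0.0, 1.0, size=(50, 64))
test_anom = rng.normal(1.5, 1.0, size=(50, 64))   # damaged machine: shifted stats
scores = anomaly_scores(train, np.vstack([test_normal, test_anom]))
labels = np.concatenate([np.zeros(50), np.ones(50)]).astype(int)
print(round(auc(scores, labels), 3))
```

Under a domain shift, the normal-condition statistics themselves move (different model, speed, or microphone placement), which is exactly what breaks this kind of fixed-threshold scorer and motivates the dataset's design.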
Citations: 0
Clarity: machine learning challenges to revolutionise hearing device processing
Pub Date: 2020-06-19 DOI: 10.48465/FA.2020.0198
S. Graetzer, M. Akeroyd, J. Barker, T. Cox, J. Culling, G. Naylor, Eszter Porter, R. V. Muñoz
In the Clarity project, we will run a series of machine learning challenges to revolutionise speech processing for hearing devices. Over five years, there will be three paired challenges. Each pair will consist of a competition focussed on hearing-device processing (“enhancement”) and another focussed on speech perception modelling (“prediction”). The enhancement challenges will deliver new and improved approaches for hearing-device signal processing for speech. The parallel prediction challenges will develop and improve methods for predicting speech intelligibility and quality for hearing-impaired listeners. To facilitate the challenges, we will generate open-access datasets, models and infrastructure. These will include: (1) tools for generating realistic test/training materials for different listening scenarios; (2) baseline models of hearing impairment; (3) baseline models of hearing-device processing; (4) baseline models of speech perception; and (5) databases of speech perception in noise. The databases will include the results of listening tests that characterise how hearing-impaired listeners perceive speech in noise. We will also provide a comprehensive characterisation of each listener's hearing ability. The provision of open-access datasets, models and infrastructure will allow other researchers to develop algorithms for speech and hearing aid processing. In addition, it will lower barriers that prevent researchers from considering hearing impairment. In round one, speech will occur in the context of a living room, i.e., a moderately reverberant room with minimal (non-speech) background noise. Entries can be submitted to either the enhancement or prediction challenges, or both. We expect to open the beta version of round one in October for a full opening in November 2020, a closing date in June 2021 and results in October 2021.
This Engineering and Physical Sciences Research Council (EPSRC) funded project involves researchers from the Universities of Sheffield, Salford, Nottingham and Cardiff in conjunction with the Hearing Industry Research Consortium, Action on Hearing Loss, Amazon, and Honda. To register interest in the challenges, go to www.claritychallenge.org/.
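To make the "prediction" side of the challenge concrete, here is a toy intrusive intelligibility predictor: it averages clipped per-band SNRs, loosely in the spirit of SII-style measures. This is an illustrative sketch only, not the challenge baseline; the band layout, clipping range, mapping to [0, 1], and test signals are all assumptions.

```python
import numpy as np

def bandwise_snr_intelligibility(clean, noisy, n_bands=8):
    """Toy intrusive predictor: estimate noise power per frequency band,
    clip each band's SNR to [-15, 15] dB, and map the mean to [0, 1]."""
    spec_c = np.abs(np.fft.rfft(clean)) ** 2
    spec_n = np.abs(np.fft.rfft(noisy)) ** 2
    noise = np.maximum(spec_n - spec_c, 1e-12)   # crude noise-power estimate
    edges = np.linspace(0, len(spec_c), n_bands + 1, dtype=int)
    snrs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        snr_db = 10 * np.log10(spec_c[lo:hi].sum() / noise[lo:hi].sum())
        snrs.append(np.clip(snr_db, -15.0, 15.0))  # SII-style clipping range
    return (np.mean(snrs) + 15.0) / 30.0           # map [-15, 15] dB to [0, 1]

# A 440 Hz tone at 16 kHz, corrupted by two noise levels:
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
low_noise = clean + 0.01 * rng.normal(size=16000)
high_noise = clean + 1.0 * rng.normal(size=16000)
print(bandwise_snr_intelligibility(clean, low_noise) >
      bandwise_snr_intelligibility(clean, high_noise))  # less noise scores higher
```

A real prediction entry would additionally model the individual listener's hearing loss (the challenge provides audiometric characterisations for exactly this reason), whereas this sketch treats all listeners identically.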
Citations: 5