Online sound structure analysis based on generative model of acoustic feature sequences

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI:10.1109/APSIPA.2017.8282236

Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita

{"title":"Online sound structure analysis based on generative model of acoustic feature sequences","authors":"Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita","doi":"10.1109/APSIPA.2017.8282236","DOIUrl":null,"url":null,"abstract":"We propose a method for the online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, with which the hierarchical generative process of the sound clip, acoustic topic, acoustic word, and acoustic feature is assumed. In this model, it is assumed that sound clips are organized based on the combination of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of calculation, it is difficult to analyze the massive amount of sound data. Moreover, the batch algorithm does not allow us to analyze the sequentially obtained data. Our variational Bayes-based online algorithm for this generative model can analyze the structure of sounds sound clip by sound clip. The experimental results show that the proposed online algorithm can reduce the calculation cost by about 90% and estimate the posterior distributions as efficiently as the conventional batch algorithm.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSIPA.2017.8282236","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

We propose a method for the online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, with which the hierarchical generative process of the sound clip, acoustic topic, acoustic word, and acoustic feature is assumed. In this model, it is assumed that sound clips are organized based on the combination of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of calculation, it is difficult to analyze the massive amount of sound data. Moreover, the batch algorithm does not allow us to analyze the sequentially obtained data. Our variational Bayes-based online algorithm for this generative model can analyze the structure of sounds sound clip by sound clip. The experimental results show that the proposed online algorithm can reduce the calculation cost by about 90% and estimate the posterior distributions as efficiently as the conventional batch algorithm.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于声特征序列生成模型的在线声结构分析

提出了一种基于声学特征序列贝叶斯生成模型的在线声音结构分析方法，该模型假设了声音片段、声学主题、声学词和声学特征的分层生成过程。在该模型中，假设声音片段是基于潜在声学主题的组合进行组织的，每个声学主题由声学特征空间上的高斯混合模型(GMM)表示，GMM的分量对应于声学单词。由于传统的批处理算法学习该模型需要大量的计算量，难以分析海量的声音数据。此外，批处理算法不允许我们分析顺序获得的数据。我们基于变分贝叶斯的在线算法可以对每个声音片段的声音结构进行分析。实验结果表明，该算法可将计算成本降低约90%，且估计后验分布的效率与传统的批处理算法相当。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

自引率

0.00%

发文量