Audio Retrieval By Voice Imitation

Samah Khawaled, Mohamad Khateeb, Hadas Benisty
Published in: 2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE), 2018-12-01
DOI: 10.1109/ICSEE.2018.8646294 (https://doi.org/10.1109/ICSEE.2018.8646294)

Abstract

Existing sound retrieval systems are mostly based on textual queries. Describing a sound signal in text is not intuitive and is often inaccurate, because the description depends on the user's subjective impression: different people may use different words for the same sound, which makes these systems complex to design and unintuitive to use. Vocal imitation, however, is the most natural human way to describe a sound. In this paper we consider an emerging approach to sound retrieval based on vocal imitation, in which the user records an imitation of the desired sound and the system retrieves a ranked list of the most similar sounds in the dataset. We represent sound signals using histograms obtained with respect to a Gaussian Mixture Model (GMM) that represents the spectral domain. This recently proposed approach was successfully applied to word representation in a keyword spotting task. Having a fixed-length representation for vocal imitation signals allows us to train a robust classifier using a support vector machine (SVM). Given a test imitation signal, we apply the classifier and use the output score to rank the retrieved signals, based on a majority vote. Our simulation results show that the proposed system yields a more accurate ranking than existing solutions.
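
The fixed-length representation described above can be illustrated with a small sketch. The abstract does not specify how the histogram is computed, so everything here is an assumption made for illustration: a soft-assignment histogram (the average per-frame posterior over mixture components), diagonal covariances, toy two-dimensional frames standing in for spectral features, and hypothetical function names (`gmm_histogram`, `rank_by_score`). The SVM training and majority-vote steps are not shown; `rank_by_score` only sorts candidate labels by a score supplied by the caller.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_histogram(frames, weights, means, variances):
    """Map a variable-length list of frames to a fixed-length histogram.

    Assumption: each frame contributes its posterior probability over the
    K mixture components, and the histogram is the average posterior, so
    it has length K and sums to 1 regardless of the number of frames.
    """
    k = len(weights)
    hist = [0.0] * k
    for x in frames:
        logp = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)  # subtract the max for numerical stability
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for j in range(k):
            hist[j] += post[j] / total
    return [h / len(frames) for h in hist]

def rank_by_score(scores):
    """Return labels sorted from highest to lowest classifier score."""
    return sorted(scores, key=scores.get, reverse=True)

# Toy two-component model over 2-D frames (made-up numbers).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Two recordings of different lengths map to vectors of the same length.
short_clip = [[0.1, -0.2], [4.9, 5.2]]
long_clip = [[0.0, 0.1], [0.2, -0.1], [5.1, 4.8], [0.3, 0.0]]

h_short = gmm_histogram(short_clip, weights, means, variances)
h_long = gmm_histogram(long_clip, weights, means, variances)
```

Because both clips yield a vector of length K, they can be fed to any classifier that expects fixed-size inputs, which is the property the abstract relies on for SVM training.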