Musical Instrument Sound Morphing Guided by Perceptually Motivated Features

Marcelo F. Caetano, X. Rodet
{"title":"Musical Instrument Sound Morphing Guided by Perceptually Motivated Features","authors":"Marcelo F. Caetano, X. Rodet","doi":"10.1109/TASL.2013.2260154","DOIUrl":null,"url":null,"abstract":"Sound morphing is a transformation that gradually blurs the distinction between the source and target sounds. For musical instrument sounds, the morph must operate across timbre dimensions to create the auditory illusion of hybrid musical instruments. The ultimate goal of sound morphing is to perform perceptually linear transitions, which requires an appropriate model to represent the sounds being morphed and an interpolation function to obtain intermediate sounds. Typically, morphing techniques directly interpolate the parameters of the sound model without considering the perceptual impact or evaluating the results. Perceptual evaluations are cumbersome and not always conclusive. In this work, we seek parameters of a sound model that favor linear variation of perceptually motivated temporal and spectral features used to guide the morph towards more perceptually linear results. The requirement of linear variation of feature values gives rise to objective evaluation criteria for sound morphing. We investigate several spectral envelope morphing techniques to determine which spectral representation renders the most linear transformation in the spectral shape feature domain. We found that interpolation of line spectral frequencies gives the most linear spectral envelope morphs. Analogously, we study temporal envelope morphing techniques and we concluded that interpolation of cepstral coefficients results in the most linear temporal envelope morph.","PeriodicalId":55014,"journal":{"name":"IEEE Transactions on Audio Speech and Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TASL.2013.2260154","citationCount":"23","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Audio Speech and Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASL.2013.2260154","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 23

Abstract

Sound morphing is a transformation that gradually blurs the distinction between the source and target sounds. For musical instrument sounds, the morph must operate across timbre dimensions to create the auditory illusion of hybrid musical instruments. The ultimate goal of sound morphing is to perform perceptually linear transitions, which requires an appropriate model to represent the sounds being morphed and an interpolation function to obtain intermediate sounds. Typically, morphing techniques directly interpolate the parameters of the sound model without considering the perceptual impact or evaluating the results. Perceptual evaluations are cumbersome and not always conclusive. In this work, we seek parameters of a sound model that favor linear variation of perceptually motivated temporal and spectral features used to guide the morph towards more perceptually linear results. The requirement of linear variation of feature values gives rise to objective evaluation criteria for sound morphing. We investigate several spectral envelope morphing techniques to determine which spectral representation renders the most linear transformation in the spectral shape feature domain. We found that interpolation of line spectral frequencies gives the most linear spectral envelope morphs. Analogously, we study temporal envelope morphing techniques and we concluded that interpolation of cepstral coefficients results in the most linear temporal envelope morph.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
由感知动机特征引导的乐器声音变形
声音变形是一种逐渐模糊源音和目标音之间区别的转换。对于乐器的声音,变形必须跨音色维度运作,以创造混合乐器的听觉错觉。声音变形的最终目标是实现感知上的线性转换,这需要一个合适的模型来表示被变形的声音,并需要一个插值函数来获得中间声音。通常,变形技术直接插入声音模型的参数,而不考虑感知影响或评估结果。知觉评价是繁琐的,并不总是决定性的。在这项工作中,我们寻求一个健全模型的参数,该模型有利于感知驱动的时间和光谱特征的线性变化,用于指导更多感知线性结果的变化。特征值线性变化的要求为声音变形提供了客观的评价标准。我们研究了几种光谱包络变形技术,以确定哪种光谱表示在光谱形状特征域中呈现最线性的变换。我们发现线谱频率的插值给出了最线性的谱包络变形。类似地,我们研究了时间包络变形技术,我们得出的结论是,倒谱系数的插值结果是最线性的时间包络变形。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
IEEE Transactions on Audio Speech and Language Processing
IEEE Transactions on Audio Speech and Language Processing 工程技术-工程:电子与电气
自引率
0.00%
发文量
0
审稿时长
24.0 months
期刊介绍: The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.
期刊最新文献
A High-Quality Speech and Audio Codec With Less Than 10-ms Delay Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization. Epoch Extraction Based on Integrated Linear Prediction Residual Using Plosion Index Body Conducted Speech Enhancement by Equalization and Signal Fusion Soundfield Imaging in the Ray Space
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1