Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models

H. Lachambre, R. André-Obrecht, J. Pinquier
{"title":"单音与复调:一种基于威布尔二元模型的新方法","authors":"H. Lachambre, R. André-Obrecht, J. Pinquier","doi":"10.1109/CBMI.2009.24","DOIUrl":null,"url":null,"abstract":"Our contribution takes place in the context of music indexation. In many applications, such as multipitch estimation, it can be useful to know the number of notes played at a time. In this work, we aim at distinguish monophonies (one note at a time) from polyphonies (several notes at a time). We analyze an indicator which gives the confidence on the estimated pitch. In the case of a monophony, the pitch is relatively easy to determine, this indicator is low. In the case of a polyphony, the pitch is much more difficult to determine, so the indicator is higher and varies more. Considering these two facts, we compute the short term mean and variance of the indicator, and model the bivariate repartition of these two parameters with Weibull bivariate distributions for each class (monophony and polyphony). The classification is made by computing the likelihood over one second for each class and taking the best one.Models are learned with 25 seconds of each kind of signal. Our best results give a global error rate of 6.3 %, performed on a balanced corpus containing approximately 18 minutes of signal.","PeriodicalId":417012,"journal":{"name":"2009 Seventh International Workshop on Content-Based Multimedia Indexing","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models\",\"authors\":\"H. Lachambre, R. André-Obrecht, J. Pinquier\",\"doi\":\"10.1109/CBMI.2009.24\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Our contribution takes place in the context of music indexation. In many applications, such as multipitch estimation, it can be useful to know the number of notes played at a time. In this work, we aim at distinguish monophonies (one note at a time) from polyphonies (several notes at a time). We analyze an indicator which gives the confidence on the estimated pitch. In the case of a monophony, the pitch is relatively easy to determine, this indicator is low. In the case of a polyphony, the pitch is much more difficult to determine, so the indicator is higher and varies more. Considering these two facts, we compute the short term mean and variance of the indicator, and model the bivariate repartition of these two parameters with Weibull bivariate distributions for each class (monophony and polyphony). The classification is made by computing the likelihood over one second for each class and taking the best one.Models are learned with 25 seconds of each kind of signal. 
Our best results give a global error rate of 6.3 %, performed on a balanced corpus containing approximately 18 minutes of signal.\",\"PeriodicalId\":417012,\"journal\":{\"name\":\"2009 Seventh International Workshop on Content-Based Multimedia Indexing\",\"volume\":\"65 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-06-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 Seventh International Workshop on Content-Based Multimedia Indexing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CBMI.2009.24\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 Seventh International Workshop on Content-Based Multimedia Indexing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CBMI.2009.24","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Our contribution takes place in the context of music indexing. In many applications, such as multipitch estimation, it can be useful to know the number of notes played at a time. In this work, we aim to distinguish monophonies (one note at a time) from polyphonies (several notes at a time). We analyze an indicator that gives the confidence in the estimated pitch. In the case of a monophony, the pitch is relatively easy to determine, so this indicator is low. In the case of a polyphony, the pitch is much more difficult to determine, so the indicator is higher and varies more. Building on these two observations, we compute the short-term mean and variance of the indicator and model the joint distribution of these two parameters with a bivariate Weibull distribution for each class (monophony and polyphony). Classification is performed by computing the likelihood over one second for each class and keeping the best one. Models are learned from 25 seconds of each kind of signal. Our best results give a global error rate of 6.3%, obtained on a balanced corpus containing approximately 18 minutes of signal.
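
To make the described pipeline concrete, here is a minimal sketch of the decision rule, not the authors' implementation: SciPy provides no bivariate Weibull distribution, so the joint model is approximated with independent Weibull marginals on the (mean, variance) features, and the frame rate, window length, and synthetic indicator values are illustrative assumptions.

```python
# Sketch of the monophony/polyphony decision described in the abstract.
# Assumptions (not from the paper): the pitch-confidence indicator is a
# 1-D array sampled at 100 frames/s, and the bivariate Weibull is replaced
# by independent Weibull marginals on (short-term mean, short-term variance).
import numpy as np
from scipy.stats import weibull_min

FRAME_RATE = 100   # indicator frames per second (assumed)
WIN = 20           # short-term window: 200 ms (assumed)

def short_term_stats(indicator):
    """Sliding-window mean and variance of the confidence indicator."""
    windows = np.lib.stride_tricks.sliding_window_view(indicator, WIN)
    return windows.mean(axis=1), windows.var(axis=1)

def fit_class(indicator):
    """Fit Weibull marginals to the (mean, variance) features of one class."""
    m, v = short_term_stats(indicator)
    # floc=0 keeps the Weibull support on [0, inf), matching the
    # non-negative features
    return weibull_min.fit(m, floc=0), weibull_min.fit(v, floc=0)

def log_likelihood(params, indicator):
    """Sum of frame log-likelihoods under one class's Weibull marginals."""
    pm, pv = params
    m, v = short_term_stats(indicator)
    return weibull_min.logpdf(m, *pm).sum() + weibull_min.logpdf(v, *pv).sum()

def classify_second(models, indicator_1s):
    """Label one second of signal with the class of highest likelihood."""
    scores = {label: log_likelihood(p, indicator_1s)
              for label, p in models.items()}
    return max(scores, key=scores.get)

# Toy usage with synthetic data: monophony -> low, stable indicator;
# polyphony -> higher, more variable indicator (25 s of training each,
# mirroring the paper's training-set size).
rng = np.random.default_rng(0)
train_mono = rng.weibull(4.0, 25 * FRAME_RATE) * 0.2
train_poly = rng.weibull(1.5, 25 * FRAME_RATE) * 0.8
models = {"monophony": fit_class(train_mono),
          "polyphony": fit_class(train_poly)}
test = rng.weibull(1.5, FRAME_RATE) * 0.8   # one second of test frames
print(classify_second(models, test))        # expected: "polyphony"
```

A faithful reproduction would replace the two independent `weibull_min` fits with the bivariate Weibull density used in the paper and feed in a real pitch-confidence indicator rather than synthetic draws.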