Pixelwise classification for music document analysis

Jorge Calvo-Zaragoza, Gabriel Vigliensoni, Ichiro Fujinaga
{"title":"Pixelwise classification for music document analysis","authors":"Jorge Calvo-Zaragoza, Gabriel Vigliensoni, Ichiro Fujinaga","doi":"10.1109/IPTA.2017.8310134","DOIUrl":null,"url":null,"abstract":"Content within musical documents not only contains music symbol but also include different elements such as staff lines, text, or frontispieces. Before attempting to automatically recognize components in these layers, it is necessary to perform an analysis of the musical documents in order to detect and classify each of these constituent parts. The obstacle for this analysis is the high heterogeneity amongst music collections, especially with ancient documents, which makes it difficult to devise methods that can be generalizable to a broader range of sources. In this paper we propose a data-driven document analysis framework based on machine learning that focuses on classifying regions of interest at pixel level. For that, we make use of Convolutional Neural Networks trained to infer the category of each pixel. The main advantage of this approach is that it can be applied regardless of the type of document provided, as long as training data is available. Since this work represents first efforts in that direction, our experimentation focuses on reporting a baseline classification using our framework. The experiments show promising performance, achieving an accuracy around 90% in two corpora of old music documents.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPTA.2017.8310134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Content within musical documents not only contains music symbols but also includes other elements such as staff lines, text, or frontispieces. Before attempting to automatically recognize the components in these layers, it is necessary to analyze the musical documents in order to detect and classify each of these constituent parts. The main obstacle for this analysis is the high heterogeneity among music collections, especially ancient documents, which makes it difficult to devise methods that generalize to a broader range of sources. In this paper we propose a data-driven document analysis framework based on machine learning that focuses on classifying regions of interest at the pixel level. For that, we make use of Convolutional Neural Networks trained to infer the category of each pixel. The main advantage of this approach is that it can be applied regardless of the type of document provided, as long as training data is available. Since this work represents a first effort in that direction, our experimentation focuses on reporting a baseline classification using our framework. The experiments show promising performance, achieving an accuracy of around 90% on two corpora of old music documents.
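To make the pixelwise formulation concrete, the following is a minimal sketch of a patch-based classifier in PyTorch: a small CNN receives an image patch centered on a pixel and predicts that pixel's layer. The label set, patch size, and network configuration here are illustrative assumptions; the abstract does not specify the architecture or categories used in the paper.

```python
import torch
import torch.nn as nn

# Assumed label set for the layers mentioned in the abstract; the actual
# categories and network configuration used in the paper may differ.
CLASSES = ["background", "staff_line", "music_symbol", "text"]
PATCH = 25  # assumed square patch size centered on the pixel to classify


class PatchCNN(nn.Module):
    """Small CNN that predicts the class of the central pixel of a patch."""

    def __init__(self, n_classes: int = len(CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * (PATCH // 4) * (PATCH // 4), 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


if __name__ == "__main__":
    model = PatchCNN()
    # A batch of grayscale patches, one per pixel to be labelled.
    patches = torch.rand(8, 1, PATCH, PATCH)
    logits = model(patches)        # shape: (8, len(CLASSES))
    labels = logits.argmax(dim=1)  # predicted category for each pixel
    print(labels)
```

Labelling a full page under this scheme would mean extracting one patch per pixel (or per sampled pixel) and running it through the network, which is the sense in which the framework classifies regions of interest at the pixel level.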