Three-Dimensional Mapping of High-Level Music Features for Music Browsing
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.8665368
Stefano Cherubin, Clara Borrelli, M. Zanoni, Michele Buccoli, A. Sarti, S. Tubaro
The increased availability of musical content comes with the need for novel paradigms for recommendation, browsing, and retrieval from large music libraries. Most music players and streaming services offer a paradigm based on listing meta-data information, which provides little insight into the music content. In services with huge catalogs of songs, a more informative paradigm is needed. In this work we propose a framework for music browsing based on navigation in a three-dimensional (3-D) space, where musical items are placed according to a 3-D mapping of their high-level semantic descriptors. We conducted a survey to guide the design of the framework and the implementation choices. We rely on state-of-the-art techniques from Music Information Retrieval to automatically extract the high-level descriptors from a low-level representation of the musical signal. The framework is validated by means of a subjective evaluation with 33 users, who gave positive feedback and highlighted promising future developments, especially in the field of virtual reality.
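The abstract does not spell out the mapping procedure; as a minimal sketch of the idea, assuming each track is summarized by a vector of high-level descriptor scores (the names and values below are invented), a dimensionality reduction such as PCA can place tracks in a 3-D browsing space:

```python
# Minimal sketch of placing tracks in a 3-D browsing space from
# high-level semantic descriptors. Descriptor values are illustrative;
# the paper's actual descriptors and mapping may differ.
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical high-level descriptors per track (e.g., mood/genre scores).
tracks = {
    "track_a": [0.8, 0.1, 0.3, 0.7, 0.2],
    "track_b": [0.2, 0.9, 0.5, 0.1, 0.6],
    "track_c": [0.4, 0.4, 0.9, 0.3, 0.8],
    "track_d": [0.6, 0.2, 0.1, 0.9, 0.4],
}

X = np.array(list(tracks.values()))

# Project the descriptor space down to three dimensions for navigation.
pca = PCA(n_components=3)
coords = pca.fit_transform(X)

for name, (x, y, z) in zip(tracks, coords):
    print(f"{name}: ({x:+.2f}, {y:+.2f}, {z:+.2f})")
```

Any embedding that preserves descriptor similarity would serve here; the paper's survey-driven design may well choose descriptors and a mapping differently.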
{"title":"Three-Dimensional Mapping of High-Level Music Features for Music Browsing","authors":"Stefano Cherubin, Clara Borrelli, M. Zanoni, Michele Buccoli, A. Sarti, S. Tubaro","doi":"10.1109/MMRP.2019.8665368","DOIUrl":"https://doi.org/10.1109/MMRP.2019.8665368","url":null,"abstract":"The increased availability of musical content comes with the need of novel paradigms for recommendation, browsing and retrieval from large music libraries. Most music players and streaming services propose a paradigm based on content listing of meta-data information, which provides little insight on the music content. In services with huge catalogs of songs, a more informative paradigm is needed. In this work we propose a framework for music browsing based on the navigation into a three-dimensional (3-D) space, where musical items are placed as a 3-D mapping of their high-level semantic descriptors. We conducted a survey to guide the design of the framework and the implementation choices. We rely on state-of-the-art techniques from Music Information Retrieval to automatically extract the high-level descriptors from a low-level representation of the musical signal. The framework is validated by means of a subjective evaluation from 33 users, who give positive feedbacks and highlight promising future developments especially in virtual reality field.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130300430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP) MMRP 2019","authors":"","doi":"10.1109/mmrp.2019.00004","DOIUrl":"https://doi.org/10.1109/mmrp.2019.00004","url":null,"abstract":"","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128347258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heretic: Modeling Anthony Braxton's Language Music
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.8665363
Hunter M. Brown, M. Casey
This article presents a new system for real-time machine listening within human-machine free improvisation. Heretic uses Anthony Braxton's Language Music system as a grammatical model for contextualizing real-time audio feature data within free improvisation. Heretic hears, recognizes, and organizes unseen musical material from a human improviser into a fluid, coherent, and expressive musical language. Systems similar to Heretic often prioritize agnostic approaches to machine listening by avoiding prior musical knowledge in the system's training stage. However, prominent improvisers such as Cecil Taylor, Ornette Coleman, Joe Morris, and Anthony Braxton describe their approaches to improvisation as languages or grammatical systems. These improvisers contextualize the real-time musical materials of their bandmates by applying their formulated grammatical systems to their decision-making processes. The autonomy and musical creativity of Taylor, Coleman, Morris, and Braxton are not compromised by the use of grammatical systems. With regard to human-machine improvisation, Heretic demonstrates that a grammatical approach to machine listening can yield idiosyncratic interactions, full machine autonomy, and novel musical output. This article details a re-imagining of Anthony Braxton's Language Music within the context of machine listening, and an implementation of Language Music within Heretic via SuperCollider's audio feature extraction functionality and Wekinator's multi-layer perceptron neural networks.
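Heretic itself is built from SuperCollider feature extraction feeding Wekinator's multi-layer perceptrons; the Python sketch below only approximates the shape of that classification step, with invented features, labels, and training data:

```python
# Rough Python stand-in for Heretic's pipeline (the real system uses
# SuperCollider for feature extraction and Wekinator's multilayer
# perceptrons): classify audio-feature frames into Braxton-style
# language types. Features, labels, and data here are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Placeholder training data: rows are feature frames
# (e.g., loudness, spectral centroid, flatness, onset density).
X_train = rng.random((200, 4))
# Placeholder labels: indices into a set of language types,
# e.g., 0 = "long sound", 1 = "trill", 2 = "staccato line".
y_train = rng.integers(0, 3, size=200)

clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

# At run time, each incoming feature frame would be classified and the
# predicted language type used to drive the system's musical response.
frame = rng.random((1, 4))
print("predicted language type:", clf.predict(frame)[0])
```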
{"title":"Heretic: Modeling Anthony Braxton's Language Music","authors":"Hunter M. Brown, M. Casey","doi":"10.1109/MMRP.2019.8665363","DOIUrl":"https://doi.org/10.1109/MMRP.2019.8665363","url":null,"abstract":"This article presents a new system for real-time machine listening within human-machine free improvisation. Heretic uses Anthony Braxton's Language Music system as a grammatical model for contextualizing real-time audio feature data within free improvisation. Heretic hears, recognizes, and organizes unseen musical material from a human improviser into a fluid, coherent, and expressive musical language. Systems similar to Heretic often prioritize agnostic approaches to machine listening by avoiding prior musical knowledge in the system's training stage. However, prominent improvisers such as Cecil Taylor, Ornette Coleman, Joe Morris, and Anthony Braxton detail their approaches to improvisation as languages or grammatical systems. These improvisers contextualize the real-time musical materials of their band-mates by applying their formulated grammatical systems to their decision-making processes. Taylor, Coleman, Morris, and Braxton's autonomy and musical creativity are not compromised by using grammatical systems. In regards to human-machine improvisation, Heretic demonstrates that a grammatical approach to machine listening can yield idiosyncratic interactions, full machine autonomy, and novel musical output. This article details a re-imagining of Anthony Braxton's Language Music within the context of machine listening, and an implementation of Language Music within Heretic via SuperCollider's audio feature extraction functionality and Wekinator's multi-layer perceptron neural networks.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125173844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multilayer Music Representation and Processing: Key Advances and Emerging Trends
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.8665370
F. Avanzini, L. A. Ludovico
This work is the introduction to the proceedings of the 1st International Workshop on Multilayer Music Representation and Processing (MMRP19), authored by the Program Co-Chairs. It explains the rationale behind this scientific initiative, describes the methodological approach used in paper selection, and provides a short overview of the workshop's accepted works, highlighting the thread that runs through the different contributions and approaches.
{"title":"Multilayer Music Representation and Processing: Key Advances and Emerging Trends","authors":"F. Avanzini, L. A. Ludovico","doi":"10.1109/MMRP.2019.8665370","DOIUrl":"https://doi.org/10.1109/MMRP.2019.8665370","url":null,"abstract":"This work represents the introduction to the proceedings of the IstInternational Workshop on Multilayer Music Representation and Processing (MMRP19) authored by the Program Co-Chairs. The idea is to explain the rationale behind such a scientific initiative, describe the methodological approach used in paper selection, and provide a short overview of the workshop's accepted works, trying to highlight the thread that runs through different contributions and approaches.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127416834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multitask Learning for Polyphonic Piano Transcription, a Case Study
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.8665372
Rainer Kelz, Sebastian Böck, G. Widmer
Viewing polyphonic piano transcription as a multitask learning problem, in which onsets, intermediate frames, and offsets of notes must be predicted simultaneously, we investigate the performance impact of additional prediction targets using a variety of suitable convolutional neural network architectures. We quantify the performance differences of the additional objectives on the large MAESTRO dataset.
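As a rough illustration of this multitask setup (not the specific architectures evaluated in the paper), a shared convolutional trunk with separate onset, frame, and offset heads might look like the following, with all sizes invented:

```python
# Minimal sketch of the multitask setup described above: a shared
# convolutional trunk over a spectrogram, with separate heads predicting
# onsets, active (intermediate) frames, and offsets per pitch. Layer
# sizes are illustrative, not the architectures evaluated in the paper.
import torch
import torch.nn as nn

N_BINS, N_PITCHES = 229, 88  # e.g., log-mel bins in, piano keys out

class MultitaskTranscriber(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # One linear head per prediction target, applied framewise.
        self.onset = nn.Linear(32 * N_BINS, N_PITCHES)
        self.frame = nn.Linear(32 * N_BINS, N_PITCHES)
        self.offset = nn.Linear(32 * N_BINS, N_PITCHES)

    def forward(self, spec):                  # spec: (batch, 1, time, bins)
        h = self.trunk(spec)                  # (batch, 32, time, bins)
        h = h.permute(0, 2, 1, 3).flatten(2)  # (batch, time, 32 * bins)
        return {name: torch.sigmoid(head(h))
                for name, head in [("onset", self.onset),
                                   ("frame", self.frame),
                                   ("offset", self.offset)]}

model = MultitaskTranscriber()
out = model(torch.randn(2, 1, 100, N_BINS))
print({name: tuple(v.shape) for name, v in out.items()})
# A multitask loss would sum per-target binary cross-entropies.
```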
{"title":"Multitask Learning for Polyphonic Piano Transcription, a Case Study","authors":"Rainer Kelz, Sebastian Böck, G. Widmer","doi":"10.1109/MMRP.2019.8665372","DOIUrl":"https://doi.org/10.1109/MMRP.2019.8665372","url":null,"abstract":"Viewing polyphonic piano transcription as a multitask learning problem, where we need to simultaneously predict onsets, intermediate frames and offsets of notes, we investigate the performance impact of additional prediction targets, using a variety of suitable convolutional neural network architectures. We quantify performance differences of additional objectives on the larGe MAESTRO dataset.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"351 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132167302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
2019 International Workshop on Multilayer Music Representation and Processing MMRP 2019
Pub Date: 2019-01-01 | DOI: 10.1109/mmrp.2019.8665358
A. Baratè, L. A. Ludovico, S. Ntalampiras, G. Presti
{"title":"2019 International Workshop on Multilayer Music Representation and Processing MMRP 2019","authors":"A. Baratè, L. A. Ludovico, S. Ntalampiras, G. Presti","doi":"10.1109/mmrp.2019.8665358","DOIUrl":"https://doi.org/10.1109/mmrp.2019.8665358","url":null,"abstract":"","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"98 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114271296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Message from the General Chair MMRP 2019
Pub Date: 2019-01-01 | DOI: 10.1109/mmrp.2019.00005
This workshop has two main goals: first, to bring together the scientific community for an up-to-date discussion of multilayer music representation; second, to host the kickoff meeting of the Working Group for the IEEE 1599 standard revision. The latter involves a number of activities, such as forming and introducing the team, understanding the project background, identifying the main goals to pursue, and agreeing on how to work together effectively.
{"title":"Message from the General Chair MMRP 2019","authors":"Mmrp","doi":"10.1109/mmrp.2019.00005","DOIUrl":"https://doi.org/10.1109/mmrp.2019.00005","url":null,"abstract":"This workshop has two main goals: first, bringing together the scientific community for an up-to-date discussion about the multilayer music representation topic; secondly, hosting the kickoff meeting of the Working Group for the IEEE1599 standard revision. The latter point implies a number of activities, such as forming and introducing the team, understanding the project background, identifying the main goals to pursue, and agreeing on how to work together effectively.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116583968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal Music Information Processing and Retrieval: Survey and Future Challenges
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.00012
Federico Simonetta, S. Ntalampiras, F. Avanzini
To improve performance in various music information processing tasks, recent studies exploit different modalities that capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application addressed. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges on which the Music Information Retrieval and Sound and Music Computing research communities should focus in the coming years.
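As a toy illustration of the fusion families such surveys typically contrast, the sketch below compares early fusion (concatenating modality features before one classifier) with late fusion (averaging per-modality predictions); the data, features, and label are synthetic:

```python
# Toy illustration of two information fusion families: early fusion
# (concatenate modality features before a single model) and late fusion
# (combine per-modality predictions). All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
audio = rng.random((100, 8))    # e.g., audio descriptors
lyrics = rng.random((100, 5))   # e.g., lyrics-derived features
y = rng.integers(0, 2, 100)     # e.g., a binary mood label

# Early fusion: one classifier over the concatenated feature vector.
early = LogisticRegression().fit(np.hstack([audio, lyrics]), y)

# Late fusion: one classifier per modality, predictions averaged.
clf_a = LogisticRegression().fit(audio, y)
clf_l = LogisticRegression().fit(lyrics, y)
late_prob = (clf_a.predict_proba(audio) + clf_l.predict_proba(lyrics)) / 2

print("early-fusion classes:", early.predict(np.hstack([audio, lyrics]))[:5])
print("late-fusion classes: ", late_prob.argmax(axis=1)[:5])
```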
{"title":"Multimodal Music Information Processing and Retrieval: Survey and Future Challenges","authors":"Federico Simonetta, S. Ntalampiras, F. Avanzini","doi":"10.1109/MMRP.2019.00012","DOIUrl":"https://doi.org/10.1109/MMRP.2019.00012","url":null,"abstract":"Towards improving the performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics and album cover arts. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval, and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application they address. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that Music Information Retrieval and Sound and Music Computing research communities should focus in the next years.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125032138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
Pub Date: 2018-11-20 | DOI: 10.1109/MMRP.2019.00022
K. Chen, Weilin Zhang, S. Dubnov, Gus G. Xia, Wei Li
With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short-sequence generation, symbolic music generation remains a challenging problem, since the structure of compositions is usually complicated. In this study, we attempt to solve the melody generation problem constrained by a given chord progression. In particular, we explore the effect of explicit architectural encoding of musical structure by comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (a dilated temporal CNN). As far as we know, this is the first study applying WaveNet to symbolic music generation, as well as the first systematic comparison between temporal CNNs and RNNs for music generation. We conducted a survey to evaluate the generated music and applied the Variable Markov Oracle for music pattern discovery. Experimental results show that encoding structure more explicitly, using a stack of dilated convolution layers, improves performance significantly, and that globally encoding the underlying chord progression into the generation procedure yields even larger gains.
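As a sketch of the two model families compared (not the paper's exact architectures), the snippet below builds an LSTM and a WaveNet-style causal stack of dilated 1-D convolutions over melody-token embeddings; all dimensions are invented:

```python
# Sketch of the two sequence models compared above, on token embeddings:
# an LSTM and a WaveNet-style stack of dilated 1-D convolutions whose
# receptive field doubles with each layer. Dimensions are illustrative.
import torch
import torch.nn as nn

VOCAB, EMB, HID = 130, 64, 64  # e.g., melody tokens; sizes are placeholders

class DilatedStack(nn.Module):
    """Causal temporal CNN: dilations 1, 2, 4, 8 over the melody sequence."""
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv1d(EMB if d == 1 else HID, HID, kernel_size=2, dilation=d)
            for d in (1, 2, 4, 8)
        ])

    def forward(self, x):           # x: (batch, EMB, time)
        for conv in self.layers:
            pad = conv.dilation[0]  # left-pad so the model stays causal
            x = torch.relu(conv(nn.functional.pad(x, (pad, 0))))
        return x                    # (batch, HID, time)

emb = nn.Embedding(VOCAB, EMB)
tokens = torch.randint(0, VOCAB, (4, 32))  # (batch, time)

cnn_out = DilatedStack()(emb(tokens).transpose(1, 2))
lstm_out, _ = nn.LSTM(EMB, HID, batch_first=True)(emb(tokens))
print(cnn_out.shape, lstm_out.shape)
# Chord conditioning could be added by concatenating a chord embedding
# to the input at each time step before either model.
```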
{"title":"The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation","authors":"K. Chen, Weilin Zhang, S. Dubnov, Gus G. Xia, Wei Li","doi":"10.1109/MMRP.2019.00022","DOIUrl":"https://doi.org/10.1109/MMRP.2019.00022","url":null,"abstract":"With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short sequence generation, symbolic music generation remains a challenging problem since the structure of compositions are usually complicated. In this study, we attempt to solve the melody generation problem constrained by the given chord progression. In particular, we explore the effect of explicit architectural encoding of musical structure via comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (dilated temporal-CNN). As far as we know, this is the first study of applying WaveNet to symbolic music generation, as well as the first systematic comparison between temporal-CNN and RNN for music generation. We conduct a survey for evaluation in our generations and implemented Variable Markov Oracle in music pattern discovery. Experimental results show that to encode structure more explicitly using a stack of dilated convolution layers improved the performance significantly, and a global encoding of underlying chord progression into the generation procedure gains even more.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"220 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128850254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semantic Web Technology for New Experiences Throughout the Music Production-Consumption Chain
Pub Date: 2019-01-01 | DOI: 10.1109/MMRP.2019.8665378
M. Sandler, D. De Roure, S. Benford, Kevin R. Page
The FAST project (Fusing Audio and Semantic Technology for Intelligent Music Production and Consumption), with five years of UK funding, has sought to create a new musical ecosystem that empowers all manner of people, from professional performers to casual listeners, to engage in new, more creative, immersive, and dynamic musical experiences. Realising this requires a step-change in digital music technologies. Going beyond today's digital sound files, future experiences will demand far richer musical information, whereby music content is packaged in a flexible, structured way that combines audio recordings with rich, layered metadata to support interactive and adaptive musical experiences. This defines the overall ambition of FAST: to lay the foundations for a new generation of ‘semantic audio’ technologies that underpin diverse future music experiences. This paper therefore aims to describe the overall vision of the project, set out the broad landscape in which it works, highlight some key results, and show how they bring out a central notion of FAST, that of Digital Music Objects: flexible constructs consisting of recorded music essence coupled with rich, semantic, linked metadata.
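As a toy illustration of the Digital Music Object notion (audio essence plus layered, linked semantic metadata), the sketch below attaches a few RDF statements to an audio resource with rdflib; the URIs and properties are invented and are not the FAST project's actual data model:

```python
# Toy illustration of a "Digital Music Object": an audio resource
# coupled with layered, linked semantic metadata. Uses rdflib with
# invented URIs; not the FAST project's actual data model.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/dmo/")  # hypothetical namespace

g = Graph()
dmo = URIRef(EX["object/1"])
g.add((dmo, RDF.type, EX.DigitalMusicObject))
g.add((dmo, EX.audio, URIRef("http://example.org/audio/take1.wav")))
# Layered metadata: editorial, structural, and performance annotations
# can all hang off the same object and link to one another.
g.add((dmo, EX.title, Literal("Take 1")))
g.add((dmo, EX.segment, Literal("chorus: 42.0s-71.5s")))

print(g.serialize(format="turtle"))
```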
{"title":"Semantic Web Technology for New Experiences Throughout the Music Production-Consumption Chain","authors":"M. Sandler, D. De Roure, S. Benford, Kevin R. Page","doi":"10.1109/MMRP.2019.8665378","DOIUrl":"https://doi.org/10.1109/MMRP.2019.8665378","url":null,"abstract":"The FAST project (Fusing Audio and Semantic Technology for Intelligent Music Production and Consumption) with 5 years of UK funding, has sought to create a new musical ecosystem that empowers all manner of people, from professional performers to casual listeners, to engage in new, more creative, immersive and dynamic musical experiences. Realising this requires a step-change in digital music technologies. Going beyond today's digital sound files, future experiences will demand far richer musical information, whereby music content will be packaged in a flexible, structured way that combines audio recordings with rich, layered metadata to support interactive and adaptive musical experiences. This defines the overall ambition of FAST-to lay the foundations for a new generation of ‘semantic audio’ technologies that underpin diverse future music experiences. This paper therefore aims to describe the overall vision of the project, set out the broad landscape in which it is working, highlight some key results and show how they bring out a central notion of FAST, that of Digital Music Objects, which are flexible constructs consisting of recorded music essence coupled with rich, semantic, linked metadata.","PeriodicalId":441469,"journal":{"name":"2019 International Workshop on Multilayer Music Representation and Processing (MMRP)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114993459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}