Deep Music Information Dynamics Novel Framework for Reduced Neural-Network Music Representation with Applications to Midi and Audio Analysis and Improvisation
Authors: S. Dubnov, K. Chen, Kevin Huang
Journal: Journal of Creative Music Systems
Published: 2022-04-30
DOI: 10.5920/jcms.894 (https://doi.org/10.5920/jcms.894)
Citations: 1
Abstract
Generative musical models often comprise multiple levels of structure, presuming that the process of composition moves between background and foreground, or between generating the musical surface and some deeper, reduced representation that governs hidden or latent dimensions of music. In this paper we use a recently proposed framework called Deep Musical Information Dynamics (DMID) to explore the information content of deep neural models of music through rate reduction of latent representation streams, which is contrasted with the high-rate information dynamics of the musical surface. This approach is partially motivated by rate-distortion theories of human cognition, providing a framework for exploring possible relations between imaginary anticipations existing in the listener's or composer's mind and the information dynamics of the sensory (acoustic) or symbolic score data. The DMID framework is demonstrated in several experiments with symbolic (MIDI) and acoustic (spectral) music representations. We use variational encoding to learn a latent representation of the musical surface. This embedding is further reduced, using a bit-allocation method, into a second stream of low bit-rate encoding. The combined loss includes temporal information, in terms of the predictive properties of each encoding stream, and an accuracy loss measured as the mutual information between the low-rate encoding and the high-rate surface representation. For the case of counterpoint, we also study the mutual information between two voices in a musical piece at different levels of information reduction. The DMID framework allows us to explore aspects of computational creativity by juxtaposing the latent/imaginary surprisal of deeper structure with musical surprisal at the surface level, in a manner that is quantifiable and computationally tractable. The relevant information-theoretic modeling and analysis methods are discussed in the paper, suggesting that a trade-off between compression and prediction plays an important role in the analysis and design of creative musical systems.
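To make the bit-allocation step more concrete, the following is a minimal sketch of reverse water-filling, a standard rate-distortion scheme for independent Gaussian components. The abstract does not specify which bit-allocation method the authors use, so this is an illustrative assumption rather than their implementation. The function name `reverse_water_filling` and its parameters are hypothetical, and the latent dimensions are assumed to be independent Gaussians with known, strictly positive variances.

```python
import numpy as np

def reverse_water_filling(variances, total_bits, iters=100):
    """Allocate `total_bits` across independent Gaussian components.

    Hypothetical helper, not taken from the paper. Assumes strictly
    positive variances and total_bits >= 0.
    Returns (bits, distortions): the rate in bits given to each
    component and the mean-squared distortion it incurs at that rate.
    """
    variances = np.asarray(variances, dtype=float)
    # Search for the "water level" theta. The total rate decreases
    # monotonically as theta rises, so bisection applies.
    lo, hi = 1e-12, float(variances.max())
    for _ in range(iters):
        theta = np.sqrt(lo * hi)  # bisect on a logarithmic scale
        bits = np.maximum(0.0, 0.5 * np.log2(variances / theta))
        if bits.sum() > total_bits:
            lo = theta  # over budget: raise the water level
        else:
            hi = theta  # within budget: try a lower water level
    theta = hi  # hi never exceeds the bit budget
    bits = np.maximum(0.0, 0.5 * np.log2(variances / theta))
    # Components whose variance is below theta receive zero bits and
    # are reproduced by their mean, so their distortion is the variance.
    distortions = np.minimum(theta, variances)
    return bits, distortions

# Example: three latent dimensions sharing a budget of 2 bits per frame.
bits, distortions = reverse_water_filling([4.0, 1.0, 0.25], total_bits=2.0)
```

Under these assumptions, high-variance latent dimensions receive most of the bits while dimensions whose variance falls below the water level are dropped entirely, which is one way a high-rate latent stream could be reduced to a low bit-rate one.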