{"title":"用神经网络转录钢琴复调音乐","authors":"M. Marolt","doi":"10.1109/MELCON.2000.879982","DOIUrl":null,"url":null,"abstract":"This paper presents our experiences in building a system for transcription of polyphonic piano music. By transcription we mean the conversion of an audio recording of a polyphonic piano performance to a series of notes and their starting times. Our final goal is to build a transcription system that would transcribe polyphonic piano music over the entire piano range and with large polyphony. The system consists of three main stages. We first use a cochlear model based on the gammatone filterbank to transform an audio signal of a piano performance into time-frequency space. In the second stage we use a network of coupled adaptive oscillators to extract partial tracks from the output of the cochlear model and in the third stage we employ artificial neural networks acting as pattern recognisers to extract notes from the output of the oscillator network. The system uses several networks each trained to recognize the occurrence of a specific note in the input signal.","PeriodicalId":151424,"journal":{"name":"2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. No.00CH37099)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"55","resultStr":"{\"title\":\"Transcription of polyphonic piano music with neural networks\",\"authors\":\"M. Marolt\",\"doi\":\"10.1109/MELCON.2000.879982\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents our experiences in building a system for transcription of polyphonic piano music. By transcription we mean the conversion of an audio recording of a polyphonic piano performance to a series of notes and their starting times. 
Our final goal is to build a transcription system that would transcribe polyphonic piano music over the entire piano range and with large polyphony. The system consists of three main stages. We first use a cochlear model based on the gammatone filterbank to transform an audio signal of a piano performance into time-frequency space. In the second stage we use a network of coupled adaptive oscillators to extract partial tracks from the output of the cochlear model and in the third stage we employ artificial neural networks acting as pattern recognisers to extract notes from the output of the oscillator network. The system uses several networks each trained to recognize the occurrence of a specific note in the input signal.\",\"PeriodicalId\":151424,\"journal\":{\"name\":\"2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. No.00CH37099)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-05-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"55\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. No.00CH37099)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MELCON.2000.879982\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. 
No.00CH37099)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MELCON.2000.879982","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Transcription of polyphonic piano music with neural networks
This paper presents our experience in building a system for transcription of polyphonic piano music. By transcription we mean the conversion of an audio recording of a polyphonic piano performance into a series of notes and their starting times. Our ultimate goal is to build a transcription system that transcribes polyphonic piano music over the entire piano range and with large polyphony. The system consists of three main stages. We first use a cochlear model based on the gammatone filterbank to transform an audio signal of a piano performance into time-frequency space. In the second stage we use a network of coupled adaptive oscillators to extract partial tracks from the output of the cochlear model, and in the third stage we employ artificial neural networks acting as pattern recognizers to extract notes from the output of the oscillator network. The system uses several networks, each trained to recognize the occurrence of a specific note in the input signal.
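To make the first (cochlear-model) stage concrete, the following is an illustrative sketch, not the paper's implementation: a minimal gammatone filterbank that maps an audio signal into time-frequency space. The impulse-response form and ERB bandwidth formula are the standard ones (Glasberg and Moore); the channel count, frequency range, and filter length are assumptions chosen here for demonstration.

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth (Glasberg & Moore) at frequency f Hz."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of one gammatone channel centred at fc Hz."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # bandwidth scaling commonly paired with order 4
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))

def gammatone_filterbank(signal, fs, fmin=55.0, fmax=4200.0, n_channels=24):
    """Return centre frequencies and an (n_channels, len(signal)) output."""
    # Log-spaced centre frequencies covering (roughly) the piano range.
    fcs = np.geomspace(fmin, fmax, n_channels)
    out = np.empty((n_channels, len(signal)))
    for i, fc in enumerate(fcs):
        out[i] = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
    return fcs, out

# A pure 440 Hz tone should excite the channel nearest 440 Hz most strongly.
fs = 16000
tone = np.sin(2 * np.pi * 440.0 * np.arange(fs) / fs)
fcs, tf = gammatone_filterbank(tone, fs)
strongest = fcs[np.argmax(np.sqrt((tf ** 2).mean(axis=1)))]
```

The per-channel RMS energy of the filterbank output plays the role of the time-frequency representation that the oscillator network (stage two) would consume; a real system would use many more channels and a proper IIR gammatone implementation rather than direct convolution.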