A. Copiaco, C. Ritz, Stefano Fasciani, N. Abdulaziz
2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2020-12-09. DOI: https://doi.org/10.1109/ISSPIT51521.2020.9408643
Identifying Sound Source Node Locations Using Neural Networks Trained with Phasograms
This work examines how well neural networks can approximate the location of a sound source. Most related work either omits the phase information from the Short-Time Fourier Transform (STFT) or uses it solely to restore irregularities in spectrograms. Our process differs in that it focuses on the phase component of the STFT coefficients, estimating the sound source location by classifying the closest microphone array (node). Mapping the phase-difference information in the time-frequency domain produces images that we call phasograms, which are used as inputs to the neural network. Experiments use recordings from the first four nodes of the SINS database. Phase differences across adjacent microphones, as well as against the first microphone, were examined. Under five-fold cross-validation, these yielded F1-scores of 99.68% for the former and 99.31% for the latter. A real-world application of our work is healthcare monitoring systems, when integrated with a sound scene classification system.
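To make the two phase-difference variants concrete, the following Python sketch computes time-frequency maps of inter-microphone phase differences, either across adjacent microphones or against the first microphone. It is not the authors' implementation: the function names `stft` and `phase_differences`, the Hann window, the frame length and hop size, and the use of only the standard library are assumptions made for illustration. How the values are rendered as phasogram images and how the network is built are not covered here.

```python
import cmath
import math


def stft(signal, frame_len=32, hop=16):
    """Naive STFT with a periodic Hann window.

    Returns a list of frames, each a list of complex values for the
    non-negative frequency bins 0 .. frame_len // 2.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(n_bins)
        ])
    return frames


def phase_differences(channels, mode="adjacent", frame_len=32, hop=16):
    """Phase-difference maps (frames x bins) for each microphone pair.

    mode="adjacent": pairs (0, 1), (1, 2), ...
    mode="first":    pairs (0, 1), (0, 2), ...
    Each value is the phase of channel i minus the phase of channel j,
    wrapped to [-pi, pi].
    """
    spectra = [stft(ch, frame_len, hop) for ch in channels]
    if mode == "adjacent":
        pairs = [(i, i + 1) for i in range(len(channels) - 1)]
    elif mode == "first":
        pairs = [(0, j) for j in range(1, len(channels))]
    else:
        raise ValueError("mode must be 'adjacent' or 'first'")

    maps = []
    for i, j in pairs:
        # angle(X_i * conj(X_j)) is the wrapped phase difference per bin.
        maps.append([
            [cmath.phase(a * b.conjugate()) for a, b in zip(frame_i, frame_j)]
            for frame_i, frame_j in zip(spectra[i], spectra[j])
        ])
    return maps


if __name__ == "__main__":
    # Synthetic three-microphone example: the same tone, delayed by
    # 0, 1, and 2 samples (hypothetical data, not from the SINS database).
    n = 128
    mics = [[math.cos(2 * math.pi * 4 * (t - d) / 32) for t in range(n)]
            for d in (0, 1, 2)]
    adjacent = phase_differences(mics, mode="adjacent")
    first = phase_differences(mics, mode="first")
    print(len(adjacent), "adjacent-pair maps,", len(first), "first-mic maps")
```

In practice a numerical library would replace the naive transform. The point of the sketch is only the pairing logic: both variants yield one map per microphone pair, and they differ in which microphone serves as the reference.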