Identifying Sound Source Node Locations Using Neural Networks Trained with Phasograms

A. Copiaco, C. Ritz, Stefano Fasciani, N. Abdulaziz
Published in: 2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)
Publication date: 2020-12-09
DOI: 10.1109/ISSPIT51521.2020.9408643 (https://doi.org/10.1109/ISSPIT51521.2020.9408643)
Citations: 1

Abstract

This work examines how well neural networks can approximate the location of a sound source. Most related work either omits the phase information of the Short-Time Fourier Transform (STFT) or uses it solely to restore irregularities in spectrograms. Our approach differs in that it focuses on the phase component of the STFT coefficients, estimating the sound source location by classifying the closest microphone array (node). Mapping the phase-difference information in the time-frequency domain yields images that we call phasograms, which are used as inputs to the neural network. Experiments use recordings from the first four nodes of the SINS database. Two variants were examined: phase differences across adjacent microphones, and phase differences against the first microphone. Under five-fold cross-validation, these achieved F1-scores of 99.68% and 99.31%, respectively. A real-world application of this work is healthcare monitoring, when integrated with a sound scene classification system.
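The abstract describes phasograms only at a high level, so the following is a minimal sketch of one way to map inter-microphone phase differences onto the time-frequency plane, not the authors' implementation. The function names `stft` and `phasogram`, the Hann window, and the `frame_len` and `hop` values are all assumptions made for illustration; the abstract does not state the analysis parameters or how the maps are rendered into images.

```python
import numpy as np


def stft(x, frame_len=256, hop=128):
    """Hann-windowed STFT of a 1-D signal; returns (freq_bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def phasogram(channels, mode="adjacent", frame_len=256, hop=128):
    """Phase-difference maps between microphone pairs (illustrative only).

    channels: array of shape (n_mics, n_samples)
    mode:     "adjacent" compares mic k with mic k+1,
              "first" compares mic 0 with every other mic
    Returns an array of shape (n_mics - 1, freq_bins, frames), in radians.
    """
    spectra = np.stack([stft(ch, frame_len, hop) for ch in channels])
    if mode == "adjacent":
        ref, other = spectra[:-1], spectra[1:]
    elif mode == "first":
        ref, other = spectra[:1], spectra[1:]  # broadcasts over the pairs
    else:
        raise ValueError("mode must be 'adjacent' or 'first'")
    # The angle of ref * conj(other) is phase(ref) - phase(other),
    # wrapped to the principal range [-pi, pi].
    return np.angle(ref * np.conj(other))


# Hypothetical input: a 1 kHz tone reaching each of four microphones
# one sample later than the previous one (16 kHz sampling rate assumed).
fs, f0 = 16000, 1000.0
n = np.arange(4096)
demo_channels = np.stack(
    [np.sin(2 * np.pi * f0 * (n - m) / fs) for m in range(4)]
)
adjacent_maps = phasogram(demo_channels, mode="adjacent")
first_mic_maps = phasogram(demo_channels, mode="first")
```

Each slice along the first axis is one time-frequency phase-difference map. Stacking the slices as image channels is one plausible way to feed them to a classifier, but that choice is likewise an assumption here.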