ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Articles
Learning from the Best: A Teacher-student Multilingual Framework for Low-resource Languages
Deblin Bagchi, William Hartmann
The traditional method of pretraining neural acoustic models in low-resource languages consists of initializing the acoustic model parameters with a large, annotated multilingual corpus, which can be a drain on time and resources. In an attempt to reuse TDNN-LSTMs already pre-trained using multilingual training, we have applied Teacher-Student (TS) learning as a method of pretraining to transfer knowledge from a multilingual TDNN-LSTM to a TDNN. The pretraining time is reduced by an order of magnitude with the use of language-specific data during the teacher-student training. Additionally, the TS architecture allows us to leverage untranscribed data, previously untouched during supervised training. The best student TDNN achieves a WER within 1% of the teacher TDNN-LSTM performance and shows consistent improvement in recognition over TDNNs trained using the traditional pipeline across all the evaluation languages. Switching to TDNN from TDNN-LSTM also allows sub-real-time decoding.
DOI: https://doi.org/10.1109/ICASSP.2019.8683491 | Pages: 6051-6055 | Published: 2019-05-13
Citations: 4
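The teacher-student transfer described in the Bagchi and Hartmann abstract can be illustrated with a frame-level distillation loss. What follows is a minimal sketch, assuming the student is trained to match the teacher's output posteriors under a KL-divergence criterion; the abstract does not state which loss was actually used, and the function names and toy logits are hypothetical. Because the targets come from the teacher rather than from transcripts, the same loss can be computed on untranscribed audio.

```python
import math

def softmax(logits):
    """Convert one frame of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ts_loss(teacher_logits, student_logits):
    """Mean per-frame KL(teacher || student); no transcripts are needed."""
    total = 0.0
    for t_frame, s_frame in zip(teacher_logits, student_logits):
        p = softmax(t_frame)  # soft targets from the frozen teacher
        q = softmax(s_frame)  # student predictions for the same frame
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(teacher_logits)

# Hypothetical toy data: 2 frames, 3 output classes.
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student = [[1.0, 1.0, 0.0], [0.0, 0.5, 0.5]]

loss_mismatch = ts_loss(teacher, student)  # positive: student disagrees with teacher
loss_match = ts_loss(teacher, teacher)     # zero: identical posteriors
```

In a real system the gradient of this loss would update only the student; the teacher's outputs act as fixed targets.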
On Radar Privacy in Shared Spectrum Scenarios
Anastasios Dimas, Matthew A. Clark, Bo Li, K. Psounis, A. Petropulu
To satisfy the increasing demand for additional bandwidth from the wireless sector, regulatory bodies are considering allowing commercial wireless systems to operate on spectrum bands that until recently were reserved exclusively for military radar. Such co-existence would require mechanisms for controlling interference. One such mechanism is to assign a precoder to the communication system, which is designed to minimize the communication system’s interference to the radar. This paper looks into whether the implicit radar information contained in such a precoder can be exploited by an adversary to infer the radar’s location. For two specific precoder schemes, we simulate a machine-learning-based location inference attack. We show that the system information leaked through the precoder can indeed pose various degrees of risk to the radar’s privacy, and further confirm this by computing the mutual information between the respective precoder and the radar location.
DOI: https://doi.org/10.1109/ICASSP.2019.8682745 | Pages: 7790-7794 | Published: 2019-05-12
Citations: 12
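The mutual-information check mentioned in the radar privacy abstract can be illustrated on discrete data. The sketch below is a plug-in estimate from paired samples, assuming both the precoder and the radar location have been quantized to labels; the labels (`P0`, `cellA`, and so on) and the sample sets are hypothetical, and the abstract does not say which estimator the authors used.

```python
import math
from collections import Counter

def mutual_information_bits(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Hypothetical case 1: the precoder label fully determines the radar cell.
leaky = [("P0", "cellA"), ("P1", "cellB")] * 50
# Hypothetical case 2: precoder label and radar cell are independent.
blind = [("P0", "cellA"), ("P0", "cellB"), ("P1", "cellA"), ("P1", "cellB")] * 25

mi_leaky = mutual_information_bits(leaky)
mi_blind = mutual_information_bits(blind)
```

A value near the entropy of the location variable means an observer of the precoder learns almost everything about the location; a value near zero means the precoder reveals little.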
Deep Learning Propagation Models over Irregular Terrain
Mónica Ribero, R. Heath, H. Vikalo, D. Chizhik, R. Valenzuela
Accurate path gain models are critical for coverage prediction and radio frequency (RF) planning in wireless communications. In many settings, irregular terrain induces blockages and scattering, making it difficult to predict the path gain. Current solutions are either computationally expensive or rely on slope-intercept fits that do not capture local deviations due to terrain variation, leading to large prediction errors. We propose to use machine learning to learn path gain with terrain elevation as features. We implement different neural network architectures with dense and convolutional layers that could include effects difficult to describe with traditional models (e.g. back scatter). We test our framework on an extensive set of measured path gain data and consistently predict with 5 dB Root Mean Squared Error, an 8 dB improvement over traditional slope-intercept solutions.
DOI: https://doi.org/10.1109/ICASSP.2019.8682491 | Pages: 4519-4523 | Published: 2019-05-12
Citations: 14
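The slope-intercept baseline named in the propagation abstract can be sketched as a least-squares line in log-distance. This is an illustrative sketch only, assuming path gain in dB is modeled as intercept + slope * log10(distance); the distances and gain values are made up, and the second data set adds a hypothetical 15 dB terrain blockage to show the kind of local deviation a single line cannot follow.

```python
import math

def fit_slope_intercept(distances_m, gains_db):
    """Least-squares fit of gain_db = intercept + slope * log10(distance)."""
    x = [math.log10(d) for d in distances_m]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(gains_db) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, gains_db))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse_db(distances_m, gains_db, slope, intercept):
    """Root mean squared error of the fitted line, in dB."""
    sq = [(intercept + slope * math.log10(d) - g) ** 2
          for d, g in zip(distances_m, gains_db)]
    return math.sqrt(sum(sq) / len(sq))

distances = [10.0, 100.0, 1000.0]

# Hypothetical flat-terrain data lying exactly on -30 - 40 * log10(d).
flat = [-70.0, -110.0, -150.0]
slope_flat, icpt_flat = fit_slope_intercept(distances, flat)
rmse_flat = rmse_db(distances, flat, slope_flat, icpt_flat)

# Same links, but the middle one is shadowed by terrain (15 dB extra loss).
blocked = [-70.0, -125.0, -150.0]
slope_blk, icpt_blk = fit_slope_intercept(distances, blocked)
rmse_blocked = rmse_db(distances, blocked, slope_blk, icpt_blk)
```

The residual on the blocked data is the error that a terrain-aware model would try to reduce by taking elevation profiles as input.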
Improvements to the Matching Projection Decoding Method for Ambisonic System with Irregular Loudspeaker Layouts
Zhongshu Ge, Xihong Wu, T. Qu
The Ambisonic technique has recently been widely used for sound field recording and reproduction. However, the basic Ambisonic decoding method breaks down when the playback loudspeakers are distributed unevenly. Various methods have been proposed to solve this problem. This paper introduces several improvements to a recently proposed Ambisonic decoding method, the matching projection method, for uneven loudspeaker layouts. The first improvement is energy preservation; the second is the introduction of the "in-phase" weight; and the third is the introduction of partial projection coefficients. To evaluate the improved method, we compared it with the original one and with the all-round Ambisonic decoding method using a two-dimensional, unevenly arranged loudspeaker array. The results show that our method greatly improves on the original method where the loudspeakers are arranged very sparsely or densely.
DOI: https://doi.org/10.1109/ICASSP.2019.8683105 | Pages: 121-125 | Published: 2019-05-12
Citations: 4
Zero-mean Convolutional Network with Data Augmentation for Sound Level Invariant Singing Voice Separation
Kin Wah Edward Lin, Masataka Goto
We address an issue of separating singing voices from polyphonic music signals regardless of sound level variance of the mixture input. Using a standard separation quality assessment tool BSS Eval 4.0, we found that the separation quality of a singing voice separation (SVS) system based on a dilatable Convolutional Neural Network (CNN) decreases under different sound levels. Even if this SVS system is comparable to state-of-the-art SVS systems, it is vulnerable to the issue of sound level variance. We therefore investigate four methods of making the CNN-based SVS system invariant to different sound levels — two types of data augmentation, frame normalization, and zero-mean convolution. By testing all 15 combinations of the four methods, we found that all combinations can improve the sound level invariance and analyzed the best combinations. To the best of our knowledge, this is the first SVS work systematically investigating sound level variance.
DOI: https://doi.org/10.1109/ICASSP.2019.8682958 | Pages: 251-255 | Published: 2019-05-12
Citations: 2
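One way to see why a zero-mean convolution could help with sound level invariance, as named in the Lin and Goto abstract, is that a kernel whose weights sum to zero cancels any constant added to its input. The sketch below assumes the network input is a log-magnitude representation, so that a gain change becomes an additive offset; that assumption, the helper names, and the toy values are all hypothetical, and the paper's exact formulation may differ.

```python
def zero_mean(kernel):
    """Subtract the kernel mean so the weights sum to zero."""
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def conv1d_valid(signal, kernel):
    """Sliding dot product (cross-correlation, as in CNN layers), 'valid' mode."""
    width = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(width))
            for i in range(len(signal) - width + 1)]

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

log_spec = [0.2, 1.1, 0.7, 2.4, 1.9, 0.3]      # hypothetical log-magnitude values
gain_offset = 3.0                               # louder mix: constant shift in the log domain
louder = [v + gain_offset for v in log_spec]

raw_kernel = [0.5, 1.0, 0.25]
zm_kernel = zero_mean(raw_kernel)

# Ordinary kernel: the output shifts with the input level.
diff_raw = max_abs_diff(conv1d_valid(log_spec, raw_kernel),
                        conv1d_valid(louder, raw_kernel))
# Zero-mean kernel: the level offset cancels out.
diff_zm = max_abs_diff(conv1d_valid(log_spec, zm_kernel),
                       conv1d_valid(louder, zm_kernel))
```

The cancellation only holds for additive offsets, which is why the choice of input domain matters in this sketch.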
Tensor-based Estimation of mmWave MIMO Channels with Carrier Frequency Offset
Lucas N. Ribeiro, A. Almeida, Nitin Jonathan Myers, R. Heath
Millimeter wave multiple-input-multiple-output (MIMO) achieves the best performance when reliable channel state information is used to design the beams. Most channel estimation methods proposed in the literature, however, ignore practical hardware impairments such as carrier frequency offset (CFO) and may fail under such impairment. In this paper, we present a joint CFO and channel estimation method based on tensor modeling and compressed sensing. Simulation results indicate that the proposed method yields better channel recovery performance than the benchmark and that it is more robust to a small number of channel measurements.
DOI: https://doi.org/10.1109/ICASSP.2019.8683496 | Pages: 4155-4159 | Published: 2019-05-12
Citations: 7
Auralization of Omnidirectional Room Impulse Responses Based on the Spatial Decomposition Method and Synthetic Spatial Data
J. Ahrens
The spatial decomposition method decomposes acoustic room impulse responses into a pressure signal and a direction of arrival for each time instant of the pressure signal. An acoustic space can be auralized by distributing the pressure signal over the available loudspeakers or head-related transfer functions so that the required instantaneous propagation direction is recreated. We present a user study that demonstrates, based on binaural auralization, that the arrival directions can be synthesized from random data such that the auralization is nearly indistinguishable from the auralization of the original data. The presented concept constitutes the foundation of a highly scalable spatialization method for omnidirectional room impulse responses.
DOI: https://doi.org/10.1109/ICASSP.2019.8683661 | Pages: 146-150 | Published: 2019-05-12
Citations: 11
Factors Affecting ENF Based Time-of-recording Estimation for Video
Saffet Vatansever, A. Dirik, N. Memon
ENF (Electric Network Frequency) oscillates around a nominal value (50/60 Hz) due to the imbalance between consumed and generated power. The intensity of a light source powered by mains electricity varies depending on the ENF fluctuations. These fluctuations can be extracted from videos recorded in the presence of mains-powered source illumination. This work investigates how the quality of the ENF signal estimated from video is affected by different light source illumination, compression ratios, and social media encoding. Also explored is the effect of the length of the ENF ground-truth database on time-of-recording detection and verification.
DOI: https://doi.org/10.1109/ICASSP.2019.8682419 | Pages: 2497-2501 | Published: 2019-05-12
Citations: 20
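Time-of-recording estimation from ENF, as discussed in the Vatansever et al. abstract, is commonly framed as finding where a short ENF trace best matches a long ground-truth record. The sketch below assumes a simple sliding Pearson-correlation search; the abstract does not specify the matching criterion, and the synthetic traces, noise pattern, and function names are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def best_offset(query, reference):
    """Return (offset, correlation) of the reference window that best matches query."""
    best = (0, -2.0)
    for off in range(len(reference) - len(query) + 1):
        r = pearson(query, reference[off:off + len(query)])
        if r > best[1]:
            best = (off, r)
    return best

# Hypothetical ground-truth database: one ENF value per second, drifting near 50 Hz.
rng = random.Random(7)
reference = [50.0]
for _ in range(299):
    reference.append(reference[-1] + rng.gauss(0.0, 0.005))

# Hypothetical trace extracted from a video: a 60 s slice with small estimation error.
true_offset = 120
query = [f + 0.0002 * ((i * 7) % 5 - 2)
         for i, f in enumerate(reference[true_offset:true_offset + 60])]

offset, score = best_offset(query, reference)
```

A longer ground-truth database enlarges the search range in `best_offset`, which is one reason database length can affect how reliably the correct offset stands out.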
A Time-frequency Based Multivariate Phase-amplitude Coupling Measure
T. T. Munia, Selin Aviyente
Interaction of neuronal oscillations across different frequency bands plays an important role in perception, attention, and memory. One particular form of interaction is the modulation of the amplitude of high-frequency oscillations by the phase of low-frequency oscillations, known as phase-amplitude coupling (PAC). Current methods for quantifying PAC mostly rely on the Hilbert transform, which assumes that brain activity is stationary and narrowband. Moreover, these methods are limited to quantifying bivariate PAC and cannot capture multivariate cross-frequency coupling between different brain regions. This paper presents a new complex time-frequency based high-resolution PAC measure and its extension to the multivariate case using the PARAFAC (Parallel Factor) model. The proposed approach is evaluated on both simulated and real electroencephalogram (EEG) data.
DOI: https://doi.org/10.1109/ICASSP.2019.8682966 | Pages: 1095-1099 | Published: 2019-05-12
Citations: 3
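For context on the bivariate PAC measures that the Munia and Aviyente abstract contrasts itself with, a commonly used index is the mean vector length of the amplitude-weighted phase. The sketch below is not the paper's time-frequency measure; it assumes the low-frequency phase and high-frequency amplitude series have already been extracted (for example with a Hilbert transform), and the synthetic signals are hypothetical.

```python
import cmath
import math

def mean_vector_length(phases, amplitudes):
    """|mean(A_t * exp(i * phi_t))|: grows when amplitude depends on phase."""
    total = sum(a * cmath.exp(1j * p) for p, a in zip(phases, amplitudes))
    return abs(total) / len(phases)

n = 1000
# Low-frequency phase sampled uniformly over one full cycle.
phases = [2.0 * math.pi * k / n for k in range(n)]

# Coupled case: high-frequency amplitude peaks at phase 0.
coupled_amp = [1.0 + math.cos(p) for p in phases]
# Uncoupled case: amplitude does not depend on phase.
flat_amp = [1.0] * n

mvl_coupled = mean_vector_length(phases, coupled_amp)
mvl_flat = mean_vector_length(phases, flat_amp)
```

The index compares one phase series with one amplitude series, which is the bivariate limitation the abstract points to.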
Combining Linear Spatial Filtering and Non-linear Parametric Processing for High-quality Spatial Sound Capturing
O. Thiergart, G. Milano, Emanuël Habets
Flexible spatial sound capturing and reproduction can be achieved with multiple microphones by using linear spatial filtering or non-linear parametric processing. The non-linear approaches usually provide a superior spatial resolution compared to the linear approaches but can result in artifacts due to violations of the sound field model. In this paper, we combine both approaches to achieve a high robustness against model violations and a high spatial resolution. We assume linear spatial filters that approximate the spatial responses of the desired output format and compensate remaining deviations with an optimal post filter. The post filter is computed such that the proposed approach behaves like a linear system when the spatial filters achieve the desired spatial response, and scales towards a non-linear system otherwise. Experimental results show that the proposed approach can significantly reduce distortions of existing parametric processing schemes especially when a sufficiently high number of microphones is available.
DOI: https://doi.org/10.1109/ICASSP.2019.8683515 | Pages: 571-575 | Published: 2019-05-12
Citations: 3