Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631573
Micael Bernhardt, J. Cousseau
Upcoming wireless communication systems are expected to integrate far more nodes than current technologies, while offering noticeably improved service quality for critical applications. This creates a need for innovative schemes to share the available resources among the served terminals and to increase system efficiency and node fairness. To this end, we propose a combination of non-orthogonal multiple access and interference alignment schemes applied to downlink transmissions in a multi-cell environment. The two methods presented in this work enable efficient reuse of resources and the suppression of intra- and inter-cell interference in a single step during signal reception. We derive expressions for the feasibility of the proposed solution from an analysis of generic system configurations. Additionally, we show numerical results that highlight the benefits of this scheme in a system setup resembling an Internet-of-Things scenario.
{"title":"On Interference Alignment Based NOMA for Downlink Multicell Transmissions","authors":"Micael Bernhardt, J. Cousseau","doi":"10.1109/ICDSP.2018.8631573","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631573","url":null,"abstract":"The upcoming wireless communication systems are expected to integrate a number of nodes remarkably greater than those observed in current technologies, while offering a sensibly improved service quality for critical applications. This generates a need for innovative schemes to share the available resources among the served terminals as well as to increase the system efficiency and node fairness. Aiming to this objective, we propose a combination of non-orthogonal multiple access and interference alignment schemes applied to the downlink transmissions in a multi-cell environment. The two methods presented in this work enable an efficient reutilization of resources and the suppression of intra-and inter-cell interference in a single step during signal reception. We derive the expressions for the feasibility of our proposed solution from an analysis applied to generic system configurations. Additionally, we show numerical results that highlight the benefits of this scheme in a system setup resembling an Internet-of-things scenario.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124276464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631883
Chu-Tak Li, W. Siu, D. Lun
This paper presents a key frame recognition algorithm that uses a novel offline feature-shifts approach and search window weights. We extract effective feature patches from key frames with the offline feature-shifts approach for real-time key frame recognition. We focus on practical situations in which blurring and viewpoint shifts occur in our dataset. We compare our method with conventional keypoint-based matching methods and the newest CNN features for scene recognition. The experimental results show that our method largely preserves key frame recognition performance compared with methods using an online feature-shifts approach. The proposed method tolerates more unmatched pairs, which is useful for decision making in real-time systems. Moreover, our method is robust to illumination changes and blurring. We achieve 90% accuracy on a nighttime sequence, whereas the CNN approach attains only 60%. Our method requires only 33.8 ms on average to match a frame on a regular desktop, which is 4 times faster than the CNN approach in CPU-only mode.
{"title":"Boosting the Performance of Scene Recognition via Offline Feature-Shifts and Search Window Weights","authors":"Chu-Tak Li, W. Siu, D. Lun","doi":"10.1109/ICDSP.2018.8631883","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631883","url":null,"abstract":"This paper presents a key frame recognition algorithm, using novel offline feature-shifts approach and search window weights. We extract effective feature patches from key frames with an offline feature-shifts approach for real-time key frame recognition. We focus on practical situations in which blurring and shifts in viewpoints occur in our dataset. We compare our method with some conventional keypoint-based matching methods and the newest CNN features for scene recognition. The experimental results illustrate that our method can reasonably preserve the performance in key frame recognition when comparing with methods using online feature-shifts approach. Our proposed method provides larger tolerance of unmatched pairs which is useful for decision making in real-time systems. Moreover, our method is robust to illumination and blurring. We achieve 90% accuracy in a nighttime sequence while CNN approach only attains 60% accuracy. Our method only requires 33.8 ms to match a frame on average using a regular desktop, which is 4 times faster than CNN approach with only CPU mode.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125895462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631853
Baoting Li, Longjun Liu, Jiahua Liang, Hongbin Sun, Li Geng, Nanning Zheng
Due to the ever-increasing number of connections and parameters in neural networks (NNs), NN computation is becoming both power hungry and memory intensive. In this paper, we propose a sparse neural network accelerator to improve memory resource utilization and power efficiency. In contrast to prior work, we introduce a highly integrated software and hardware co-design technique that combines resource-aware software compression algorithms with a specialized hardware inference engine in the accelerator. Compared with other designs, our design can compress parameters by 90× and substantially improve storage resource utilization, performance (6.9×) and power (1.2×) for NN accelerators.
{"title":"Exploring Resource-Aware Deep Neural Network Accelerator and Architecture Design","authors":"Baoting Li, Longjun Liu, Jiahua Liang, Hongbin Sun, Li Geng, Nanning Zheng","doi":"10.1109/ICDSP.2018.8631853","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631853","url":null,"abstract":"Due to the ever-increasing number of neural networks(NNs) connections and parameters, computation on neural networks is becoming both power hankering and memory intensive. In this paper, we propose a sparse neural networks accelerator to improve memory resource utilization and improve power efficiency. In contrast to prior works, we introduce a highly integrated software and hardware co-design technique that combines resource-aware software compression algorithms and specialized hardware inference engine in the accelerator. Compared with other designs, our design can compress parameters by 90× and substantially improve storage resource utilization, performance (6.9×) and power (1.2×) for NN accelerators.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129533341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631796
Jianguo Sun, Tianxu Sun, Ye Yuan, Xingjian Zhang, Yiqi Shi, Yun Lin
This paper proposes an automatic method for lesion localization and benign/malignant diagnosis in thyroid ultrasound images. FCN-AlexNet, a deep learning method, is used to segment the images and achieves accurate localization of thyroid nodules. Transfer learning is then introduced to address the shortage of training data. Because of its performance in classification, AlexNet is used to diagnose benign and malignant lesions. The localization performance of the TBD, RGI, PAORGB, and ASPS methods is comparatively evaluated using the IoU indicator, and the diagnostic accuracy of those methods is evaluated by Accuracy, Sensitivity, Specificity, and AUC. The experimental results show that the proposed method performs better in localization and in the diagnosis of benign and malignant lesions.
{"title":"Automatic Diagnosis of Thyroid Ultrasound Image Based on FCN-AlexNet and Transfer Learning","authors":"Jianguo Sun, Tianxu Sun, Ye Yuan, Xingjian Zhang, Yiqi Shi, Yun Lin","doi":"10.1109/ICDSP.2018.8631796","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631796","url":null,"abstract":"An automatic method applied to the thyroid ultrasound images for lesion localization and diagnosis of benign and malignant lesions was proposed in this paper. The FCN-AlexNet of deep learning method was used to segment images, and accurate localization of thyroid nodules was achieved. Then, the method of transfer learning was introduced to solve the problem of training data shortages during training process. According to the performance of AlexNet in classification, it was used to diagnose benign and malignant lesions. The localization effects of TBD, RGI, PAORGB, and ASPS methods were comparatively evaluated by IoU indicators, and the accuracy of benign and malignant diagnosis of those methods are evaluated by Accuracy, Sensitivity, Specificity, and AUC. The experimental results shown that the proposed method has better performance in localization and diagnosis of benign and malignant lesions.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128259879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
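The evaluation indicators named in the abstract above (IoU for localization; Accuracy, Sensitivity, and Specificity for diagnosis) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes localization is scored on axis-aligned bounding boxes `(x1, y1, x2, y2)` and that label 1 means malignant and 0 means benign. The function names `iou` and `diagnosis_metrics` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: localization is compared on bounding boxes; the paper may
    instead compare segmentation masks.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diagnosis_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels.

    Assumption: 1 = malignant (positive class), 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # true negative rate
    }
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` compares two 2×2 boxes that overlap in a unit square, and `diagnosis_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])` scores five hypothetical cases. AUC is omitted because it needs continuous classifier scores rather than hard labels.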
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631851
Jianqiang Lin, S. Chan
The estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm is a widely used subspace-based method for direction-of-arrival (DOA) estimation in array signal processing and spectral analysis. It requires estimating the signal subspaces of rotationally invariant sub-arrays of a sensor array, from which the DOAs can be obtained by solving an eigenvalue problem. This paper proposes a projection approximation subspace tracking (PAST)-based adaptive ESPRIT algorithm with a variable forgetting factor (VFF) and variable regularization (VR). The VFF and VR PAST algorithm is based on a recently proposed Locally Optimal FF (LOFF) scheme with improved convergence speed and steady-state error performance. Moreover, variable regularization is incorporated to reduce the estimation variance under ill-conditioning or low input signal levels. The proposed LOFF-VR adaptive ESPRIT method is also used to track the eigenvalues and hence the DOAs. Simulations show that the proposed LOFF-VR-ESPRIT algorithm outperforms conventional approaches in stationary and nonstationary environments, especially in the presence of signal fading.
{"title":"A New PAST-Based Adaptive ESPIRT Algorithm with Variable Forgetting Factor and Regularization","authors":"Jianqiang Lin, S. Chan","doi":"10.1109/ICDSP.2018.8631851","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631851","url":null,"abstract":"The estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm is a widely used subspace-based method for direction-of-arrival (DOA) estimation in array signal processing and spectral analysis. It requires the estimation of the signal subspaces of rotational invariance sub-arrays of a sensor array, from which the DOAs can be estimated by solving an eigenvalue problem. This paper proposes a projection approximation subspace tracking (PAST)-based adaptive ESPRIT algorithm with variable forgetting factor (VFF) and variable regularization (VR). The VFF and VR PAST algorithm is based on a recently proposed Locally Optimal FF (LOFF) scheme with improved convergence speed and steady state error performance. Moreover, variable regularization is incorporated to reduce the estimation variance during ill-conditioning or low input signal level. The proposed LOFF-VR adaptive ESPRIT method is also utilized for tracking the eigenvalues and hence the DOAs. 
Experimental simulations show that the proposed LOFF-VR-ESPRIT algorithm outperforms the conventional approaches in stationary and nonstationary environments, especially in the presence of signal fading.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131507419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
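To make the PAST-plus-ESPRIT pipeline described in the abstract above concrete, here is a deliberately simplified sketch. It is not the LOFF-VR algorithm of the paper: it assumes a single source, a half-wavelength uniform linear array, a fixed forgetting factor `beta` (no variable forgetting factor or regularization), and synthetic data. All function names are hypothetical.

```python
import cmath
import math
import random


def steering_vector(theta_deg, num_sensors):
    """Half-wavelength ULA response to a plane wave from theta_deg (broadside = 0)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(num_sensors)]


def past_rank1_update(w, d, x, beta):
    """One PAST step for a one-dimensional signal subspace.

    w: current subspace estimate, d: exponentially weighted power of y,
    x: new snapshot, beta: fixed forgetting factor in (0, 1] (an assumption;
    the paper adapts this factor).
    """
    y = sum(wm.conjugate() * xm for wm, xm in zip(w, x))  # y = w^H x
    d = beta * d + abs(y) ** 2
    gain = y.conjugate() / d
    # w <- w + e * conj(y) / d, with projection error e = x - w * y
    w = [wm + (xm - wm * y) * gain for wm, xm in zip(w, x)]
    return w, d


def esprit_single_source(w):
    """Least-squares ESPRIT for one source: phase shift between the two
    overlapping sub-arrays w[:-1] and w[1:], mapped to an angle in degrees."""
    num = sum(a.conjugate() * b for a, b in zip(w[:-1], w[1:]))
    den = sum(abs(a) ** 2 for a in w[:-1])
    phi = num / den
    return math.degrees(math.asin(cmath.phase(phi) / math.pi))


def track_doa(theta_deg=20.0, num_sensors=6, num_snapshots=200,
              beta=0.97, noise_std=0.05, seed=0):
    """Track the subspace of synthetic snapshots and return the DOA estimate."""
    rng = random.Random(seed)
    a = steering_vector(theta_deg, num_sensors)
    w = [1.0 + 0j] + [0j] * (num_sensors - 1)  # initial subspace guess
    d = 1.0
    for _ in range(num_snapshots):
        s = cmath.exp(2j * math.pi * rng.random())  # unit-modulus source sample
        x = [am * s + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
             for am in a]
        w, d = past_rank1_update(w, d, x, beta)
    return esprit_single_source(w)
```

Calling `track_doa(theta_deg=20.0)` returns an angle estimate in degrees for the simulated source. With more than one source, `w` becomes a matrix and the scalar `phi` becomes a small eigenvalue problem, which is the general ESPRIT step the abstract refers to.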
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631619
Jing Zhu, Ping Yang, Yue Xiao, Y. Guan, Shaoqian Li
In this paper, we investigate the benefits of link adaptation (LA) techniques for enhanced spatial modulation (ESM) based multiple-input multiple-output (MIMO) systems. Specifically, we first apply the power allocation (PA) technique to ESM and propose a novel PA algorithm, namely approximated maximum minimum distance (AMMD)-based PA, to improve the bit error rate (BER) performance. Then, we combine the transmit antenna selection (TAS) technique with the ESM scheme to overcome the constraint that the number of transmit antennas in ESM has to be a power of two, as well as to enhance its BER performance by exploiting spatial resources. Finally, to seek a higher BER performance gain, we consider the joint application of PA and TAS in ESM-MIMO systems. Our simulation results show that the proposed PA-ESM, TAS-ESM, and joint PA and TAS aided ESM schemes improve system performance compared with the conventional ESM scheme.
{"title":"Link Adaption aided Enhanced Spatial Modulation for MIMO Transmissions","authors":"Jing Zhu, Ping Yang, Yue Xiao, Y. Guan, Shaoqian Li","doi":"10.1109/ICDSP.2018.8631619","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631619","url":null,"abstract":"In this paper, we investigate the benefits of the link adaptation (LA) techniques for enhanced spatial modulation (ESM) based multiple-input multiple-output (MIMO) systems. To be specific, we first apply the power allocation (PA) technique to ESM and propose a novel PA algorithm, namely approximated maximum minimum distance (AMMD)-based PA, in order to improve the bit error rate (BER) performance. Then, we combine transmit antenna selection (TAS) technique and ESM scheme to overcome the constraint that the number of transmit antennas in ESM has to be a power of two as well as to enhance its BER performance by using the space resource. Finally, to seek higher BER performance gain, we consider the joint application of PA and TAS in ESM-MIMO systems. Our simulation results show that the proposed PA-ESM, TAS-ESM and joint PA and TAS aided ESM schemes provide beneficial system performance improvements compared to the conventional ESM scheme.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121955472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631610
Huafei Wang, Xianpeng Wang, Mengxing Huang, Chunjie Cao, G. Bi
Most existing off-grid direction of arrival (DOA) estimation methods assume a perfect array manifold. In practice, however, a perfect array manifold is often hard to obtain. In this paper, to achieve DOA estimation under mutual coupling with low computational complexity, we propose a robust root Sparse Bayesian Learning (SBL) method. In the proposed method, we first exploit the banded complex symmetric Toeplitz structure of the mutual coupling matrix to remove the negative influence of mutual coupling on DOA estimation. The off-grid DOA is then estimated by formulating the root-SBL strategy. Compared with existing SBL-based algorithms, our method not only maintains superior DOA estimation performance under mutual coupling, especially strong mutual coupling, but also has lower computational complexity. Simulation results demonstrate that the proposed method can still accurately estimate DOAs under strong mutual coupling, while other SBL-based methods fail to work.
{"title":"Off-Grid DOA Estimation in Mutual Coupling via Robust Sparse Bayesian Learning","authors":"Huafei Wang, Xianpeng Wang, Mengxing Huang, Chunjie Cao, G. Bi","doi":"10.1109/ICDSP.2018.8631610","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631610","url":null,"abstract":"The most of existing off-grid direction of arrival (DOA) estimation methods are based on the perfect array-manifold. However, in practice, it is often hard to obtain a perfect array-manifold. In this paper, to achieve the DOA estimation under mutual coupling condition with low computational complexity, we propose a robust root Sparse Bayesian Learning (SBL) method. In the proposed method, firstly, we adopt the banded complex symmetric Toeplitz structure of the mutual coupling matrix to remove the negative influence of mutual coupling on DOA estimation. Then the DOA with off-grid is estimated by formulating the root-SBL strategy. Compared with the existing SBL-based algorithms, our method can not only maintain superior DOA estimation performance under the condition of mutual coupling, especially with strong mutual coupling, but also have lower computational complexity. Simulation results demonstrate that the proposed method can still accurately estimate DOAs under strong mutual coupling conditions, while other SBL-based methods fail to work.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115867149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631661
Yongzhi Wang, Yun Lin, Xiuwei Chi
Frequency hopping communication is a common communication method in the field of modern wireless communication countermeasures. With progress in the signal processing of frequency hopping signals, the demand for estimating their parameters is also increasing. This paper studies parameter estimation for frequency hopping signals based on the sparse linear regression of compressed sensing. In addition to the basic sparse analysis, we propose an improved method that combines an algorithm approximating the L0 norm with morphological filtering. Parameter estimation simulations show that using the two improvements together greatly reduces the estimation error at low SNR: the estimation error is reduced by about 0.3 at -6 dB. Moreover, the estimation error of the improved method with the approximate L0 norm and morphological filtering falls below 0.003 at -6 dB. The experimental results show that the method used in this paper for processing frequency hopping signals can effectively estimate their parameters.
{"title":"A Parameter Estimation Method of Frequency Hopping Signal Based On Sparse Time-frequency Method","authors":"Yongzhi Wang, Yun Lin, Xiuwei Chi","doi":"10.1109/ICDSP.2018.8631661","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631661","url":null,"abstract":"Frequency hopping communication is a common communication method in the field of modern wireless communication countermeasures. Due to the progress of the signal processing in frequency hopping signals, the demand for the estimation of its parameters is also increasing. This paper research on the estimating parameter of frequency hopping signal based on the sparse liner regression of compressed sensing. In addition to the basic sparse analysis, we propose an improved method which combining the algorithm of approximating LO norm and morphological filtering. The simulation of parameter estimation shows that it has a great reduction in estimation error in low SNR to use two improved methods at the same time. And it can reduce about 0.3 in estimation error at-6dB. Also, the estimation error which using the improved method with approximating LO norm and morphological filtering can reach less than 0.003 at-6dB. The experimental results show that the method of processing frequency hopping signals used in this paper can effectively estimate its parameters.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"83 23","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120824834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631803
Zai Yang, Yonina C. Eldar, Lihua Xie
Compressive multichannel frequency estimation refers to the process of retrieving the frequency profile shared by multiple signals from their compressive samples. A recent approach to this problem relies on atomic norm minimization, which exploits joint sparsity among the channels, is solved using convex optimization, and has strong theoretical guarantees. In this paper, we provide an average-case analysis of atomic norm minimization by assuming proper randomness on the amplitudes of the frequencies. We show that the sample size per channel required for exact frequency estimation from noiseless samples decreases as the number of channels increases and is on the order of $K \log K \left(1+\frac{1}{L}\log N\right)$, where K is the number of frequencies, L is the number of channels, and N is a fixed parameter proportional to the sampling window size and inversely proportional to the desired resolution.
{"title":"Average Case Analysis of Compressive Multichannel Frequency Estimation Using Atomic Norm Minimization","authors":"Zai Yang, Yonina C. Eldar, Lihua Xie","doi":"10.1109/ICDSP.2018.8631803","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631803","url":null,"abstract":"Compressive multichannel frequency estimation refers to the process of retrieving the frequency profile shared by multiple signals from their compressive samples. A recent approach to this problem relies on atomic norm minimization which exploits joint sparsity among the channels, is solved using convex optimization, and has strong theoretical guarantees. We provide in this paper an average-case analysis for atomic norm minimization by assuming proper randomness on the amplitudes of the frequencies. We show that the sample size per channel required for exact frequency estimation from noiseless samples decreases as the number of channels increases and is on the order of $K \\log K \\left(1+\\frac{1}{L}\\log N\\right)$, where K is the number of frequencies, L is the number of channels, and N is a fixed parameter proportional to the sampling window size and inversely proportional to the desired resolution.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121344209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
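The scaling law stated in the abstract above, a per-channel sample size on the order of K log K (1 + (1/L) log N), can be tabulated with a few lines of code. This is only an order-of-magnitude sketch: the abstract gives no leading constant, so the `constant` argument and the function name `sample_size_order` are assumptions for illustration, and the natural logarithm is assumed.

```python
import math


def sample_size_order(num_freqs, num_channels, n_param, constant=1.0):
    """Order-of-magnitude per-channel sample size
    constant * K * log(K) * (1 + log(N) / L).

    The leading constant is not given in the abstract; 1.0 is a placeholder.
    """
    k, l, n = num_freqs, num_channels, n_param
    return constant * k * math.log(k) * (1.0 + math.log(n) / l)


def scaling_table(num_freqs, n_param, channel_counts):
    """Per-channel sample-size order for several channel counts L."""
    return {l: sample_size_order(num_freqs, l, n_param) for l in channel_counts}
```

Evaluating `scaling_table(5, 1000, [1, 2, 10, 100])` shows the behaviour the abstract describes: the required order falls as L grows and approaches K log K, the term that does not depend on N.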
Pub Date : 2018-11-01 DOI: 10.1109/ICDSP.2018.8631808
Xingwei Sun, Ziteng Wang, Risheng Xia, Junfeng Li, Yonghong Yan
The minimum variance distortionless response (MVDR) beamformer is a widely used beamforming technique that extracts sound components arriving from a direction specified by a steering vector. In this paper, we present four steering vector estimation methods and analyze their influence on the MVDR beamformer in speech recognition. The first is based on the direction of arrival under the plane-wave propagation assumption, with prior knowledge of the microphone array geometry. The other three are based on the decomposition of the observed speech covariance matrix: the covariance subtraction based method, the eigenvalue decomposition based method, and the generalized eigenvalue decomposition (GEVD) based method. We theoretically prove that the three decomposition based methods are equivalent under the narrowband approximation or after the rank-1 speech covariance matrix approximation. Speech recognition experiments conducted on the CHiME-3 dataset show that the MVDR beamformer using GEVD-based steering vector estimation achieves the best performance, and that word error rates can be further reduced with the rank-1 approximation.
{"title":"Effect of Steering Vector Estimation on MVDR Beamformer for Noisy Speech Recognition","authors":"Xingwei Sun, Ziteng Wang, Risheng Xia, Junfeng Li, Yonghong Yan","doi":"10.1109/ICDSP.2018.8631808","DOIUrl":"https://doi.org/10.1109/ICDSP.2018.8631808","url":null,"abstract":"The minimum variance distortionless response (MV-DR) beamformer is a widely used beamforming technique that extracts sound components coming from a direction specified by a steering vector. In this paper, we present four different steering vector estimation methods and analyze their influence on the MVDR beamformer in speech recognition. The first one is based on the direction of arrival under the plane wave propagation assumption with the prior knowledge of microphone array geometry. The other three methods are based on the decomposition of the observed speech covariance matrix, including the covariance subtraction based method, the eigenvalue decomposition based method, and the generalized eigenvalue decomposition (GEVD) based method. We theoretically prove that the three decomposition based methods are equivalent under the narrowband approximation or after the rank -1 speech covariance matrix approximation. 
The speech recognition experiments conducted on the CHiME-3 dataset shows that the MVDR beamformer using GEVD-based steering vector estimation achieves the best performance, and word error rates can be further reduced with the rank -1 approximation.","PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121222727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
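As a rough illustration of the first steering vector method in the abstract above (direction of arrival under the plane-wave assumption with known array geometry), the sketch below computes MVDR weights in the standard form w = R⁻¹d / (dᴴR⁻¹d) for a single frequency bin. It is not the authors' implementation: the linear array, the speed of sound `c=343.0`, and all function names are assumptions, and the covariance-decomposition methods (covariance subtraction, EVD, GEVD) are not shown.

```python
import cmath
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):  # back substitution
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x


def plane_wave_steering(theta_deg, mic_positions_m, freq_hz, c=343.0):
    """Steering vector of a linear array for a far-field source at theta_deg."""
    delays = [p * math.sin(math.radians(theta_deg)) / c for p in mic_positions_m]
    return [cmath.exp(-2j * math.pi * freq_hz * t) for t in delays]


def mvdr_weights(noise_cov, steering):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    r_inv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * v for d, v in zip(steering, r_inv_d))
    return [v / denom for v in r_inv_d]


def beamformer_response(weights, steering):
    """Response w^H d of the beamformer to a source with steering vector d."""
    return sum(w.conjugate() * d for w, d in zip(weights, steering))
```

By construction the response toward the target steering vector is 1 (the distortionless constraint), while the response toward a directional interferer contained in `noise_cov` is attenuated. In a speech front end these weights would be computed per frequency bin of the short-time Fourier transform.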