Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8683360
Xiangrong Wang, E. Aboutanios
Adaptive beamforming of large antenna arrays is difficult to implement due to prohibitively high hardware cost and computational complexity. An antenna selection strategy has been used to maximize the output signal-to-interference-plus-noise ratio (SINR) with fewer antennas by optimizing the array configuration. However, antenna selection schemes suffer a substantial performance degradation compared to the full-array system. In this paper, we consider a reduced-dimensional beamspace beamformer, where analogue phase shifters adaptively synthesize a subset of orthogonal beams whose outputs are then processed by a beamspace beamformer. We examine the selection problem of adaptively identifying the beams most relevant to achieving nearly the full beamspace performance, especially in the generalized case without any prior information. Simulation results demonstrate that beam selection retains the complexity advantage while achieving a higher output SINR than antenna selection.
{"title":"Adaptive Reduced-Dimensional Beamspace Beamformer Design by Analogue Beam Selection","authors":"Xiangrong Wang, E. Aboutanios","doi":"10.1109/ICASSP.2019.8683360","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683360","url":null,"abstract":"Adaptive beamforming of large antenna arrays is difficult to implement due to prohibitively high hardware cost and computational complexity. An antenna selection strategy was utilized to maximize the output signal-to-interference-plus- noise ratio (SINR) with fewer antennas by optimizing array configurations. However, antenna selection scheme exhibits high degradation in performance compared to the full array system. In this paper, we consider a reduced-dimensional beamspace beamformer, where analogue phase shifters adaptively synthesize a subset of orthogonal beams whose outputs are then processed in a beamspace beamformer. We examine the selection problem to adaptively identify the beams most relevant to achieving almost the full beamspace performance, especially in the generalized case without any prior information. Simulation results demonstrated that the beam selection enjoys the complexity advantages, while simultaneously enhancing the output SINR of antenna selection.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"8 1","pages":"4350-4354"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88101650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682412
R. Nagar, S. Raman
Reflection symmetry is ubiquitous in nature and plays an important role in object detection and recognition tasks. Most existing methods for symmetry detection extract and describe each keypoint using a descriptor and a mirrored descriptor. Two keypoints are said to be mirror-symmetric if the original descriptor of one and the mirrored descriptor of the other are similar. However, these methods suffer from the following issue: the background pixels around mirror-symmetric pixels lying on the boundary of an object can differ, so their descriptors can differ as well. Yet the boundary of a symmetric object is a major component of global reflection symmetry. We exploit the estimated boundary of the object and describe each boundary pixel using only the estimated normal of the boundary segment around it. We embed the symmetry axes in a graph as cliques to robustly detect them. We show that this approach achieves state-of-the-art results on a standard dataset.
{"title":"Reflection Symmetry Detection by Embedding Symmetry in a Graph","authors":"R. Nagar, S. Raman","doi":"10.1109/ICASSP.2019.8682412","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682412","url":null,"abstract":"Reflection symmetry is ubiquitous in nature and plays an important role in object detection and recognition tasks. Most of the existing methods for symmetry detection extract and describe each keypoint using a descriptor and a mirrored descriptor. Two keypoints are said to be mirror symmetric key-points if the original descriptor of one keypoint and the mirrored descriptor of the other keypoint are similar. However, these methods suffer from the following issue. The background pixels around the mirror symmetric pixels lying on the boundary of an object can be different. Therefore, their descriptors can be different. However, the boundary of a symmetric object is a major component of global reflection symmetry. We exploit the estimated boundary of the object and describe a boundary pixel using only the estimated normal of the boundary segment around the pixel. We embed the symmetry axes in a graph as cliques to robustly detect the symmetry axes. We show that this approach achieves state-of-the-art results in a standard dataset.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"4012 2 1","pages":"2147-2151"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86699508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682761
A. Weiss, A. Yeredor
We address the problem of source separation from noisy mixtures in a semi-blind scenario, with stationary, temporally-diverse Gaussian sources and known spectra. In such noisy models, a dilemma arises regarding the desired objective. On one hand, a "maximally separating" solution, providing the minimal attainable Interference-to-Source-Ratio (ISR), would often suffer from significant residual noise. On the other hand, optimal Minimum Mean Square Error (MMSE) estimation would yield estimates which are the "least distorted" versions of the true sources, often at the cost of compromised ISR. Based on Maximum Likelihood (ML) estimation of the unknown underlying model parameters, we propose two ML-based estimates of the sources. One asymptotically coincides with the MMSE estimate of the sources, whereas the other asymptotically coincides with the (unbiased) "least-noisy maximally-separating" solution for this model. We prove the asymptotic optimality of the latter and present the corresponding Cramér-Rao lower bound. We discuss the differences in principal properties of the proposed estimates and demonstrate them empirically using simulation results.
{"title":"Asymptotically Optimal Recovery of Gaussian Sources from Noisy Stationary Mixtures: the Least-noisy Maximally-separating Solution","authors":"A. Weiss, A. Yeredor","doi":"10.1109/ICASSP.2019.8682761","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682761","url":null,"abstract":"We address the problem of source separation from noisy mixtures in a semi-blind scenario, with stationary, temporally-diverse Gaussian sources and known spectra. In such noisy models, a dilemma arises regarding the desired objective. On one hand, a \"maximally separating\" solution, providing the minimal attainable Interference-to-Source-Ratio (ISR), would often suffer from significant residual noise. On the other hand, optimal Minimum Mean Square Error (MMSE) estimation would yield estimates which are the \"least distorted\" versions of the true sources, often at the cost of compromised ISR. Based on Maximum Likelihood (ML) estimation of the unknown underlying model parameters, we propose two ML-based estimates of the sources. One asymptotically coincides with the MMSE estimate of the sources, whereas the other asymptotically coincides with the (unbiased) \"least-noisy maximally-separating\" solution for this model. We prove the asymptotic optimality of the latter and present the corresponding Cramér-Rao lower bound. 
We discuss the differences in principal properties of the proposed estimates and demonstrate them empirically using simulation results.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"46 1","pages":"5466-5470"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85470901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
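For intuition on the MMSE side of the dilemma above, consider the linear-Gaussian model x = As + n with s ~ N(0, I) and n ~ N(0, σ²I); the MMSE estimate is ŝ = Aᵀ(AAᵀ + σ²I)⁻¹x. A small sketch assuming the mixing matrix A is known (the paper instead estimates the model parameters via ML):

```python
def inv2(m):
    # Closed-form inverse of a 2x2 matrix.
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def mmse_estimate(A, x, sigma2):
    # shat = A^T (A A^T + sigma2 I)^{-1} x for s ~ N(0, I), n ~ N(0, sigma2 I).
    AAt = [[sum(A[i][k] * A[j][k] for k in range(2)) + (sigma2 if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    y = matvec(inv2(AAt), x)
    return [sum(A[k][i] * y[k] for k in range(2)) for i in range(2)]  # A^T y

A = [[1.0, 0.5], [0.3, 1.0]]
x = matvec(A, [1.0, -2.0])           # noiseless mixture of s = [1, -2]
s_low = mmse_estimate(A, x, 1e-9)    # near-perfect recovery at high SNR
s_high = mmse_estimate(A, x, 100.0)  # heavy shrinkage toward the prior mean
```

The shrinkage at low SNR is exactly the "least distorted but not maximally separating" behavior the abstract contrasts with the least-noisy maximally-separating solution.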
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682405
Abolfazl Hashemi, H. Vikalo
An evolutionary self-expressive model for clustering a collection of evolving data points that lie on a union of low-dimensional evolving subspaces is proposed. A parsimonious representation of the data points at each time step is learned via a non-convex optimization framework that exploits the self-expressiveness property of the evolving data while taking into account the data representation from the preceding time step. The resulting scheme adaptively learns an innovation matrix that captures changes in the self-representation of the data across consecutive time steps, as well as a smoothing parameter reflecting the rate of data evolution. Extensive experiments demonstrate the superiority of the proposed framework over state-of-the-art static subspace clustering algorithms and existing evolutionary clustering schemes.
{"title":"Evolutionary Subspace Clustering: Discovering Structure in Self-expressive Time-series Data","authors":"Abolfazl Hashemi, H. Vikalo","doi":"10.1109/ICASSP.2019.8682405","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682405","url":null,"abstract":"An evolutionary self-expressive model for clustering a collection of evolving data points that lie on a union of low-dimensional evolving subspaces is proposed. A parsimonious representation of data points at each time step is learned via a non-convex optimization framework that exploits the self-expressiveness property of the evolving data while taking into account data representation from the preceding time step. The resulting scheme adaptively learns an innovation matrix that captures changes in self-representation of data in consecutive time steps as well as a smoothing parameter reflective of the rate of data evolution. Extensive experiments demonstrate superiority of the proposed framework overs state-of-the-art static subspace clustering algorithms and existing evolutionary clustering schemes.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"73 1","pages":"3707-3711"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85715959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682487
Shoukang Hu, Max W. Y. Lam, Xurong Xie, Shansong Liu, Jianwei Yu, Xixin Wu, Xunying Liu, H. Meng
The hidden activation functions inside deep neural networks (DNNs) play a vital role in learning high-level discriminative features and controlling the information flow to track longer history. However, the fixed model parameters used in standard DNNs can lead to over-fitting and poor generalization when given limited training data. Furthermore, the precise forms of the activations used in DNNs are often set manually at a global level for all hidden nodes, thus lacking an automatic selection method. To address these issues, Bayesian neural network (BNN) acoustic models are proposed in this paper to explicitly model the uncertainty associated with DNN parameters. DNN and LSTM acoustic models with Gaussian Process (GP) activations are also used to allow the optimal forms of the hidden activations to be stochastically learned for individual hidden nodes. An efficient variational inference based training algorithm is derived for the BNN, GPNN and GPLSTM systems. Experiments were conducted on an LVCSR system trained on a 75-hour subset of Switchboard I data. The best BNN and GPNN systems outperformed both the baseline DNN systems constructed using fixed-form activations and their combination via frame-level joint decoding by 1% absolute in word error rate.
{"title":"Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition","authors":"Shoukang Hu, Max W. Y. Lam, Xurong Xie, Shansong Liu, Jianwei Yu, Xixin Wu, Xunying Liu, H. Meng","doi":"10.1109/ICASSP.2019.8682487","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682487","url":null,"abstract":"The hidden activation functions inside deep neural networks (DNNs) play a vital role in learning high level discriminative features and controlling the information flows to track longer history. However, the fixed model parameters used in standard DNNs can lead to over-fitting and poor generalization when given limited training data. Furthermore, the precise forms of activations used in DNNs are often manually set at a global level for all hidden nodes, thus lacking an automatic selection method. In order to address these issues, Bayesian neural networks (BNNs) acoustic models are proposed in this paper to explicitly model the uncertainty associated with DNN parameters. Gaussian Process (GP) activations based DNN and LSTM acoustic models are also used in this paper to allow the optimal forms of hidden activations to be stochastically learned for individual hidden nodes. An efficient variational inference based training algorithm is derived for BNN, GPNN and GPLSTM systems. Experiments were conducted on a LVCSR system trained on a 75 hour subset of Switchboard I data. 
The best BNN and GPNN systems outperformed both the baseline DNN systems constructed using fixed form activations and their combination via frame level joint decoding by 1% absolute in word error rate.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"9 1","pages":"6555-6559"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85768922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
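The core idea of the BNN acoustic model above, a distribution over weights instead of fixed values, can be sketched with a one-weight "network" whose prediction is a Monte Carlo average over sampled weights. The Gaussian here stands in for a variational posterior; the tanh activation and all names are illustrative, not the paper's architecture:

```python
import math
import random

def bnn_predict(x, mu, sigma, n_samples=4000, seed=0):
    # Predictive mean of y = tanh(w * x) with w ~ N(mu, sigma^2),
    # approximated by Monte Carlo over the sampled weight posterior.
    rng = random.Random(seed)
    return sum(math.tanh(rng.gauss(mu, sigma) * x)
               for _ in range(n_samples)) / n_samples
```

With sigma = 0 this collapses to an ordinary deterministic network, which is the fixed-parameter baseline the paper compares against.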
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8683552
Henning Lange, M. Berges, J. Z. Kolter
In this paper, an algorithm is introduced for performing System Identification and inference of the filtering recursion for stochastic non-linear dynamical systems. Additionally, the algorithm allows domain constraints on the state variable to be enforced. The algorithm makes use of an approximate inference technique called Variational Inference in conjunction with Deep Neural Networks as the optimization engine. Although general in nature, the algorithm is evaluated in the context of Non-Intrusive Load Monitoring, the problem of inferring the operational states of individual electrical appliances given aggregate measurements of electrical power collected in a home.
{"title":"Neural Variational Identification and Filtering for Stochastic Non-linear Dynamical Systems with Application to Non-intrusive Load Monitoring","authors":"Henning Lange, M. Berges, J. Z. Kolter","doi":"10.1109/ICASSP.2019.8683552","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683552","url":null,"abstract":"In this paper, an algorithm for performing System Identification and inference of the filtering recursion for stochastic non-linear dynamical systems is introduced. Additionally, the algorithm allows for enforcing domain-constraints of the state variable. The algorithm makes use of an approximate inference technique called Variational Inference in conjunction with Deep Neural Networks as the optimization engine. Although general in its nature, the algorithm is evaluated in the context of Non-Intrusive Load Monitoring, the problem of inferring the operational state of individual electrical appliances given aggregate measurements of electrical power collected in a home.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"8 1","pages":"8340-8344"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82402365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8683849
Ilker Gurcan, H. Nguyen
Recently, surgical activity recognition has been receiving significant attention from the medical imaging community. Existing state-of-the-art approaches employ recurrent neural networks such as long short-term memory (LSTM) networks. However, our experiments show that these networks are not effective at capturing the relationships among features at different temporal scales. This limitation leads to sub-optimal recognition performance for surgical activities containing complex motions at multiple time scales. To overcome this shortcoming, our paper proposes a multi-scale recurrent neural network (MS-RNN) that combines the strengths of wavelet scattering operations and LSTMs. We validate the effectiveness of the proposed network using both real and synthetic datasets. Our experimental results show that MS-RNN outperforms state-of-the-art methods in surgical activity recognition by a significant margin. On a synthetic dataset, the proposed network achieves more than 90% classification accuracy while the LSTM's accuracy is around chance level. Experiments on a real surgical activity dataset show a significant improvement in recognition accuracy over the current state of the art (90.2% versus 83.3%).
{"title":"Surgical Activities Recognition Using Multi-scale Recurrent Networks","authors":"Ilker Gurcan, H. Nguyen","doi":"10.1109/ICASSP.2019.8683849","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683849","url":null,"abstract":"Recently, surgical activity recognition has been receiving significant attention from the medical imaging community. Existing state-of-the-art approaches employ recurrent neural networks such as long-short term memory networks (LSTMs). However, our experiments show that these networks are not effective in capturing the relationship of features with different temporal scales. Such limitation will lead to sub-optimal recognition performance of surgical activities containing complex motions at multiple time scales. To overcome this shortcoming, our paper proposes a multi-scale recurrent neural network (MS-RNN) that combines the strength of both wavelet scattering operations and LSTM. We validate the effectiveness of the proposed network using both real and synthetic datasets. Our experimental results show that MS-RNN outperforms state-of-the-art methods in surgical activity recognition by a significant margin. On a synthetic dataset, the proposed network achieves more than 90% classification accuracy while LSTM’s accuracy is around chance level. 
Experiments on real surgical activity dataset shows a significant improvement of recognition accuracy over the current state of the art (90.2% versus 83.3%).","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"2887-2891"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82478817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
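As a rough stand-in for the multi-scale features above, repeated pairwise averaging yields a Haar-style pyramid of progressively coarser views of a sequence. This is a simplification for intuition, not the paper's wavelet scattering transform:

```python
def multiscale_features(x):
    # Haar-style pyramid: repeatedly average adjacent pairs, exposing
    # progressively coarser time scales of the input sequence.
    scales = [list(x)]
    while len(scales[-1]) > 1:
        prev = scales[-1]
        scales.append([(prev[i] + prev[i + 1]) / 2.0
                       for i in range(0, len(prev) - 1, 2)])
    return scales
```

Feeding all levels of such a pyramid to a recurrent model is one way to expose both fast and slow motion components at once.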
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682424
Zheng Yu, Wenmin Wang, Ge Li
Cross-modal retrieval has recently been proposed to find an appropriate subspace where the similarity between different modalities, such as image and text, can be directly measured. In this paper, we propose the Multi-step Self-Attention Network (MSAN), which performs cross-modal retrieval in a limited text space using multiple attention steps: it selectively attends to partial shared information at each step and aggregates useful information over the steps to measure the final similarity. To achieve better retrieval results with faster training, we introduce global prior knowledge as global reference information. Extensive experiments on Flickr30K and MSCOCO show that MSAN achieves new state-of-the-art accuracy for cross-modal retrieval.
{"title":"Multi-step Self-attention Network for Cross-modal Retrieval Based on a Limited Text Space","authors":"Zheng Yu, Wenmin Wang, Ge Li","doi":"10.1109/ICASSP.2019.8682424","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682424","url":null,"abstract":"Cross-modal retrieval has been recently proposed to find an appropriate subspace where the similarity among different modalities, such as image and text, can be directly measured. In this paper, we propose Multi-step Self-Attention Network (MSAN) to perform cross-modal retrieval in a limited text space with multiple attention steps, that can selectively attend to partial shared information at each step and aggregate useful information over multiple steps to measure the final similarity. In order to achieve better retrieval results with faster training speed, we introduce global prior knowledge as the global reference information. Extensive experiments on Flickr30K and MSCOCO, show that MSAN achieves new state-of-the-art results in accuracy for cross-modal retrieval.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"16 1","pages":"2082-2086"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82559134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8682154
Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, H. Meng
Speech emotion recognition (SER) plays an important role in intelligent speech interaction. One vital challenge in SER is extracting emotion-relevant features from speech signals. In state-of-the-art SER techniques, deep learning methods, e.g., Convolutional Neural Networks (CNNs), are widely employed for feature learning and have achieved significant performance. However, CNN-oriented methods suffer from two limitations: 1) the loss of the temporal structure of speech through progressive resolution reduction; and 2) the neglect of relative dependencies between elements of the suprasegmental feature sequence. In this paper, we propose combining a Dilated Residual Network (DRN) with Multi-head Self-attention to alleviate these limitations. By employing the DRN, the network retains a high-resolution temporal structure during feature learning, with a receptive field similar in size to that of CNN-based approaches. By employing Multi-head Self-attention, the network models the dependencies between elements at different positions in the learned suprasegmental feature sequence, enhancing the capture of emotion-salient information. Experiments on the benchmark emotion dataset IEMOCAP demonstrate the effectiveness of the proposed framework, with 11.7% to 18.6% relative improvement over state-of-the-art approaches.
{"title":"Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition","authors":"Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, H. Meng","doi":"10.1109/ICASSP.2019.8682154","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682154","url":null,"abstract":"Speech emotion recognition (SER) plays an important role in intelligent speech interaction. One vital challenge in SER is to extract emotion-relevant features from speech signals. In state-of-the-art SER techniques, deep learning methods, e.g, Convolutional Neural Networks (CNNs), are widely employed for feature learning and have achieved significant performance. However, in the CNN-oriented methods, two performance limitations have raised: 1) the loss of temporal structure of speech in the progressive resolution reduction; 2) the ignoring of relative dependencies between elements in suprasegmental feature sequence. In this paper, we proposed the combining use of Dilated Residual Network (DRN) and Multi-head Self-attention to alleviate the above limitations. By employing DRN, the network can retain high resolution of temporal structure in feature learning, with similar size of receptive field to CNN based approach. By employing Multi-head Self-attention, the network can model the inner dependencies between elements with different positions in the learned suprasegmental feature sequence, which enhances the importing of emotion-salient information. 
Experiments on emotional benchmarking dataset IEMOCAP have demonstrated the effectiveness of the proposed framework, with 11.7% to 18.6% relative improvement to state-of-the-art approaches.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"80 1 1","pages":"6675-6679"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89560647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
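The resolution-preserving behavior of dilated convolutions cited above can be seen in a one-line 1-D version: with stride 1 the output keeps the input's sampling rate, while the receptive field grows as (k - 1) * d + 1 for k taps and dilation d. A minimal sketch (toy filter and signal, not the DRN itself):

```python
def dilated_conv1d(x, w, dilation):
    # 'Valid' 1-D convolution with dilated taps; stride 1, so temporal
    # resolution is preserved while the receptive field grows with dilation.
    span = (len(w) - 1) * dilation  # receptive field is span + 1 samples
    return [sum(w[k] * x[i + k * dilation] for k in range(len(w)))
            for i in range(len(x) - span)]
```

Stacking such layers with growing dilations gives CNN-sized receptive fields without the pooling that destroys temporal structure.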
Pub Date: 2019-05-12 | DOI: 10.1109/ICASSP.2019.8683084
Kjell Le, T. Eftestøl, K. Engan, Ø. Kleiven, S. Ørn
Baseline wander is a low-frequency noise that is often removed from electrocardiogram signals by a highpass filter. However, this might not be sufficient to correct the isoelectric level of the signal: an isoelectric bias may remain. The isoelectric level is used as a reference point for amplitude measurements, and it is recommended that this point be at 0 V, i.e. isoelectric adjusted. To correct the isoelectric level, a clustering method is proposed to determine the isoelectric bias, which is thereafter subtracted from a signal-averaged template. Calculation of the mean electrical axis (MEA) is used to evaluate the isoelectric correction. The MEA can be estimated from any lead pair in the frontal plane, and low variance in the estimates over the different lead pairs suggests that the MEA calculations are consistent across lead pairs. Different methods of calculating the MEA are evaluated, and the variance in the results, as well as other measures, favours the proposed isoelectric-adjusted signals for all MEA methods.
{"title":"Baseline Wander Removal and Isoelectric Correction in Electrocardiograms Using Clustering","authors":"Kjell Le, T. Eftestøl, K. Engan, Ø. Kleiven, S. Ørn","doi":"10.1109/ICASSP.2019.8683084","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683084","url":null,"abstract":"Baseline wander is a low frequency noise which is often removed by a highpass filter in electrocardiogram signals. However, this might not be sufficient to correct the isoelectric level of the signal, there exist an isoelectric bias. The isoelectric level is used as a reference point for amplitude measurements, and is recommended to have this point at 0 V, i.e. isoelectric adjusted. To correct the isoelectric level a clustering method is proposed to determine the isoelectric bias, which is thereafter subtracted from a signal averaged template. Calculation of the mean electrical axis (MEA) is used to evaluate the iso-electric correction. The MEA can be estimated from any lead pairs in the frontal plane, and a low variance in the estimates over the different lead pairs would suggest that the calculation of the MEA in each lead pair are consistent. Different methods are evaluated for calculating MEA, and the variance in the results as well as other measures, favour the proposed isoelectric adjusted signals in all MEA methods.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"1274-1278"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90022306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}