Chong Yeh Sai, N. Mokhtar, M. Iwahashi, P. Cumming, H. Arof
Funding: Faculty Grant University of Malaya, Grant/Award Number: GPF062A-2018; Ministry of Higher Education, Malaysia, Grant/Award Number: UM.C/HIR/MOHE/ENG/16; Universiti Malaya, Grant/Award Number: PG260-2015B; JSPS KAKENHI, Grant/Award Number: JP21K11934
Abstract: Electroencephalography (EEG) is a method for recording electrical activity arising from the cortical surface of the brain. It has found wide application not only in clinical medicine but also in neuroscience research and studies of Brain-Computer Interfaces (BCI). However, EEG recordings often suffer from distortions due to artefactual components that degrade the true EEG signals. Artefactual components are any unwanted signals recorded in the EEG spectrum that originate from sources other than the neurophysiological activity of the human brain. Examples of their origin include eye blinking, facial or scalp muscle activity, and electrode slippage. Techniques for automated artefact removal, such as the Wavelet Transform and Independent Component Analysis (ICA), have been used to remove or reduce the effect of artefactual components on EEG signals. However, identifying the signal artefacts to be removed is a major challenge, as EEG signal properties vary between individuals and age groups. Techniques that rely on an arbitrarily defined threshold often fail to identify the signal artefacts in a given dataset accurately. In this study, a method is proposed that couples unsupervised machine learning with Wavelet-ICA to remove EEG artefacts. Using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), a classification accuracy of 97.9% is achieved in identifying artefactual components. DBSCAN achieved excellent and robust performance in identifying artefactual components during the Wavelet-ICA process, especially given the low-density nature of typical artefactual signals. This new hybrid method provides a scalable and unsupervised solution for automated artefact removal that should be applicable to a wide range of EEG data types.
Title: Fully automated unsupervised artefact removal in multichannel electroencephalogram using wavelet-independent component analysis with density-based spatial clustering of application with noise
Journal: IET Signal Process.
Pub Date: 2021-06-12
DOI: 10.1049/SIL2.12058 (https://doi.org/10.1049/SIL2.12058)
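The role DBSCAN plays in the abstract above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the two-dimensional feature vectors (one per independent component), the `eps` and `min_pts` values, and the rule "points labelled as noise are artefact candidates" are all assumptions made here for illustration, motivated only by the abstract's remark that artefactual signals are low-density.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical features for seven independent components (illustrative values).
features = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.05, 0.15), (0.20, 0.20),
    (3.00, 4.00), (5.00, 1.00),
]
labels = dbscan(features, eps=0.3, min_pts=3)

# Assumption: components that DBSCAN marks as noise are the artefact candidates.
artefact_idx = [i for i, label in enumerate(labels) if label == -1]
print(labels, artefact_idx)
```

No amplitude threshold is set by hand here; the density parameters decide which components are isolated. How those parameters were chosen in the study is not stated in the abstract.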
Funding: National Research Development and Innovation Fund
Abstract: A computationally efficient four-parameter least squares (LS) sine fitting method in the time domain is presented. Unlike the most widespread procedure, defined in the relevant IEEE standard, the proposed fitting is non-iterative. This is achieved by a second-order approximation of the cost function (CF) around the actual frequency of the sinusoidal excitation. The approximation reduces the four-parameter non-linear fitting problem to a defined set of three-parameter linear fitting problems. The computational demand can therefore be predicted precisely, which is essential in real-life applications. Furthermore, the proposed method is shown to have increased numerical stability. Finally, measurements and computer simulations demonstrate the reduced computational demand, with no loss of accuracy relative to the algorithm proposed in the IEEE standard.
Title: A computationally efficient non-iterative four-parameter sine fitting method
Authors: B. Renczes, V. Pálfi
Journal: IET Signal Process.
Pub Date: 2021-06-11
DOI: 10.1049/SIL2.12061 (https://doi.org/10.1049/SIL2.12061)
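The idea summarised above, replacing an iterative four-parameter fit with a fixed number of three-parameter linear fits, can be sketched as follows. This is only an illustration under stated assumptions, not the published algorithm: the model y[n] = A cos(ωn) + B sin(ωn) + C, the three-point parabola through cost values at ω0 − δ, ω0 and ω0 + δ, and the helper names `solve3`, `fit3` and `fit4` are all choices made here.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit3(y, omega):
    """Three-parameter LS fit at a fixed omega; returns (A, B, C, cost)."""
    basis = [(math.cos(omega * n), math.sin(omega * n), 1.0) for n in range(len(y))]
    # Normal equations (A^T A) p = A^T y for the linear parameters p = (A, B, C).
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * v for r, v in zip(basis, y)) for i in range(3)]
    a, b, c = solve3(ata, aty)
    cost = sum((v - (a * r[0] + b * r[1] + c)) ** 2 for r, v in zip(basis, y))
    return a, b, c, cost


def fit4(y, omega0, delta):
    """Refine omega with a parabola through three cost values, then refit."""
    costs = [fit3(y, omega0 + k * delta)[3] for k in (-1, 0, 1)]
    curvature = costs[0] - 2.0 * costs[1] + costs[2]
    # Vertex of the parabola through the three (frequency, cost) points.
    omega = omega0 + 0.5 * delta * (costs[0] - costs[2]) / curvature
    return (omega,) + fit3(y, omega)[:3]


# Noise-free test record (illustrative values) and a slightly wrong initial frequency.
true_omega = 0.2
y = [1.0 * math.cos(true_omega * n) + 0.5 * math.sin(true_omega * n) + 0.3
     for n in range(200)]
omega, a, b, c = fit4(y, omega0=0.2005, delta=0.002)
print(omega, a, b, c)
```

The number of linear fits is fixed in advance (four in this sketch), which is what makes the cost predictable; an iterative scheme cannot guarantee its iteration count.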
Funding: Alexander von Humboldt-Stiftung
Abstract: In this approach to unobtrusive human activity classification, a two-stage machine-learning-based algorithm was applied to backscattered ultrawideband radar signals. First, a preprocessing step suppressed noise and clutter. Next, features of human activities were extracted from a combination of the time-frequency (TF) and time-range (TR) domains. Feature analysis was then performed to identify the features that are robust for this kind of classification and to reduce the dimensionality of the feature vector. Subsequently, different recognition algorithms were applied to group activities as fall or non-fall and to categorise their types. A performance study was used to select the most accurate algorithm. The ensemble bagged tree and fine K-nearest neighbour methods showed the best performance. The results show that two-stage classification was more accurate than one-stage classification. Finally, the proposed approach, which combines the TR and TF domains with two-stage recognition, outperformed reference approaches from the literature, with average accuracies of 95.8% for eight-activity classification and 96.9% for distinguishing between fall and non-fall activities, at efficient computational complexity.
Title: Unobtrusive human activity classification based on combined time-range and time-frequency domain signatures using ultrawideband radar
Authors: Mohamad Mostafa, S. Chamaani
Journal: IET Signal Process.
Pub Date: 2021-06-03
DOI: 10.1049/SIL2.12060 (https://doi.org/10.1049/SIL2.12060)
Hamzah A. Alsayadi, A. Abdelhamid, I. Hegazy, Z. Fayed
Arabic automatic speech recognition (ASR) methods with diacritics can be integrated with other systems better than Arabic ASR methods without diacritics. In this work, the application of state-of-the-art end-to-end deep learning approaches is investigated to build a robust diacritised Arabic ASR. These approaches are based on the Mel-Frequency Cepstral Coefficients and the log Mel-Scale Filter Bank energies as acoustic features. To the best of our knowledge, end-to-end deep learning approaches have not been used in the task of diacritised Arabic automatic speech recognition. To fill this gap, this work presents a new CTC-based ASR, CNN-LSTM, and an attention-based end-to-end approach for improving diacritised Arabic ASR. In addition, a word-based language model is employed to achieve better results. The end-to-end approaches applied in this work are based on state-of-the-art frameworks, namely ESPnet and Espresso. Training and testing of these frameworks are performed on the Standard Arabic Single Speaker Corpus (SASSC), which contains 7 h of modern standard Arabic speech. Experimental results show that the CNN-LSTM
Title: Arabic speech recognition using end-to-end deep learning
Journal: IET Signal Process.
Pub Date: 2021-06-02
DOI: 10.1049/SIL2.12057 (https://doi.org/10.1049/SIL2.12057)
Correspondence: Yousef Alipouri, Department of Mechanical Engineering, University of Alberta, 9211-116 Street NW, Edmonton, AB T6G 1H9, Canada. Email: alipouri@ualberta.ca
Abstract: Existing sensor fusion methods mainly follow one of two approaches: Gaussian-based or non-Gaussian-based sensor fusion. In the first approach, fusion weights are determined from the second moment. This approach cannot account for high-order moments and is therefore inaccurate for non-Gaussian sensors. In the second approach, the fusion weights are determined using the distribution functions of the sensor data. Although this method is more accurate than Gaussian-based sensor fusion, it is complex because it requires information on all moments of each sensor, which is either unavailable or hard to identify. Here, we propose an alternative way to determine the fusion weights from a limited number n (>2) of moments of the data. The proposed method trades off accuracy against complexity. A further problem, which has not been studied in the literature, is the existence of constraints on the moments. The proposed method can address this problem as well: a projection-based neural network optimisation method is used to calculate the optimal fusion weights that satisfy the moment constraints. The proposed sensor fusion method is applied in practice to predicting occupancy for heating, ventilation, and air conditioning (HVAC).
Title: Sensor fusion with high-order moments constraints using projection-based neural network
Authors: Y. Alipouri, Reza Rafati Bonab, Le Zhong
Journal: IET Signal Process.
Pub Date: 2021-05-22
DOI: 10.1049/SIL2.12046 (https://doi.org/10.1049/SIL2.12046)
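For context, the Gaussian-style baseline described in the abstract above (weights determined from the second moment only) is commonly realised as inverse-variance weighting. The sketch below shows that baseline, not the proposed high-order method; it assumes unbiased, mutually independent sensors, and the variances and readings are made-up illustrative numbers.

```python
def inverse_variance_weights(variances):
    """Fusion weights proportional to 1/variance, normalised to sum to one."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]


def fuse(readings, weights):
    """Weighted combination of simultaneous sensor readings."""
    return sum(w * r for w, r in zip(weights, readings))


# Two hypothetical sensors: the second is four times noisier (in variance).
weights = inverse_variance_weights([1.0, 4.0])
estimate = fuse([10.0, 12.0], weights)
print(weights, estimate)
```

Because only the variance enters, two sensors with equal variance but very different skewness or kurtosis receive the same weight, which is the limitation the abstract attributes to this family of methods.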
Correspondence: Mir Saeed Safizadeh, Associate Professor, School of Mechanical Engineering, Iran University of Science and Technology, Narmak, Tehran, 16846-13114, Iran. Email: safizadeh@iust.ac.ir
Abstract: Fault detection and identification (FDI) systems are responsible for detecting and identifying errors as fast as possible with high reliability. These systems should be robust against noise and avoid false warnings. Herein, the prospect of using wavelet filters for noise reduction in FDI systems is investigated. To that end, a wavelet filter and a wavelet-hybrid filter are presented and compared in noise reduction with conventional filters, such as linear filters (finite impulse response (FIR) and infinite impulse response), the median filter, and the FIR-median hybrid filter (SWFMH). The comparison is conducted in two steps: (a) noise reduction of a noisy sample signal from a gas turbine and (b) increasing the fault detection accuracy of a gas turbine FDI system in the presence of noisy data. In step one, a conventional noisy sample signal of a gas turbine is presented, and the performance of the different filters in reducing its noise is studied. In step two, because excessive filtering can discard information that is useful for an FDI system's diagnostics, the performance of an FDI system coupled with the different filters is evaluated. For this purpose, an FDI system utilising an adaptive neuro-fuzzy inference system and gas path analysis has been designed. It is demonstrated that, in some cases, the wavelet filters have a lower denoising capability for a noisy sample signal but perform better when used together with the FDI system. Therefore, wavelet filters are better suited for use in FDI systems.
Title: Improved performance of gas turbine diagnostics using new noise-removal techniques
Authors: Mohsen Ensafjoo, M. Safizadeh
Journal: IET Signal Process.
Pub Date: 2021-05-03
DOI: 10.1049/SIL2.12042 (https://doi.org/10.1049/SIL2.12042)
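Two of the conventional baselines named in the abstract above, a linear FIR filter and a median filter, behave very differently on impulsive noise. The sketch below contrasts a 3-tap moving average (a simple FIR filter) with a 3-sample median filter on a made-up signal containing one spike. The window length, the edge handling (the window is shortened at the ends) and the data are assumptions for illustration; the wavelet and hybrid filters studied in the paper are not reproduced here.

```python
from statistics import median


def sliding(signal, window, reducer):
    """Apply `reducer` to a centred sliding window, shortened at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(reducer(signal[lo:hi]))
    return out


def mean(values):
    return sum(values) / len(values)


def median_filter(signal, window=3):
    """Non-linear filter: replaces each sample by the median of its window."""
    return sliding(signal, window, median)


def moving_average(signal, window=3):
    """Linear FIR filter with equal taps."""
    return sliding(signal, window, mean)


noisy = [1.0, 1.0, 9.0, 1.0, 1.0, 1.0]  # constant level with a single spike
by_median = median_filter(noisy)
by_average = moving_average(noisy)
print(by_median)
print(by_average)
```

The median filter removes the isolated spike entirely, whereas the moving average spreads it over neighbouring samples. Neither result says anything about which filter preserves the information an FDI system needs, which is the question the study addresses.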
Junjiang Xiang, Xiao Hu, Jiayu Ding, Xiang Tan, Jiaxin Yang
Funding: National Natural Science Foundation of China, Grant/Award Number: 62076075
Abstract: Salient object detection aims to identify the most attractive objects in images. However, object boundaries are typically of poor quality when predicted using available methods, and one or more objects may be left undetected if the image contains multiple objects. To solve these problems, this article proposes a novel cross refinement network, which consists of a Res2Net-based backbone network; a fusion network equipped with four convolutional block attention modules and four edge-salient cross units; and a detection network with an edge enhancement unit and a residual refinement network (RNN). For RNN training, the rough salient maps generated using the DUTS-TR dataset are treated as a special training dataset. Experiments on five benchmark datasets show that, compared with existing methods, the proposed network can effectively detect all objects and improve the boundaries of the detected objects. Based on the experimental results, the proposed network outperforms existing methods both objectively and subjectively.
Title: Cross refinement network with edge detection for salient object detection
Journal: IET Signal Process.
Pub Date: 2021-04-26
DOI: 10.1049/sil2.12041 (https://doi.org/10.1049/sil2.12041)
Pub Date: 2020-12-01
DOI: 10.1049/iet-spr.2019.0613
Rong Wang, Zhe Chen, F. Yin
Multiple speaker tracking in a distributed microphone array (DMA) network is a challenging task. A critical issue in multiple speaker scenarios is to distinguish ambiguous observations and associate each with the corresponding speaker, especially in reverberant and noisy environments. To address this problem, a distributed multiple speaker tracking method based on time delay estimation in a DMA is proposed in this study. Specifically, the time delay estimated by the generalised cross-correlation function is treated as an observation. To distinguish the observation for each speaker, the possible time delays, referred to as candidates, are extracted using a data association technique. To account for ambient influences, a time delay estimation strategy is designed to calculate the time delay for each speaker from the candidates. Finally, only the reliable time delays in the DMA are propagated throughout the network by a diffusion fusion algorithm and used to update the speakers' states within the distributed Kalman filter framework. The proposed approach can track multiple speakers successfully in a non-centralised manner in reverberant and noisy environments. Simulation results indicate that, compared with other methods, the proposed method achieves a smaller root mean square error for multiple speaker tracking, especially in adverse conditions.
Title: Distributed multiple speaker tracking based on time delay estimation in microphone array network
Journal: IET Signal Process.
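The observation used by the tracker described above is a time delay between two microphone signals. As a minimal sketch of that step, the code below estimates an integer-sample delay from the peak of a plain time-domain cross-correlation. This is a simplification: the generalised cross-correlation named in the abstract additionally applies a frequency-domain weighting, which is omitted here, and the function name, signals and `max_lag` are illustrative assumptions.

```python
def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) maximising sum over n of x[n] * y[n + lag].

    A positive result means that y is a delayed copy of x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            x[n] * y[n + lag]
            for n in range(len(x))
            if 0 <= n + lag < len(y)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical microphone signals: the second receives the pulse 3 samples later.
mic_a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
delay = estimate_delay(mic_a, mic_b, max_lag=5)
print(delay)
```

With several speakers or strong reverberation the correlation has several competing peaks, which is why the abstract speaks of extracting multiple candidate delays rather than a single maximum.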
Pub Date: 2020-12-01
DOI: 10.1049/iet-spr.2020.0199
Lingyue Hu, B. Ling, C. Y. Ho, Guoheng Huang
Quaternion-valued signals consist of four signal components. The discrete quaternion Fourier transform maps these four signal components in the time domain to four components in the frequency domain, which are called the discrete quaternion Fourier transform components. There are 16 inner products between pairs of discrete quaternion Fourier transform components, and the total orthogonal error among the components is defined in terms of these 16 inner products. This study aims to find the optimal quaternion number in the discrete quaternion Fourier transform so that the total orthogonal error among the components is minimised. Finding the optimal quaternion number is equivalent to finding the optimal rescaling factors. Since the discrete quaternion Fourier transform components are expressed as high-order polynomials of trigonometric functions of the rescaling factors, this optimisation problem is non-convex. To address this, a two-stage approach is employed to solve the optimisation problem. The comparison results show that the proposed method outperforms existing methods in achieving a low total orthogonal error among the discrete quaternion Fourier transform components.
Title: Near orthogonal discrete quaternion Fourier transform components via an optimal frequency rescaling approach
Journal: IET Signal Process.
Pub Date: 2020-12-01
DOI: 10.1049/iet-spr.2020.0096
Xueling Zhou, B. Ling, Zikang Tian, Yiu-Wai Ho, K. Teo
Title: Joint empirical mode decomposition, exponential function estimation and L 1 norm approach for estimating mean value of photoplethysmogram and blood glucose level
Journal: IET Signal Process.