Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282219
Title: Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling
Authors: Patrick Lumban Tobing, H. Kameoka, T. Toda
Published in: 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

This paper presents a novel implementation of latent trajectory modeling in a deep acoustic-to-articulatory inversion mapping framework. In the conventional methods, i.e., the Gaussian mixture model (GMM)- and deep neural network (DNN)-based inversion mappings, frame interdependency can be considered while generating articulatory parameter trajectories through an explicit constraint between static and dynamic features. However, this constraint is not considered during training, and therefore the trained model is not optimal for the mapping procedure. In this paper, we address this problem by introducing latent trajectory modeling into the DNN-based inversion mapping. In the latent trajectory model, frame interdependency is considered in both training and mapping through a soft constraint between static and dynamic features. The experimental results demonstrate that the proposed latent trajectory DNN (LTDNN)-based inversion mapping outperforms the conventional and state-of-the-art inversion mapping systems.
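The explicit constraint between static and dynamic features mentioned above is the classic trajectory-generation idea: solve for the static trajectory that best matches predicted static means while its deltas match predicted delta means. The sketch below is a simplified, hypothetical illustration (a plain first-difference delta window and diagonal precisions, not the paper's actual model or windows):

```python
import numpy as np

def generate_trajectory(mu_static, mu_delta, var_static, var_delta):
    """Weighted least-squares trajectory generation (illustrative sketch).

    Solves min_c (c - mu_static)^T P_s (c - mu_static)
               + (W c - mu_delta)^T P_d (W c - mu_delta),
    where W computes simple first differences c_{t+1} - c_t.
    """
    T = len(mu_static)
    # W: (T-1) x T first-difference matrix (a simplified delta window).
    W = np.diff(np.eye(T), axis=0)
    P_s = np.diag(1.0 / np.asarray(var_static, dtype=float))
    P_d = np.diag(1.0 / np.asarray(var_delta, dtype=float))
    A = P_s + W.T @ P_d @ W
    b = P_s @ np.asarray(mu_static, dtype=float) + W.T @ P_d @ np.asarray(mu_delta, dtype=float)
    return np.linalg.solve(A, b)
```

With confident delta statistics (small `var_delta`) and zero delta means, the generated trajectory is smoothed toward a constant; with uninformative deltas it reverts to the static means, which is the trade-off the constraint encodes.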
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282127
Title: Hybrid EEG-NIRS brain-computer interface under eyes-closed condition
Authors: Jaeyoung Shin, K. Müller, Han-Jeong Hwang

In this study, we propose a hybrid BCI combining electroencephalography (EEG) and near-infrared spectroscopy (NIRS) that can potentially be operated in an eyes-closed condition by paralyzed patients with oculomotor dysfunctions. In the experiment, seven healthy participants performed mental subtraction and stayed relaxed (baseline state) while EEG and NIRS data were simultaneously measured. To evaluate the feasibility of the hybrid BCI, we classified frontal brain activity induced by mental subtraction against the baseline state, and compared the classification accuracies of the unimodal EEG and NIRS BCIs with that of the hybrid BCI. The hybrid BCI (85.54 % ± 8.59) showed significantly higher classification accuracy than the unimodal EEG (80.77 % ± 11.15) and NIRS (77.12 % ± 7.63) BCIs (Wilcoxon signed-rank test, Bonferroni-corrected p < 0.05). The results demonstrate that our eyes-closed hybrid BCI approach could potentially be applied to neurodegenerative patients whose impaired motor functions are accompanied by a decline in visual function.
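The statistical comparison above (Wilcoxon signed-rank test with Bonferroni correction across the pairwise comparisons) can be sketched as follows; the per-participant accuracies below are hypothetical, since the paper reports only group means:

```python
from scipy.stats import wilcoxon

def compare_accuracies(acc_a, acc_b, n_comparisons=3):
    """Paired Wilcoxon signed-rank test with Bonferroni correction.

    n_comparisons=3 assumes three pairwise comparisons
    (hybrid vs EEG, hybrid vs NIRS, EEG vs NIRS).
    """
    stat, p = wilcoxon(acc_a, acc_b)
    return min(1.0, p * n_comparisons)  # Bonferroni-corrected p-value

# Hypothetical per-participant accuracies for 7 participants.
hybrid = [88.1, 79.5, 92.0, 83.4, 90.2, 77.8, 87.8]
eeg    = [82.0, 70.1, 85.5, 78.9, 84.0, 72.3, 80.6]
p_corr = compare_accuracies(hybrid, eeg)
```

With only seven paired samples, the exact (permutation) form of the test is used, which is why a non-parametric test is the appropriate choice here.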
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282329
Title: Four-dimensional image compression with region of interest based on non-separable double lifting integer wavelet transform
Authors: Fairoza Amira Hamzah, Taichi Yoshida, M. Iwahashi

This paper improves the coding performance for four-dimensional (4D) images using region of interest (ROI) coding implemented in the non-separable double-lifting structure of the 4D integer wavelet transform (WT). The WT succeeded the discrete cosine transform (DCT) and has been used in the international image compression standard JPEG 2000 for more than a decade. The conventional lifting structure, known as the separable structure, contains many rounding operators that increase the rounding noise inside the transform; the higher the rounding noise, the lower the coding performance. Thus, a non-separable double-lifting WT structure is introduced to reduce the rounding noise. The non-separable structure is compatible with the conventional wavelet-based JPEG 2000. Furthermore, ROI coding based on the non-separable integer WT is proposed, utilizing both lossy and lossless compression, and the proposed method is observed to increase the coding performance for 4D images.
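The rounding operators discussed above appear in every lifting step of a reversible integer WT. A minimal sketch of one 1-D level of the JPEG 2000 reversible 5/3 lifting transform is shown below (periodic boundary handling via `np.roll` is a simplification; JPEG 2000 actually uses symmetric extension). In a separable N-D transform these floor operations are applied once per dimension, and that accumulated rounding noise is what the non-separable structure reduces:

```python
import numpy as np

def lifting53_forward(x):
    """One 1-D level of the reversible 5/3 lifting transform.
    Each floor() is a rounding operator that injects rounding noise."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2].copy(), x[1::2].copy()
    odd -= np.floor((even + np.roll(even, -1)) / 2).astype(np.int64)    # predict (high-pass)
    even += np.floor((odd + np.roll(odd, 1) + 2) / 4).astype(np.int64)  # update (low-pass)
    return even, odd

def lifting53_inverse(even, odd):
    """Exactly undoes the forward lifting steps in reverse order."""
    even = even - np.floor((odd + np.roll(odd, 1) + 2) / 4).astype(np.int64)
    odd = odd + np.floor((even + np.roll(even, -1)) / 2).astype(np.int64)
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x
```

Because the rounding is replayed identically in the inverse, the integer transform is lossless despite the rounding noise; the noise only degrades the energy compaction, and hence the coding performance.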
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282211
Title: Investigating the use of scattering coefficients for replay attack detection
Authors: Kaavya Sriskandaraja, Gajan Suthokumar, V. Sethu, E. Ambikairajah

Widespread adoption of speaker verification for security relies on the existence of effective anti-spoofing countermeasures. This paper presents a countermeasure based on spectral features to detect replay spoofing attacks on automatic speaker verification systems. In particular, the use of hierarchical scattering decomposition coefficients and inverse-mel frequency cepstral coefficients is explored. Our best system achieved a relative improvement in equal error rate of around 70% on the development set and 20% on the evaluation set, compared to the baseline on the ASVspoof 2017 database. In addition, we show that features with a shorter window can be beneficial for detecting replayed speech, in contrast to speech synthesis and voice conversion attacks.
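One common construction of the inverse-mel filterbank behind inverse-mel frequency cepstral coefficients is to mirror the standard mel filterbank along the frequency axis, giving fine resolution at high frequencies where replay artifacts tend to live. The sketch below assumes that construction (the paper's exact filterbank parameters are not specified here):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Standard triangular mel filterbank (rows: filters, cols: FFT bins)."""
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return fb

def inverse_mel_filterbank(n_filters, n_fft, sr):
    """Mel filterbank mirrored along the frequency axis:
    dense filters at high frequencies instead of low frequencies."""
    return mel_filterbank(n_filters, n_fft, sr)[::-1, ::-1]
```

Applying the log of the inverse-mel filterbank energies followed by a DCT would then yield the inverse-mel cepstral coefficients.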
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282202
Title: Efficient edge-oriented based image interpolation algorithm for non-integer scaling factor
Authors: Chia-Chun Hsu, Jian-Jiun Ding, Yih-Cherng Lee

Although image interpolation has been developed for many years, most state-of-the-art methods, including machine-learning-based methods, can only zoom an image by a scaling factor of 2, 3, 2^k, or other integer values. Hence, bicubic interpolation remains a popular method for the non-integer scaling problem. In this paper, we propose a novel interpolation algorithm for image zooming with non-integer scaling factors based on the gradient direction. The proposed method first estimates the gradient direction at each pixel of the low-resolution image. Then, we construct the gradient map of the high-resolution image by spline interpolation. Finally, the intensity of each missing pixel is computed as a weighted sum of the pixels in a pre-defined window. To preserve edge information during interpolation, each weight is determined by the inner product of the estimated gradient vector and the vector from the missing pixel to the known data point. Simulations show that the proposed method outperforms other non-integer-factor scaling methods and is helpful for super-resolution.
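The weighting idea above can be sketched as follows. The mapping from the inner product to a weight is a hypothetical choice (the paper states only that the weight is determined by the inner product); the intent is that neighbours lying across an edge (offset aligned with the gradient) contribute less than neighbours lying along it:

```python
import numpy as np

def edge_weight(grad, offset, eps=1e-6):
    """Hypothetical weight: small when the offset to the known pixel is
    aligned with the gradient (i.e. the pixel sits across an edge)."""
    align = abs(np.dot(grad, offset))
    return 1.0 / (align + eps)

def interpolate_pixel(pos, known, grad):
    """Weighted sum over known pixels in the window.
    known: list of ((y, x), intensity) pairs; grad: estimated gradient at pos."""
    w = np.array([edge_weight(grad, np.subtract(p, pos)) for p, _ in known])
    v = np.array([val for _, val in known])
    return float(np.dot(w, v) / w.sum())
```

On a flat region every neighbour agrees, so the weighted sum reproduces the constant intensity regardless of the weights; the directional behaviour only matters near edges.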
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282099
Title: Exploiting imbalanced textual and acoustic data for training prosodically-enhanced RNNLMs
Authors: Michael Hentschel, A. Ogawa, Marc Delcroix, T. Nakatani, Yuji Matsumoto

There have been many attempts to exploit sources of information besides words in language modelling, for instance prosody or topic information. With neural network based language models, it became easier to make use of such continuous-valued information, because the neural network transforms the discrete-valued space into a continuous-valued one. So far, models incorporating prosodic information have been jointly trained on the auxiliary and textual information from the beginning. In practice, however, the auxiliary information is usually available for only a small portion of the training data. In order to fully exploit both text and acoustic data, we propose to re-train a recurrent neural network language model rather than training a language model from scratch. Using this method, we achieved perplexity and word error rate reductions for N-best rescoring on the MIT-OCW lecture corpus.
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282041
Title: A real time micro-expression detection system with LBP-TOP on a many-core processor
Authors: X. Soh, Vishnu Monn Baskaran, Adamu Muhammad Buhari, R. Phan

The implementation of a micro-expression detection system faces challenges in sustaining real-time recognition. To address these challenges, this paper examines a serial Local Binary Pattern from Three Orthogonal Planes (LBP-TOP) algorithm in order to identify its performance limitations for a real-time system. Videos from SMIC and CASMEII were upsampled to higher resolutions (280×340, 560×680 and 1120×1360) to reflect the needs of real-life deployment. Then, a parallel multi-core-based LBP-TOP algorithm is studied as a benchmark. Experimental results show that, on a 24-logical-processor multi-core architecture, the parallel LBP-TOP algorithm exhibits 7× and 8× speedups over serial LBP-TOP for the SMIC and CASMEII databases, respectively, at the highest tested video resolution. To further reduce computational time, this paper also proposes a many-core parallel LBP-TOP algorithm using the Compute Unified Device Architecture (CUDA). In addition, a method is designed to calculate the threads and blocks required to launch the kernel when processing videos of different resolutions. The proposed algorithm increases the speedup to 117× and 130× over the serial algorithm at the highest tested resolution.
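Sizing the CUDA grid from the video resolution, as described above, amounts to ceiling division: one thread per pixel, with enough blocks to cover the frame. The helper below is a hypothetical sketch of that calculation (the paper's actual block dimensions are not given here; 16×16 is a common default):

```python
def launch_config(width, height, block_dim=(16, 16)):
    """Compute a CUDA (grid, block) pair covering a width x height frame.

    Uses ceiling division so partial edge tiles still get a block;
    the kernel would bounds-check threads that fall outside the frame.
    """
    bx, by = block_dim
    grid = ((width + bx - 1) // bx, (height + by - 1) // by)
    return grid, block_dim
```

For the highest tested resolution (1120×1360), this yields an 85×70 grid of 16×16 blocks, i.e. one thread per pixel of a 1360-wide, 1120-tall frame.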
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8281996
Title: Prediction techniques for wavelet based 1-D signal compression
Authors: I-Hsiang Wang, Jian-Jiun Ding, H. Hsu

This paper proposes a novel one-dimensional (1-D) signal compression technique. We first perform beat alignment to transform a 1-D signal into a 2-D one, then use the 2-D discrete wavelet transform (DWT) to decompose the 2-D signal into multiple subbands. The coefficients in certain subbands are then coded using simple differential pulse code modulation (DPCM). We then construct one neural network for each subband (except the LL subband) to perform prediction. Based on the prediction results, we construct a pixel-wise context to determine the activity of a given pixel. Finally, the DWT coefficients and the DPCM residues are bit-plane coded using Embedded Block Coding with Optimized Truncation (EBCOT) from JPEG 2000. We evaluated our results on well-known 1-D signals, the ECG recordings in the MIT-BIH database, and the method demonstrated significant improvement over existing methods.
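The "simple DPCM" step above transmits a starting value plus successive differences, which are small (and thus cheap to code) when neighbouring coefficients are correlated. A minimal first-order sketch:

```python
import numpy as np

def dpcm_encode(coeffs):
    """First-order DPCM: keep the first coefficient, then code the
    differences between successive coefficients as residues."""
    c = np.asarray(coeffs, dtype=np.int64)
    return np.concatenate(([c[0]], np.diff(c)))

def dpcm_decode(residues):
    """Invert DPCM by accumulating the residues."""
    return np.cumsum(np.asarray(residues, dtype=np.int64))
```

The residues, rather than the raw coefficients, are what gets handed to the bit-plane coder, since their smaller magnitudes concentrate energy in fewer significant bit planes.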
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282258
Title: A rail detection algorithm based on pair particles filtering
Authors: Ji-Sang Bae, Jong-Ok Kim

The safety of mass transportation such as trains cannot be emphasized enough, and accurate detection of the rails in the direction of travel can contribute to the safe operation of a train. In this paper, we propose a new pair-particle-filtering-based rail detection algorithm that simultaneously predicts the positions of the left and right rails. Multiple pairs of particles are first generated from the previously detected rails, and features of the pair-particle positions, rail gauge, and gradient magnitude are used to detect the positions of the paired rails. The proposed pair-particle-filtering-based method robustly detects both straight and curved rails. Experiments on various real rail images show that the proposed method produces plausible detection results.
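The pair-particle idea above can be sketched as one propagate-and-weight step. This is a hypothetical simplification (1-D rail positions, Gaussian motion noise, and only the rail-gauge feature; the paper additionally uses gradient magnitude):

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate_and_weight(prev_left, prev_right, nominal_gauge,
                         n_particles=500, noise=3.0):
    """One step of a pair particle filter: jointly perturb the previous
    left/right rail positions, then weight each pair by how well its
    spacing matches the nominal rail gauge."""
    left = prev_left + rng.normal(0.0, noise, n_particles)
    right = prev_right + rng.normal(0.0, noise, n_particles)
    # Gauge-consistency weight: pairs with spacing near the nominal
    # gauge dominate the estimate.
    w = np.exp(-0.5 * ((right - left - nominal_gauge) / noise) ** 2)
    w /= w.sum()
    return float(np.dot(w, left)), float(np.dot(w, right))
```

Coupling the two rails through the gauge term is what lets the filter reject single-rail false positives that an independent left/right tracker would accept.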
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282255
Title: Epileptic focus localization based on bivariate empirical mode decomposition and entropy
Authors: Tatsunori Itakura, Toshihisa Tanaka

Epilepsy is a neurological disorder that causes abnormal discharges in the brain. Epileptic focus localization is an important factor for successful epilepsy surgery. The intracranial electroencephalogram (iEEG) is the signal most commonly used for detecting the epileptic focus. The iEEG signals used here are obtained from a publicly available database consisting of 7,500 signal pairs. Empirical mode decomposition (EMD) has previously been applied successfully to this dataset to detect the epileptic focus. However, the EMD method is not well suited to iEEG signal pairs. In this paper, a method for classifying focal and non-focal iEEG signals using bivariate EMD (BEMD) is presented. The bivariate iEEG signals are decomposed into signal components (intrinsic mode functions, IMFs) occupying the same frequency band. Various entropy measures are calculated from the IMFs of the iEEG signals. Then, some or all of the entropies are chosen as features, which are classified as focal or non-focal iEEG using a support vector machine (SVM). Experimental results show that the proposed method differentiates focal from non-focal iEEG signals with an average classification accuracy of 86.89%.
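One candidate entropy feature of the kind computed from each IMF is the Shannon entropy of the amplitude histogram; the paper combines several entropy measures, so this is only an illustrative choice, not the paper's exact feature set:

```python
import numpy as np

def shannon_entropy(signal, bins=64):
    """Shannon entropy (in bits) of a signal's amplitude histogram.
    Low values indicate a concentrated amplitude distribution;
    high values indicate a spread-out one."""
    hist, _ = np.histogram(signal, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))
```

A vector of such entropies over the IMFs would then be fed to the SVM as the per-signal feature, focal signals typically showing lower-entropy (more regular) components than non-focal ones.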