Efficient edge-oriented based image interpolation algorithm for non-integer scaling factor
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282202
Chia-Chun Hsu, Jian-Jiun Ding, Yih-Cherng Lee
Though image interpolation has been studied for many years, most state-of-the-art methods, including machine-learning-based ones, can only zoom an image by a scaling factor of 2, 3, 2^k, or another integer value. Hence, bicubic interpolation remains a popular choice for the non-integer scaling problem. In this paper, we propose a novel interpolation algorithm for image zooming with non-integer scaling factors based on the gradient direction. The proposed method first estimates the gradient direction at each pixel of the low-resolution image. Then, the gradient map of the high-resolution image is constructed by spline interpolation. Finally, the intensity of each missing pixel is computed as a weighted sum of the known pixels in a pre-defined window. To preserve edge information during interpolation, each weight is determined by the inner product of the estimated gradient vector and the vector from the missing pixel to the known data point. Simulations show that the proposed method outperforms other non-integer-factor scaling methods and is helpful for super-resolution.
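A minimal NumPy sketch of the weighting idea described above, under stated assumptions (the Gaussian fall-off, window size, and function names are illustrative, not the authors' implementation): a known pixel whose offset from the missing position aligns with the estimated gradient lies across the edge and receives a small weight, so the average follows the edge direction instead of crossing it.

```python
import numpy as np

def edge_aware_interpolate(img, y, x, grad, radius=2, eps=1e-8):
    """Estimate the intensity at non-integer position (y, x).

    img  : 2-D array of known low-resolution pixels.
    grad : (gy, gx) gradient estimated at (y, x), e.g. obtained by
           spline interpolation of the low-resolution gradient map.
    """
    g = np.asarray(grad, dtype=float)
    g_norm = np.linalg.norm(g) + eps
    num, den = 0.0, 0.0
    j0, i0 = int(np.floor(y)), int(np.floor(x))
    for j in range(j0 - radius + 1, j0 + radius + 1):
        for i in range(i0 - radius + 1, i0 + radius + 1):
            if not (0 <= j < img.shape[0] and 0 <= i < img.shape[1]):
                continue
            d = np.array([j - y, i - x])
            d_norm = np.linalg.norm(d) + eps
            # |<d, g>| is large when the neighbour lies across the edge.
            across = abs(d @ g) / (d_norm * g_norm)
            w = np.exp(-across ** 2) / d_norm   # assumed fall-off shape
            num += w * img[j, i]
            den += w
    return num / max(den, eps)
```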
{"title":"Efficient edge-oriented based image interpolation algorithm for non-integer scaling factor","authors":"Chia-Chun Hsu, Jian-Jiun Ding, Yih-Cherng Lee","doi":"10.1109/APSIPA.2017.8282202","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282202","url":null,"abstract":"Though image interpolation has been developed for many years, most of state-of-the-art methods, including machine learning based methods, can only zoom the image with the scaling factor of 2, 3, 2k, or other integer values. Hence, the bicubic interpolation method is still a popular method for the non-integer scaling problem. In this paper, we propose a novel interpolation algorithm for image zooming with non-integer scaling factors based on the gradient direction. The proposed method first estimates the gradient direction for each pixel in the low resolution image. Then, we construct the gradient map for the high resolution image by the spline interpolation method. Finally, the intensity of missing pixels can be computed by the weighted sum of the pixels in the pre-defined window. To preserve the edge information during the interpolation process, the weight is determined by the inner product of the estimated gradient vector and the vector from the missing pixel to the known data point. Simulations show that the proposed method has higher performance than other non-integer time scaling methods and is helpful for superresolution.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"63 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123187549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multiscale directional transforms based on cosine-sine modulated filter banks for sparse directional image representation
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282331
Yusuke Nomura, Ryutaro Ogawa, Seisuke Kyochi, Taizo Suzuki
This paper proposes multiscale directional transforms (MDTs) based on cosine-sine modulated filter banks (CSMFBs). Sparse image representation by directional transforms is essential for image analysis and processing tasks and has been studied extensively. Conventionally, CSMFBs have been proposed as a class of separable directional transforms (SepDTs). Their computational cost is much lower than that of non-SepDTs, and they can outperform other SepDTs, e.g., dual-tree complex wavelet transforms (DTCWTs), in image processing applications. One drawback of CSMFBs is their lack of multiscale directional selectivity: they cannot provide directional atoms at multiple scales as the DTCWT frame does, so flexible image representation cannot be achieved. In this work, we present a design method for multiscale CSMFBs by extending modulated lapped transforms, a subclass of CSMFBs, and confirm its effectiveness in nonlinear approximation and in image denoising as a practical application.
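As background, the following sketch constructs a textbook pair of cosine- and sine-modulated analysis filter banks from a single lowpass prototype; the paper's actual multiscale design, which extends modulated lapped transforms, is more elaborate, and the prototype choice and filter lengths below are assumptions.

```python
import numpy as np
from scipy.signal import firwin

def csm_filters(M=8, taps_per_band=16):
    """Cosine- and sine-modulated analysis filters derived from one
    lowpass prototype (standard textbook modulation formulas)."""
    N = M * taps_per_band                 # prototype filter length
    p = firwin(N, 1.0 / (2 * M))          # assumed lowpass prototype
    n = np.arange(N)
    phase = np.pi / M * (np.arange(M)[:, None] + 0.5) * (n - (N - 1) / 2)
    shift = ((-1) ** np.arange(M))[:, None] * np.pi / 4
    cos_bank = 2 * p * np.cos(phase + shift)   # (M, N) cosine branch
    sin_bank = 2 * p * np.sin(phase + shift)   # (M, N) sine branch
    return cos_bank, sin_bank
```

Filtering an image with the cosine and sine branches along rows and columns yields the separable directional subbands that the multiscale extension then organizes across scales.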
{"title":"Multiscale directional transforms based on cosine-sine modulated filter banks for sparse directional image representation","authors":"Yusuke Nomura, Ryutaro Ogawa, Seisuke Kyochi, Taizo Suzuki","doi":"10.1109/APSIPA.2017.8282331","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282331","url":null,"abstract":"This paper proposes multiscale directional transforms (MDTs) based on cosine-sine modulated filter banks (CSMFBs). Sparse image representation by directional transforms is necessary for image analysis and processing tasks and has been extensively studied. Conventionally, cosine-sine modulated filter banks (CSMFBs) have been proposed as one of separable directional transforms (SepDTs). Their computational cost is much lower than non-SepDTs, and they can work better than other SepDTs, e.g., dual-tree complex wavelet transforms (DTCWTs) in image processing applications. One drawback of CSMFBs is a lack of multiscale directional selectivity, i.e., it cannot provide multiple scale directional atoms as in the DTCWT frame, and thus flexible image representation cannot be achieved. In this work, we show a design method of multiscale CSMFBs by extending modulated lapped transforms, which are a subclass of CSMFBs. We confirm its effectiveness in nonlinear approximation and image denoising as a practical application.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134477401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classifying road surface conditions using vibration signals
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8281999
Lounell B. Gueta, Akiko Sato
The paper aims to classify road surface types and conditions by characterizing the temporal and spectral features of vibration signals gathered on roads. In the past, road-surface studies have focused on detecting anomalies such as bumps and potholes. This study extends the analysis to anomalies such as patches and road gaps, which, in terms of temporal features such as magnitude peaks and variance, resemble bumps and potholes. Therefore, a classification method based on a support vector classifier is proposed that takes into account both the temporal and spectral features of the road vibrations, as well as factors such as vehicle speed. It is tested on real data gathered in a smartphone-based collection campaign between Thailand and Cambodia and is shown to be effective in differentiating road segments with and without anomalies. The method can help target appropriate road maintenance work.
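A hedged sketch of such a pipeline (the exact feature list and kernel are assumptions; the abstract only summarizes the feature set):

```python
import numpy as np
from scipy.signal import welch
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def segment_features(acc, speed, fs=100):
    """Temporal + spectral features for one vibration segment."""
    f, psd = welch(acc, fs=fs, nperseg=min(256, len(acc)))
    return np.array([
        acc.max(), acc.min(), acc.var(),   # temporal: peaks, variance
        np.abs(acc).mean(),
        f[np.argmax(psd)], psd.sum(),      # spectral: dominant freq, power
        speed,                             # vehicle speed as a factor
    ])

def train(segments, speeds, labels, fs=100):
    """segments: accelerometer arrays; labels: anomaly / no-anomaly."""
    X = np.array([segment_features(a, v, fs)
                  for a, v in zip(segments, speeds)])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return clf.fit(X, labels)
```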
{"title":"Classifying road surface conditions using vibration signals","authors":"Lounell B. Gueta, Akiko Sato","doi":"10.1109/APSIPA.2017.8281999","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8281999","url":null,"abstract":"The paper aims to classify road surface types and conditions by characterizing the temporal and spectral features of vibration signals gathered from land roads. In the past, road surfaces have been studied for detecting road anomalies like bumps and potholes. This study extends the analysis to detect road anomalies such as patches and road gaps. In terms of temporal features such as magnitude peaks and variance, these anomalies have common features to road anomalies. Therefore, a classification method based on support vector classifier is proposed by taking into account both the temporal and spectral features of the road vibrations as well as factor such as vehicle speed. It is tested on a real data gathered by conducting a smart phone-based data collection between Thailand and Cambodia and is shown to be effective in differentiating road segments with and without anomalies. The method is applicable to undertaking appropriate road maintenance works.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131558193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Binaural beamforming with spatial cues preservation for hearing aids in real-life complex acoustic environments
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282250
Hala As’ad, M. Bouchard, A. H. Kamkar-Parsi
This work introduces novel binaural beamforming algorithms for hearing aids that offer a good trade-off between noise reduction and the preservation of binaural cues for different types of sources (directional interfering talkers, diffuse-like background noise). The proposed methods require no knowledge of the interfering talkers' directions or of the second-order statistics of the noise-only components. Different classification decisions are made in the time-frequency domain based on the power, the power difference, and the complex coherence of the available signals. Simulations are performed using signals recorded from multichannel binaural hearing aids to validate the performance of the proposed algorithms under different acoustic scenarios and microphone configurations. For the simulations in this paper, good knowledge of the target direction and propagation model is assumed; for hearing aids, this assumption is typically more realistic than assuming knowledge of the direction and propagation model of the interfering talkers. Performance is compared with that of other algorithms that do not require information on the directions or statistics of the interfering talkers and the background noise. The results indicate that the proposed algorithms can either provide nearly the same noise reduction as classical beamformers with improved preservation of the noise binaural cues, or produce a good trade-off between noise reduction and preservation of those cues.
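One of the quantities mentioned above, the complex coherence between the left and right ear signals, can be estimated per time-frequency bin with recursive smoothing. A minimal sketch follows; the smoothing constant and the thresholding rule are assumptions, not the paper's exact decision logic.

```python
import numpy as np

def complex_coherence(L, R, alpha=0.9):
    """Recursively smoothed complex coherence between left/right
    STFT signals L, R (complex arrays, frames x bins).
    |coherence| near 1 suggests a coherent (directional) source;
    low values suggest diffuse-like noise (assumed decision rule)."""
    Pll = Prr = Plr = 1e-10
    coh = np.zeros_like(L)
    for t in range(L.shape[0]):
        Pll = alpha * Pll + (1 - alpha) * np.abs(L[t]) ** 2
        Prr = alpha * Prr + (1 - alpha) * np.abs(R[t]) ** 2
        Plr = alpha * Plr + (1 - alpha) * L[t] * np.conj(R[t])
        coh[t] = Plr / np.sqrt(Pll * Prr)
    return coh
```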
{"title":"Binaural beamforming with spatial cues preservation for hearing aids in real-life complex acoustic environments","authors":"Hala As’ad, M. Bouchard, A. H. Kamkar-Parsi","doi":"10.1109/APSIPA.2017.8282250","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282250","url":null,"abstract":"This work is introducing novel binaural beamforming algorithms for hearing aids, with a good trade-off between noise reduction and the preservation of the binaural cues for different types of sources (directional interfering talker sources, diffuse-like background noise). In the proposed methods, no knowledge of the interfering talkers' direction or the second order statistics of the noise-only components is required. Different classification decisions are considered in the time- frequency domain based on the power, the power difference, and the complex coherence of different available signals. Simulations are performed using signals recorded from multichannel binaural hearing aids, to validate the performance of the proposed algorithms under different acoustic scenarios and using different microphone configurations. For the simulations performed in this paper, a good knowledge of the target direction and propagation model is assumed. For hearing aids, this assumption is typically more realistic than the assumption of knowing the direction and propagation model of the interferer talkers. The comparison of the performance results is done with other algorithms that don't require information on the directions or statistics of the interfering talker sources and the background noise. The results indicate that the proposed algorithms can either provide nearly the same noise reduction as classical beamformers but with improved noise binaural cues preservation, or they can produce a good trade-off between noise reduction and noise binaural cues preservation.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134195973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online sound structure analysis based on generative model of acoustic feature sequences
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282236
Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita
We propose a method for online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, which assumes a hierarchical generative process of sound clips, acoustic topics, acoustic words, and acoustic features. In this model, sound clips are assumed to be organized as combinations of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of computation, it is difficult to analyze massive amounts of sound data; moreover, the batch algorithm cannot handle sequentially obtained data. Our variational-Bayes-based online algorithm for this generative model can analyze the structure of sounds clip by clip. Experimental results show that the proposed online algorithm reduces the computation cost by about 90% while estimating the posterior distributions as well as the conventional batch algorithm.
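As an illustration of the online idea, the sketch below performs one stochastic variational-style update per sound clip, blending the global GMM sufficient statistics with clip-local ones using a decaying step size (in the spirit of stochastic variational inference; the paper's full acoustic-topic model has additional latent layers, and all names here are assumptions).

```python
import numpy as np

def online_vb_step(stats, clip_features, resp_fn, t, tau0=64, kappa=0.7):
    """One online update per sound clip.

    stats         : dict with global statistics "N" (K,) and "x" (K, D).
    clip_features : (frames, D) acoustic features of the new clip.
    resp_fn       : returns (frames, K) responsibilities under the
                    current posterior (hypothetical helper).
    """
    rho = (tau0 + t) ** (-kappa)      # step size decays over clips
    r = resp_fn(clip_features)
    Nk = r.sum(axis=0)                # clip-local counts per component
    xk = r.T @ clip_features          # clip-local weighted feature sums
    # Blend old and new statistics instead of re-sweeping all clips.
    stats["N"] = (1 - rho) * stats["N"] + rho * Nk
    stats["x"] = (1 - rho) * stats["x"] + rho * xk
    return stats
```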
{"title":"Online sound structure analysis based on generative model of acoustic feature sequences","authors":"Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita","doi":"10.1109/APSIPA.2017.8282236","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282236","url":null,"abstract":"We propose a method for the online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, with which the hierarchical generative process of the sound clip, acoustic topic, acoustic word, and acoustic feature is assumed. In this model, it is assumed that sound clips are organized based on the combination of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of calculation, it is difficult to analyze the massive amount of sound data. Moreover, the batch algorithm does not allow us to analyze the sequentially obtained data. Our variational Bayes-based online algorithm for this generative model can analyze the structure of sounds sound clip by sound clip. The experimental results show that the proposed online algorithm can reduce the calculation cost by about 90% and estimate the posterior distributions as efficiently as the conventional batch algorithm.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132423297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Personality trait perception from speech signals using multiresolution analysis and convolutional neural networks
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282287
Ming-Hsiang Su, Chung-Hsien Wu, Kun-Yi Huang, Qian-Bei Hong, H. Wang
This study presents an approach to personality trait (PT) perception from speech signals using wavelet-based multiresolution analysis and convolutional neural networks (CNNs). First, the wavelet transform is employed to decompose the speech signals into signals at different levels of resolution. Then, acoustic features of the speech signals are extracted at each resolution. Given the acoustic features, a CNN is adopted to generate profiles of the Big Five Inventory-10 (BFI-10), which provide a quantitative measure of the degree of presence or absence of a set of 10 basic BFI items. The BFI-10 profiles are then fed into five artificial neural networks (ANNs), one for each of the five personality dimensions: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. To evaluate the proposed method, experiments were conducted on the SSPNet Speaker Personality Corpus (SPC), comprising 640 clips randomly extracted from French news bulletins, used in the INTERSPEECH 2012 speaker trait sub-challenge. An average PT perception accuracy of 71.97% was obtained, outperforming both the ANN-based method and the baseline of the INTERSPEECH 2012 speaker trait sub-challenge.
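A minimal sketch of the multiresolution front end, assuming a discrete wavelet decomposition with PyWavelets and simple per-level frame statistics; the CNN and BFI-10 profile stages described above are omitted, and the parameter choices are illustrative.

```python
import numpy as np
import pywt

def multiresolution_features(signal, wavelet="db4", levels=4,
                             frame=512, hop=256):
    """Decompose a speech signal into wavelet levels and extract
    frame-wise energy and variance per level (assumed feature set)."""
    coeffs = pywt.wavedec(signal, wavelet, level=levels)
    feats = []
    for c in coeffs:                         # approximation + details
        starts = range(0, max(len(c) - frame, 1), hop)
        frames = [c[i:i + frame] for i in starts]
        feats.append([(np.mean(f ** 2), np.var(f)) for f in frames])
    return feats
```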
Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282219
Patrick Lumban Tobing, H. Kameoka, T. Toda
This paper presents a novel implementation of latent trajectory modeling in a deep acoustic-to-articulatory inversion mapping framework. In conventional methods, i.e., Gaussian mixture model (GMM)- and deep neural network (DNN)-based inversion mappings, frame interdependency can be considered while generating articulatory parameter trajectories through an explicit constraint between static and dynamic features. However, this constraint is not considered in training these models, so the trained model is not optimal for the mapping procedure. In this paper, we address this problem by introducing latent trajectory modeling into the DNN-based inversion mapping. In the latent trajectory model, frame interdependency is well considered in both training and mapping through a soft constraint between static and dynamic features. Experimental results demonstrate that the proposed latent trajectory DNN (LTDNN)-based inversion mapping outperforms conventional and state-of-the-art inversion mapping systems.
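The explicit static/dynamic constraint referred to above is the classical maximum-likelihood parameter generation (MLPG) step. A one-dimensional sketch, assuming a simple delta window of 0.5·(c[t+1] − c[t−1]), is shown below; this is standard MLPG, not the proposed LTDNN itself.

```python
import numpy as np

def mlpg_1d(mu_static, mu_delta, var_static, var_delta):
    """Find the static trajectory c whose [static; delta] expansion
    best matches the predicted means under the predicted variances."""
    T = len(mu_static)
    I = np.eye(T)
    # Delta window matrix: row t has +0.5 at t+1 and -0.5 at t-1.
    D = 0.5 * (np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1))
    D[0], D[-1] = 0, 0                 # remove wrap-around at edges
    W = np.vstack([I, D])              # maps c to [static; delta]
    mu = np.concatenate([mu_static, mu_delta])
    prec = np.concatenate([1 / var_static, 1 / var_delta])
    # Weighted least squares: (W' P W) c = W' P mu.
    A = W.T @ (prec[:, None] * W)
    b = W.T @ (prec * mu)
    return np.linalg.solve(A, b)
```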
{"title":"Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling","authors":"Patrick Lumban Tobing, H. Kameoka, T. Toda","doi":"10.1109/APSIPA.2017.8282219","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282219","url":null,"abstract":"This paper presents a novel implementation of latent trajectory modeling in a deep acoustic-to-articulatory inversion mapping framework. In the conventional methods, i.e., the Gaussian mixture model (GMM)- and the deep neural network (DNN)- based inversion mappings, the frame interdependency can be considered while generating articulatory parameter trajectories with the use of an explicit constraint between static and dynamic features. However, in training these models, such a constraint is not considered, and therefore, the trained model is not optimum for the mapping procedure. In this paper, we address this problem by introducing a latent trajectory modeling into the DNN-based inversion mapping. In the latent trajectory model, the frame interdependency can be well considered, in both training and mapping, by using a soft-constraint between static and dynamic features. The experimental results demonstrate that the proposed latent trajectory DNN (LTDNN)-based inversion mapping outperforms the conventional and the state-of-the-art inversion mapping systems.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117268618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding multiple-input multiple-output active noise control from a perspective of sampling and reconstruction
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282013
Chuang Shi, Huiyong Li, Dongyuan Shi, Bhan Lam, W. Gan
This paper formulates multiple-input multiple-output active noise control as a spatial sampling and reconstruction problem. In the proposed formulation, the inputs from the reference microphones and the outputs of the anti-noise sources are regarded as spatial samples. We show that the formulation is general and unifies the existing control strategies. As examples, three control strategies are derived from the proposed formulation and linked to different cost functions in practical implementations. Finally, simulation results verify the effectiveness of our analysis.
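For concreteness, a skeletal multichannel filtered-x LMS controller is sketched below, since it is the generic baseline that such formulations specialize; the array shapes and step size are assumptions, and this is not the paper's derivation.

```python
import numpy as np

class MimoFxlms:
    """Minimal filtered-x LMS for J references, K secondary sources,
    and M error microphones (textbook multichannel FxLMS)."""

    def __init__(self, J, K, M, L, mu=1e-4):
        self.W = np.zeros((K, J, L))   # K x J control filters, length L
        self.mu = mu

    def update(self, x_filt, e):
        """x_filt: (M, K, J, L) reference blocks filtered through the
        secondary-path estimates; e: (M,) current error samples.
        Gradient step on the summed squared errors."""
        for m in range(e.shape[0]):
            self.W -= self.mu * e[m] * x_filt[m]
        return self.W
```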
{"title":"Understanding multiple-input multiple-output active noise control from a perspective of sampling and reconstruction","authors":"Chuang Shi, Huiyong Li, Dongyuan Shi, Bhan Lam, W. Gan","doi":"10.1109/APSIPA.2017.8282013","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282013","url":null,"abstract":"This paper formulates the multiple-input multiple- output active noise control as a spatial sampling and reconstruction problem. With the proposed formulation, the inputs from the reference microphones and the outputs of the antinoise sources are regarded as spatial samples. We show that the proposed formulation is general and can unify the existing control strategies. Three control strategies, for instance, are derived from the proposed formulation and linked to different cost functions in the practical implementation. Finally, simulation results are presented to verify the effectiveness of our analysis.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114895755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A drag-and-drop type human computer interaction technique based on electrooculogram
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282126
S. Ogai, Toshihisa Tanaka
A fundamental limitation of human-computer interaction using the electrooculogram (EOG) is the low accuracy of eye tracking and the fact that head movement invalidates the calibration of on-monitor gaze coordinates. In this paper, we develop a drag-and-drop interface based on the EOG that avoids direct estimation of gaze location and frees users from restrictions on head movement. To drag a cursor on the screen, the proposed system models the relationship between the amount of eye movement and the EOG amplitude with linear regression. Five subjects participated in an experiment comparing the proposed drag-and-drop interface with a conventional direct-gaze interface. Performance measures such as efficiency and satisfaction showed a significant advantage for the proposed method (p < 0.05).
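A minimal sketch of the calibration step, assuming a per-axis least-squares fit of cursor displacement against EOG amplitude (variable names are illustrative):

```python
import numpy as np

def calibrate_eog(eog_amplitudes, displacements):
    """Fit displacement (pixels) = gain * EOG amplitude + offset
    from a short calibration session, per axis."""
    A = np.column_stack([eog_amplitudes, np.ones(len(eog_amplitudes))])
    (gain, offset), *_ = np.linalg.lstsq(A, displacements, rcond=None)
    return gain, offset

# During use the cursor is moved relatively:
#   cursor += gain * measured_eog + offset
# so no absolute gaze position (and no fixed head pose) is required.
```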
{"title":"A drag-and-drop type human computer interaction technique based on electrooculogram","authors":"S. Ogai, Toshihisa Tanaka","doi":"10.1109/APSIPA.2017.8282126","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282126","url":null,"abstract":"A fundamental limitation of human-computer interaction using electrooculogram (EOG) is a low accuracy of eye tracking performance and the head movement that violates the calibration of the on-monitor gaze coordinates. In this paper, we develop a drag-and-drop type interface with the EOG that can avoid a direct estimation of gaze location and can make users free from the restriction of head movement. To drag a cursor on the screen, the proposed system models the relationship between the amount of eye movement and the EOG amplitude with linear regression. Five subjects participated in the experiment to compare the proposed drag-and-drop type and the conventional direct gaze type interfaces. Performance measures such as efficiency and satisfaction showed the advantage of the proposed method with significant differences (p < 0.05).","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"260 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116113181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hybrid EEG-NIRS brain-computer interface under eyes-closed condition
Pub Date: 2017-12-01 | DOI: 10.1109/APSIPA.2017.8282127
Jaeyoung Shin, K. Müller, Han-Jeong Hwang
In this study, we propose a hybrid BCI combining electroencephalography (EEG) and near-infrared spectroscopy (NIRS) that could potentially be operated in the eyes-closed condition by paralyzed patients with oculomotor dysfunctions. In the experiment, seven healthy participants performed mental subtraction and stayed relaxed (baseline state) while EEG and NIRS data were simultaneously measured. To evaluate the feasibility of the hybrid BCI, we classified the frontal brain activities induced by mental subtraction and the baseline state, and compared the classification accuracies of the unimodal EEG and NIRS BCIs with that of the hybrid BCI. The hybrid BCI (85.54 ± 8.59%) showed significantly higher classification accuracy than the unimodal EEG (80.77 ± 11.15%) and NIRS (77.12 ± 7.63%) BCIs (Wilcoxon signed-rank test, Bonferroni-corrected p < 0.05). The results demonstrate that our eyes-closed hybrid BCI approach could potentially be applied to neurodegenerative patients whose impaired motor functions are accompanied by declining visual function.
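A hedged sketch of feature-level fusion for such an evaluation, assuming concatenated EEG and NIRS features with a shrinkage-LDA classifier (the abstract does not specify the classifier, so this choice is an assumption):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def hybrid_accuracy(eeg_feats, nirs_feats, labels, cv=5):
    """Classify mental subtraction vs. baseline from concatenated
    EEG + NIRS feature vectors (one row per trial)."""
    X = np.hstack([eeg_feats, nirs_feats])
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    return cross_val_score(clf, X, labels, cv=cv).mean()

# Paired comparison across subjects, as reported in the abstract:
#   scipy.stats.wilcoxon(hybrid_accs, eeg_accs)
# gives the p-value before Bonferroni correction.
```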
{"title":"Hybrid EEG-NIRS brain-computer interface under eyes-closed condition","authors":"Jaeyoung Shin, K. Müller, Han-Jeong Hwang","doi":"10.1109/APSIPA.2017.8282127","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282127","url":null,"abstract":"In this study, we propose a hybrid BCI combining electroencephalography (EEG) and near-infrared spectroscopy (NIRS) that can be potentially operated in eyes-closed condition for paralyzed patients with oculomotor dysfunctions. In the experiment, seven healthy participants performed mental subtraction and stayed relaxed (baseline state), during which EEG and NIRS data were simultaneously measured. To evaluate the feasibility of the hybrid BCI, we classified frontal brain activities inducted by mental subtraction and baseline state, and compared classification accuracies obtained using unimodal EEG and NIRS BCI and the hybrid BCI. As a result, the hybrid BCI (85.54 % ± 8.59) showed significantly higher classification accuracy than those of unimodal EEG (80.77 % ± 11.15) and NIRS BCI (77.12 % ± 7.63) (Wilcoxon signed rank test, Bonferroni corrected p < 0.05). The result demonstrated that our eyes-closed hybrid BCI approach could be potentially applied to neurodegenerative patients with impaired motor functions accompanied by a decline of visual functions.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116137384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}