Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820787
Byoungkyun Kim, Byeongho Choi, Youngbae Hwang
This paper presents a scalable multiple-GPU architecture for super multi-view (SMV) synthesis using multi-view video plus depth (MVD) data. SMV synthesis is essential for generating 3D content for SMV 3D displays with over a hundred views. A recently released SMV 3D display supports 108 viewpoints and shows a multiplexed result with a small viewing interval. Hence, more than a hundred intermediate views must be synthesized for each pair of cameras in a multi-camera system. Synthesizing more than a hundred high-resolution views, however, requires massive data processing, which grows linearly with the number of synthesized views. In this paper, we propose a real-time SMV synthesis method using multiple GPUs. GPU scalability can be exploited to reduce the processing time of view synthesis without any changes to the kernel function. We evaluate the proposed method by synthesizing 180 intermediate views from 18 input HD images while varying the number of GPUs, and show that the 180 intermediate views can be synthesized in real time using 4 GPUs.
Title: Scalable multiple GPU architecture for super multi-view synthesis using MVD
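The scaling idea can be sketched in a few lines: since the same synthesis kernel runs unchanged on every device, multi-GPU speedup reduces to partitioning the view indices across devices. The sketch below is illustrative only (the function name and the contiguous-slice policy are assumptions, not the authors' code).

```python
# Illustrative sketch (not the paper's implementation): split the
# intermediate-view workload into contiguous index ranges, one per GPU,
# so each device runs the same synthesis kernel on its own slice.

def partition_views(num_views, num_gpus):
    """Return a list of (start, end) half-open view-index ranges, one per GPU.

    Views are split as evenly as possible; the first (num_views % num_gpus)
    GPUs receive one extra view.
    """
    base, extra = divmod(num_views, num_gpus)
    ranges = []
    start = 0
    for g in range(num_gpus):
        count = base + (1 if g < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges
```

For the paper's evaluation setting, `partition_views(180, 4)` assigns 45 views to each of the 4 GPUs.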
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820906
Jie Chen, Lap-Pui Chau, He Li
Rich information can be extracted from high-dimensional light field (LF) data, and one of the most fundamental outputs is scene depth. State-of-the-art depth estimation methods produce noisy results, especially over texture-less regions. Based on superpixel segmentation, we propose to incorporate multi-level disparity information into a Bayesian particle filtering framework. Both per-pixel and regional information are used to produce maximum a posteriori (MAP) predictions under the proposed statistical model. The method produces scene depth interpolation results that match or surpass those of several state-of-the-art methods, with potential in image processing applications such as scene alignment and stabilization.
Title: Light field depth from multi-scale particle filtering
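The Bayesian particle filtering and MAP steps can be illustrated with a minimal scalar version: particles are disparity hypotheses, each noisy multi-level disparity observation reweights them through a Gaussian likelihood, and the MAP estimate is the highest-weight particle. All parameters below (particle count, likelihood sigma) are illustrative assumptions, not the paper's model.

```python
# Hedged sketch of particle-filter MAP estimation (not the authors'
# implementation): estimate one superpixel's disparity from noisy
# observations by reweighting random disparity hypotheses.

import math
import random

def particle_filter_map(observations, lo, hi, n_particles=500, sigma=0.5, seed=0):
    """Estimate a scalar disparity in [lo, hi] from noisy observations.

    Weights accumulate a Gaussian likelihood per observation; the MAP
    estimate is the particle with the highest posterior weight.
    """
    rng = random.Random(seed)
    particles = [rng.uniform(lo, hi) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for z in observations:
        weights = [w * math.exp(-((p - z) ** 2) / (2 * sigma ** 2))
                   for p, w in zip(particles, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    best = max(range(n_particles), key=lambda i: weights[i])
    return particles[best]
```

With observations clustered near a true disparity, the returned particle lands close to it; the paper's full method applies this idea per superpixel with multi-level disparity cues.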
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820806
Yanting Chen, Yu Chen, Jin Zhang, Ju Zhang, Hua Lin, Jianguo Wei, J. Dang
The present study examined the citation patterns of Mandarin tones in prelingually deaf adults with cochlear implants or hearing aids. The results showed that the participants tried to build up tonal patterns by exploiting phonetic features such as creaky voice and tonal duration. The results also indicated that although the participants had problems distinguishing T2 from T3, T2 was harder for them than T3. In fact, T2 was the hardest of all Mandarin tones for these prelingually deaf adults.
Title: Mandarin citation tone patterns of prelingual Chinese deaf adults
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820913
Jan Kristanto Wibisono, H. Hang
The goal of this research is to fuse color and depth information to generate good image segmentation. Image segmentation has been studied for several decades, but only recently has the use of depth data become popular, owing to the widespread availability of affordable depth cameras such as the Microsoft Kinect. The availability of depth information opens up new opportunities for image segmentation. Many color image segmentation methods have been developed over the years, whereas papers on segmentation using both depth and color information have appeared only recently. In this research, we focus on how to combine depth and color information to improve state-of-the-art color image segmentation methods. We adopt a few existing schemes and fuse their outputs to produce the final result, exploiting planar information to improve the color segmentation. The results are quite satisfactory in terms of both human perception and objective measures.
Title: Fusion of color and depth information for image segmentation
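One simple way to fuse the outputs of separate color and depth segmenters, sketched below, is to intersect their label maps so that a pixel pair shares a fused segment only when both cues agree. This is an illustrative assumption about the fusion step, not the paper's actual scheme.

```python
# Illustrative fusion sketch (details are assumptions, not the paper's
# algorithm): intersect a color-based label map with a depth-plane label
# map by giving each distinct (color, depth) label pair its own segment id.

def fuse_segmentations(color_labels, depth_labels):
    """Given two per-pixel label sequences of equal length, return fused
    labels where segments split wherever either cue splits."""
    pair_to_id = {}
    fused = []
    for pair in zip(color_labels, depth_labels):
        if pair not in pair_to_id:
            pair_to_id[pair] = len(pair_to_id)
        fused.append(pair_to_id[pair])
    return fused
```

For example, a color segment that spans two depth planes is split into two fused segments, which is how planar depth information can correct an under-segmented color result.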
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820853
Hyukzae Lee, Changick Kim
We propose a simple yet effective blur kernel re-initialization method in a coarse-to-fine framework for blind image deblurring. The proposed method is motivated by the observation that most deblurring algorithms use only the blur kernel estimated at the coarser level to initialize the blur kernel for the next finer level. Based on this observation, we design an objective function that exploits both the blur kernel and the latent image estimated at the coarser level to produce an initial blur kernel for the finer level. Experimental results demonstrate that the proposed algorithm improves the performance of existing deblurring algorithms in terms of accuracy and success rate.
Title: Blur kernel re-initialization for blind image deblurring
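The baseline hand-off that the paper improves upon can be sketched concretely: the coarse-level kernel is upsampled to the finer level's resolution and renormalized to sum to one. The paper's contribution is to replace this naive step with an objective that also uses the coarse latent image; the sketch below shows only the naive baseline, with nearest-neighbor upsampling as an illustrative choice.

```python
# Sketch of the conventional coarse-to-fine kernel hand-off (the baseline
# the paper improves on): nearest-neighbor upsample the coarse blur kernel
# and renormalize so it remains a valid (sum-to-one) kernel.

def upsample_kernel(kernel, factor=2):
    """Upsample a 2D kernel (list of lists) by an integer factor and
    renormalize its entries to sum to 1."""
    up = []
    for row in kernel:
        wide = [v for v in row for _ in range(factor)]
        up.extend([wide[:] for _ in range(factor)])
    total = sum(sum(r) for r in up)
    return [[v / total for v in r] for r in up]
```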
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820793
T. Samanchuen
Wireless sensor networks (WSNs) are designed for monitoring environments that are difficult to access. Each node has a limited energy supply that typically cannot be replaced or recharged, so every component of a WSN, hardware and software alike, must be energy efficient. An energy-efficient routing protocol can prolong the network lifetime. This work addresses reactive WSNs and proposes a protocol that uses static clustering with cluster head selection based on maximum residual energy. Simulations demonstrate the performance of the proposed protocol and show that it prolongs the network lifetime better than conventional protocols.
Title: An energy efficient routing protocol with stable cluster head for reactive wireless sensor networks
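The selection rule itself is simple enough to state in code: within each static cluster, the node with the maximum residual energy becomes the cluster head for the next round. The data layout below (cluster and node ids, energies in joules) is an illustrative assumption.

```python
# Minimal sketch of the proposed cluster-head rule (node ids and the
# energy bookkeeping are illustrative): in each static cluster, elect the
# node with maximum residual energy as cluster head.

def select_cluster_heads(clusters):
    """clusters: dict mapping cluster id -> {node id: residual energy (J)}.

    Returns a dict mapping cluster id -> elected head node id.
    """
    return {cid: max(nodes, key=nodes.get) for cid, nodes in clusters.items()}
```

Re-running this each round keeps the head role on whichever node currently has the most energy, which spreads the head's extra transmission load and stabilizes the cluster lifetime.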
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820838
Min-Koo Kang, Sung-Kyu Kim
Visual discomfort (VD) is inevitable as long as stereoscopy is used in 3D displays, and there is a trade-off between depth impression and visual comfort. For this reason, technologies that control depth impression while accounting for VD perception have attracted great interest from researchers. However, VD perception varies significantly with personal as well as environmental factors, and evaluating it still requires time-consuming viewing tests. We propose a simple and reliable method that calibrates an individual's stereo acuity, binocular fusion limits, and depth perception preferences. In the experiment, four non-expert viewers participated under identical viewing conditions. The results confirmed that the calibrated features of human binocular vision coincide with the literature, apart from slight variations among participants. The proposed method can be utilized across the whole 3D video technology chain, from capture to display.
Title: Personal binocular vision calibration using layered random dot stereogram
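The stimulus behind the method, a random dot stereogram (RDS), has a standard construction: the right view copies the left view's random dots and shifts a patch horizontally by the test disparity, so the patch appears at a different depth only when the two views are fused. The sketch below shows this basic single-layer construction; the paper's layered stimulus and its exact parameters are not reproduced here.

```python
# Illustrative RDS construction (parameters are assumptions, not the
# paper's stimulus spec): the right image duplicates the left image's
# random dots, with a central patch shifted left by `disparity` pixels.

import random

def make_rds(width, height, patch, disparity, seed=0):
    """Return (left, right) binary dot images as lists of lists.

    patch = (x, y, w, h): region whose dots are displaced by `disparity`
    pixels in the right image, creating perceived depth when fused.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    x, y, w, h = patch
    for r in range(y, y + h):
        for c in range(x, x + w):
            right[r][c - disparity] = left[r][c]
    return left, right
```

Sweeping `disparity` while asking the viewer whether the patch still fuses is, in essence, how fusion limits and stereo acuity can be probed.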
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820845
Jaehak Choi, Youngseop Kim
Internet of Things (IoT) devices are limited in resources such as CPU and memory. The LEA (Lightweight Encryption Algorithm) was standardized in Korea in 2013 as an encryption algorithm suitable for IoT devices. However, LEA is vulnerable to side-channel analysis attacks that exploit power consumption. To mitigate this vulnerability, masking techniques are commonly used, but masking increases execution time, sacrificing the speed and lightweight character of the algorithm. This paper proposes a new, faster LEA variant as a countermeasure to side-channel attacks. The proposed algorithm is about 17 times faster than existing algorithms that use masking to prevent differential side-channel attacks.
Title: An improved LEA block encryption algorithm to prevent side-channel attack in the IoT system
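The masking countermeasure the paper works around can be sketched generically: a secret word is split into random shares so that no single intermediate value correlates with the secret, and linear operations such as key XORs are applied to one share only. This is a generic first-order Boolean masking illustration, not LEA's masked implementation; masking LEA's 32-bit modular additions is the expensive part the paper targets and is not shown here.

```python
# Generic first-order Boolean masking sketch (not the paper's scheme):
# value == s1 ^ s2 at all times, so power traces of either share alone
# leak nothing about the secret value.

import random

def mask(value, rng, bits=32):
    """Split `value` into two random Boolean shares (s1, s2)."""
    s1 = rng.getrandbits(bits)
    return s1, s1 ^ value

def masked_xor_key(shares, key):
    """XOR a round key into a masked value by updating one share only."""
    s1, s2 = shares
    return s1, s2 ^ key

def unmask(shares):
    """Recombine the shares to recover the plain value."""
    s1, s2 = shares
    return s1 ^ s2
```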
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820708
Sih-Huei Chen, Jia-Ching Wang, Wen-Chi Hsieh, Yu-Hao Chin, Chin-Wen Ho, Chung-Hsien Wu
Given the increasing attention paid to speech emotion classification in recent years, this work presents a novel speech emotion classification approach based on a multiple kernel Gaussian process. Two major aspects of a classification problem that play an important role in accuracy are addressed: feature extraction and classification. Prosodic features and other features widely used in sound effect classification are selected, and a semi-nonnegative matrix factorization algorithm is applied to them to extract further information. Following feature extraction, a multiple kernel Gaussian process (GP) is used for classification, in which two notions of similarity in the data are combined through a linear kernel and a radial basis function (RBF) kernel. According to our results, the proposed approach achieves an accuracy of 77.74%, and a comparison shows that it outperforms the other approaches evaluated.
Title: Speech emotion classification using multiple kernel Gaussian process
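The kernel combination described above amounts to a weighted sum of the two base kernels. A minimal sketch follows; the mixing weights and the RBF bandwidth are illustrative hyperparameters, not values from the paper.

```python
# Sketch of a multiple-kernel similarity (hyperparameters are illustrative):
# a weighted sum of a linear kernel and an RBF kernel, as would be plugged
# into a Gaussian process classifier.

import math

def linear_kernel(x, y):
    """Dot-product similarity."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian similarity: exp(-gamma * ||x - y||^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def multi_kernel(x, y, w_lin=0.5, w_rbf=0.5, gamma=0.5):
    """Weighted combination of the linear and RBF kernels."""
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y, gamma)
```

The linear term captures global, magnitude-sensitive similarity while the RBF term captures local, distance-based similarity, which is the motivation for combining the two notions.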
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820683
Y. Ji, Xiangeng Bu, Jinwei Sun, Zhiyong Liu
To establish a more reliable and robust EEG model for sleep staging, a reasonable choice of modeling parameters is necessary. Feature selection chooses a subset of d features from a set of D features according to an optimization criterion, providing the most informative inputs for classification. In the present study, an improved simulated annealing genetic algorithm (ISAGA) is proposed. Twenty-five feature parameters were extracted from sleep EEG recordings in the MIT-BIH polysomnography database. The feature selection results demonstrate that ISAGA achieves higher classification accuracy with fewer features than the correlation coefficient algorithm (CCA), the genetic algorithm (GA), the adaptive genetic algorithm (AGA), and the simulated annealing genetic algorithm (SAGA). Compared to using all features for sleep staging, the classification accuracy of ISAGA with the optimal feature subset is about 92.00%, an improvement of about 4.83%.
Title: An improved simulated annealing genetic algorithm of EEG feature selection in sleep stage
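The core mechanic shared by SAGA-style methods can be sketched as a search over feature subsets in which mutations that worsen the fitness are still accepted with a temperature-dependent probability, letting the search escape local optima. The toy fitness, cooling schedule, and single-bit mutation below are placeholders, not the paper's ISAGA or its EEG classifier.

```python
# Toy sketch of simulated-annealing-style feature-subset search (the
# fitness function and schedule are placeholders, not the paper's ISAGA):
# mutate one feature in or out per step; always accept improvements and
# sometimes accept regressions, with probability exp(delta / temperature).

import math
import random

def sa_feature_search(n_features, fitness, iters=200, t0=1.0, cool=0.98, seed=0):
    """Return the best feature subset (a frozenset of indices) found."""
    rng = random.Random(seed)
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    cur_f = fitness(current)
    best, best_f, temp = current, cur_f, t0
    for _ in range(iters):
        i = rng.randrange(n_features)
        cand = current ^ frozenset([i])  # flip feature i in or out
        cand_f = fitness(cand)
        # Simulated-annealing acceptance rule.
        if cand_f >= cur_f or rng.random() < math.exp((cand_f - cur_f) / temp):
            current, cur_f = cand, cand_f
        if cur_f > best_f:
            best, best_f = current, cur_f
        temp *= cool  # geometric cooling
    return best
```

With a toy fitness that rewards two target features and penalizes subset size, the search converges to a small subset containing the targets, mirroring ISAGA's goal of higher accuracy with fewer features.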