Pub Date: 2013-10-16 DOI: 10.1109/IIH-MSP.2013.120
Keizo Kato, A. Ito
In the contemporary music scene, the death growl and screaming voice are often used in extreme metal and have become indispensable singing styles. In this study, we attempt to clarify the acoustic features of the death growl and screaming voice. We chose jitter, shimmer, and harmonics-to-noise ratio (HNR) as the acoustic features and found that the death growl and screaming voice have much larger jitter and shimmer and a lower HNR than the normal voice. Next, we investigated the relationship between subjective impression and the acoustic features and found that the screaming voice has an optimum jitter.
Acoustic Features and Auditory Impressions of Death Growl and Screaming Voice. Keizo Kato, A. Ito. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.120 (https://doi.org/10.1109/IIH-MSP.2013.120)
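The abstract of the death growl study names jitter, shimmer, and HNR but does not define them. The sketch below uses the common "local" definitions (mean absolute cycle-to-cycle difference divided by the mean) and the autocorrelation-based HNR formula; these definitions, the function names, and the sample values are assumptions for illustration, not the authors' procedure.

```python
import math


def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean.

    Applied to glottal cycle lengths this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def hnr_db(r):
    """HNR in dB from the normalized autocorrelation peak r at the pitch lag."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return 10.0 * math.log10(r / (1.0 - r))


# Hypothetical cycle lengths in seconds, invented for this example.
steady_periods = [0.0100, 0.0101, 0.0100, 0.0099, 0.0100]
rough_periods = [0.0100, 0.0120, 0.0090, 0.0125, 0.0095]

if __name__ == "__main__":
    print("jitter (steady):", local_perturbation(steady_periods))
    print("jitter (rough): ", local_perturbation(rough_periods))
    print("HNR at r = 0.9: ", hnr_db(0.9), "dB")
```

Under these definitions an irregular voice gives larger jitter and shimmer and a smaller autocorrelation peak, hence a lower HNR, which matches the direction of the differences reported in the abstract.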
J. Treviño, T. Okamoto, Y. Iwaya, Junfeng Li, Yôiti Suzuki
Ambisonics, a sound field reproduction technique, can present spatial audio with high accuracy. However, it has not been widely adopted because of the hardware requirements it imposes. As a consequence, very little Ambisonics-encoded content is available. End users will find it more attractive to invest in Ambisonics systems if these can be made backwards compatible with existing recordings. In this paper, a technique to spatially extrapolate stereo sources for listening on Ambisonics reproduction systems is introduced. The proposed approach makes use of the spatial information already encoded in most modern stereo recordings. This spatial information is usually used to decode multichannel audio signals, such as 5.1-channel surround. Unlike these technologies, the proposal attempts to extract a continuous description of the 360-degree horizontal sound field that would result in the original stereo recording. The output is encoded using the circular harmonic functions, making it suitable for reproduction using horizontally distributed loudspeaker arrays.
Extrapolation of Horizontal Ambisonics Data from Mainstream Stereo Sources. J. Treviño, T. Okamoto, Y. Iwaya, Junfeng Li, Yôiti Suzuki. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.83 (https://doi.org/10.1109/IIH-MSP.2013.83)
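The Ambisonics abstract says the output is encoded with circular harmonic functions. As a minimal sketch of what such an encoding looks like, the code below encodes one sample of a single source at a known azimuth into horizontal Ambisonics coefficients up to a given order. The unnormalized cos/sin convention and the function name are assumptions; the paper's method for estimating the sound field from a stereo recording is not reproduced here.

```python
import math


def encode_circular_harmonics(sample, azimuth_rad, order):
    """Encode one sample of a source at azimuth_rad up to the given order.

    Returns 2 * order + 1 coefficients: the order-0 term followed by a
    (cos, sin) pair for each order m = 1..order. Normalization conventions
    differ between Ambisonics formats; none is applied here (assumption).
    """
    coeffs = [sample]
    for m in range(1, order + 1):
        coeffs.append(sample * math.cos(m * azimuth_rad))
        coeffs.append(sample * math.sin(m * azimuth_rad))
    return coeffs


if __name__ == "__main__":
    # A unit-amplitude source 30 degrees to the left, second order.
    print(encode_circular_harmonics(1.0, math.radians(30.0), 2))
```

A higher order adds more coefficient pairs and therefore a sharper angular description of the horizontal sound field.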
Pub Date: 2013-10-16 DOI: 10.1109/IIH-MSP.2013.101
Peng Li, Q. Kong, Yanpeng Ma
A visual secret sharing (VSS) scheme encodes a secret image into n shares such that stacking any k (k ≤ n) or more shares visually reveals the secret image. Random grid is a novel type of VSS technique with the advantage that shares are not expanded in size. In this paper, we integrate the techniques of random grid and probabilistic visual cryptography and propose a random grid-based probabilistic VSS scheme. Only k of the n shares are generated by random grids, while the other n-k shares are generated by a probabilistic visual cryptography technique. The experimental results show that the proposed scheme performs well, especially when all shares are stacked.
Probabilistic Visual Secret Sharing Scheme Based on Random Grids. Peng Li, Q. Kong, Yanpeng Ma. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.101 (https://doi.org/10.1109/IIH-MSP.2013.101)
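To illustrate why random grids give shares with no size expansion, here is a minimal sketch of the classic (2, 2) random grid construction, in which each share has exactly the dimensions of the secret. This is a swapped-in textbook scheme for illustration only; it is not the (k, n) probabilistic scheme proposed in the paper, and the function names are hypothetical.

```python
import random


def random_grid_2of2(secret, rng=None):
    """Split a binary image (0 = white, 1 = black) into two random grids.

    Share 1 is uniformly random. Share 2 copies share 1 where the secret
    is white and inverts it where the secret is black.
    """
    rng = rng or random.Random()
    share1 = [[rng.randint(0, 1) for _ in row] for row in secret]
    share2 = [
        [p1 if s == 0 else 1 - p1 for s, p1 in zip(row, row1)]
        for row, row1 in zip(secret, share1)
    ]
    return share1, share2


def stack(a, b):
    """Stacking transparencies: a pixel is black if it is black in either share."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


if __name__ == "__main__":
    secret = [[0, 1, 1, 0], [1, 1, 0, 0]]
    s1, s2 = random_grid_2of2(secret)
    print(stack(s1, s2))
```

After stacking, every black secret pixel is black, while a white secret pixel is black only about half the time, so the secret appears as a contrast difference rather than an exact copy.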
Intelligent transportation systems are a worldwide research hotspot, and the extraction of traffic parameters is a crucial part of them because it underpins the subsequent identification of traffic states. This paper proposes a novel approach to extracting traffic parameters such as time occupancy, volume, and vehicle velocity from video images. Visual features obtained from spatio-temporal images are more robust to environmental variations, including changes in illumination and background. In addition, binarization with a self-adaptive threshold based on clustering segments vehicle areas more accurately. Combined with parameter correction, PVI and EPI analysis extracts the final parameters even when congestion occurs. To verify the efficacy of the measurement, the extracted parameters are fed to a support vector machine (SVM) classifier to identify four levels of traffic state: fluent, non-congested, congested, and severely congested. Experimental results show that performance holds up under various environmental conditions and that accuracy is robust in heavy traffic.
A Novel Approach of Extracting Traffic Parameters by Using Video Features. Yuan Zhang, Ke-bin Jia. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.66 (https://doi.org/10.1109/IIH-MSP.2013.66)
Pub Date: 2013-10-16 DOI: 10.1109/IIH-MSP.2013.124
Chien-Chung Wu, Jyun-Jie Huang
This study is based on the Samsung Exynos 4210 dual-core Cortex-A9 and Android 4.2.1. App performance is improved by tuning CPU resource allocation and by adding parallelism with OpenMP compiler directives. Cgroups and cpusets are used to manage CPU resource allocation. In addition, the Android Native Development Kit (NDK) is modified to support Android apps that use the OpenMP library. The study takes the Canny edge detection of OpenCV as an example. The results show that the processing time for one picture is reduced from 939 ms to 671 ms, a 28.5% improvement.
The Study of Android Parallel Programming Based on the Dual-Core Cortex-A9. Chien-Chung Wu, Jyun-Jie Huang. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.124 (https://doi.org/10.1109/IIH-MSP.2013.124)
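The 28.5% figure in the Android abstract is a reduction in processing time, which is not the same number as the speedup factor. The short sketch below works through both from the two timings reported in the abstract; the function names are hypothetical.

```python
def time_reduction(before_ms, after_ms):
    """Fraction of the original processing time that was saved."""
    return (before_ms - after_ms) / before_ms


def speedup(before_ms, after_ms):
    """How many times faster the tuned version runs."""
    return before_ms / after_ms


if __name__ == "__main__":
    before_ms, after_ms = 939, 671  # timings reported in the abstract
    print(f"time reduction: {time_reduction(before_ms, after_ms):.1%}")
    print(f"speedup factor: {speedup(before_ms, after_ms):.2f}x")
```

The reduction (939 - 671) / 939 is about 28.5%, while the speedup 939 / 671 is about 1.40, against an ideal of 2 on a dual-core processor.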
We investigated the effects of linear acceleration on auditory space representation. During self-motion, a short noise burst was presented from one of the loudspeakers aligned parallel to the motion direction when the listener's coronal plane reached a particular location (baseline). The effect of active and passive forward movement on perceived auditory space representation was also investigated. Results showed that the sound position aligned with the subjective coronal plane was displaced compared with the baseline only during forward self-motion. However, the direction of displacement during forward motion differed among experiments. Moreover, no difference was observed between active and passive motion. These results suggest a link between auditory space perception and self-motion perception.
Auditory Space Perception during Active and Passive Self-Motion. S. Sakamoto, W. Teramoto, Yôiti Suzuki, J. Gyoba. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.90 (https://doi.org/10.1109/IIH-MSP.2013.90)
Pub Date: 2013-10-16 DOI: 10.1109/IIH-MSP.2013.151
Mehran Iranpour
In graph theory, a Hamiltonian path is a path in a graph that visits every vertex exactly once. The proposed method divides the cover image into m×n blocks and partitions the binary secret data into vectors of length m×n. For each block, a Hamiltonian path is first found such that the LSBs of the block's pixels along this path have the maximum similarity to the corresponding data vector. This part of the data is then embedded into the first LSB of the block's pixels along the best path using modified LSB matching, and the code of this path is embedded into the second LSB of the pixels using a novel method that minimizes the MSE between the cover-image block and the stego-image block. Experimental results on 8000 natural images show that the proposed method produces minimal distortion in the stego-images. The security of the method against one of the most effective steganalyzers is also demonstrated.
LSB-Based Steganography Using Hamiltonian Paths. Mehran Iranpour. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.151 (https://doi.org/10.1109/IIH-MSP.2013.151)
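The path-selection step in the steganography abstract can be sketched as follows: enumerate the Hamiltonian paths of a small block and keep the one whose pixel LSBs already agree with the most data bits. The abstract does not say which graph is used, so 4-neighbour grid adjacency is an assumption here, as are the function names; the exhaustive search is only practical for very small blocks, and the paper's path coding and modified LSB matching are not reproduced.

```python
def hamiltonian_paths(m, n):
    """Yield every Hamiltonian path of the m x n grid graph (4-neighbour)."""
    total = m * n

    def extend(path, visited):
        if len(path) == total:
            yield tuple(path)
            return
        r, c = path[-1]
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nxt = (r + dr, c + dc)
            if 0 <= nxt[0] < m and 0 <= nxt[1] < n and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                yield from extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for r in range(m):
        for c in range(n):
            yield from extend([(r, c)], {(r, c)})


def best_path(block, bits):
    """Return the path whose pixel LSBs match the most data bits."""
    def matches(path):
        return sum((block[r][c] & 1) == b for (r, c), b in zip(path, bits))

    return max(hamiltonian_paths(len(block), len(block[0])), key=matches)


if __name__ == "__main__":
    block = [[2, 3], [5, 4]]  # hypothetical 2 x 2 pixel block
    bits = [0, 1, 0, 1]       # hypothetical secret bits
    print(best_path(block, bits))
```

The more LSBs already match along the chosen path, the fewer pixels need to change during embedding, which is the intuition behind the low distortion reported in the abstract.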
Road surface condition is very important for safe driving, especially in bad weather such as snow or rain. In this paper, we propose a method for detecting road surface status from video camera images. The color and texture information of the road surface is extracted from the video frame, and a naïve Bayesian classifier is then built to classify the road surface image into three categories: dry, mild snow coverage, and heavy snow coverage. We also compare its classification performance with that of three other popular classifiers: K-NN, neural network, and SVM. Experimental results show that the naïve Bayesian classifier is the most suitable for this classification problem.
Road Surface Condition Classification Based on Color and Texture Information. Zhonghua Sun, Ke-bin Jia. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.43 (https://doi.org/10.1109/IIH-MSP.2013.43)
Pub Date: 2013-10-16 DOI: 10.1109/IIH-MSP.2013.150
Hua Chen, Anhong Wang, Xiaoli Ma
In wireless video multicast, the main challenge is to meet the demands of heterogeneous receivers that receive the same video source stream but have different channel characteristics. This paper proposes an improved compressed-sensing-based wireless video multicast (iCS-WVM) technique, in which the measurements of each group of pictures (GOP) are packed and transmitted through an analog channel. Each receiver obtains some of the packets according to its channel characteristics and then decodes them by exploiting the correlation between video frames. Because all compressed sensing measurements are equally important, a receiver with a good channel receives more packets and recovers better quality, which gives iCS-WVM graceful degradation rather than a cliff effect. Additionally, by exploiting motion estimation at the decoder side, iCS-WVM achieves better rate-distortion performance than the state-of-the-art SoftCast and PQC while preserving low-complexity encoding.
An Improved Wireless Video Multicast Based on Compressed Sensing. Hua Chen, Anhong Wang, Xiaoli Ma. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.150 (https://doi.org/10.1109/IIH-MSP.2013.150)
Virtual vascular interventional surgery is an application of virtual reality technology to medical training. In this paper, we propose a real-time approach to modeling vascular beating that can be used in a virtual interventional simulation system. First, we read the VTK file of the vascular model using VTK functions. Then, we choose a center point in a small region and construct a mass-spring model between the center point and the other points discretized on the vascular model. After that, we apply a force to the center point so that it pulls the other points. Finally, we set a timer callback function to provide the beating frequency, so that the vessel beats at the human heart rate. Our experimental results show that the method is simple and effective and that it provides a visually plausible vascular beating effect in real time. It can therefore become an important part of a virtual interventional simulation system.
Real-Time Vascular Beating Analog Based on the Mass-Spring Model. Xin Zhi, Ziming Zhang, Yuhui Gao, Zhixiang Zhang. 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013-10-16. DOI: 10.1109/IIH-MSP.2013.55 (https://doi.org/10.1109/IIH-MSP.2013.55)
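The mass-spring idea in the vascular beating abstract can be sketched in one dimension: a single wall point is tied to its rest position by a damped spring and driven by a periodic force at the heart rate. All parameter values, the sinusoidal driving force, and the function name are assumptions for illustration; the paper's VTK mesh handling and timer callback are replaced here by a plain time-stepping loop.

```python
import math


def simulate_radial_beat(rate_hz=1.2, duration_s=2.0, dt=0.001,
                         mass=0.01, stiffness=50.0, damping=0.5,
                         peak_force=0.1):
    """Return the radial displacement of one wall point at each time step.

    Uses semi-implicit Euler integration of
    mass * a = force(t) - stiffness * x - damping * v.
    """
    x, v = 0.0, 0.0
    trace = []
    steps = int(round(duration_s / dt))
    for i in range(steps):
        t = i * dt
        # Periodic pull standing in for the force applied at the center point.
        force = peak_force * math.sin(2.0 * math.pi * rate_hz * t)
        a = (force - stiffness * x - damping * v) / mass
        v += a * dt  # update velocity first (semi-implicit Euler)
        x += v * dt  # then position, using the new velocity
        trace.append(x)
    return trace


if __name__ == "__main__":
    trace = simulate_radial_beat()
    print("peak displacement:", max(trace))
```

In a real-time viewer the body of the loop would run once per timer tick, and the resulting displacement would move each connected mesh vertex along the line to the center point.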