Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659567
Rohan Kumar Das, S. Prasanna
This work explores the prospects of text-independent speaker verification (SV) for practically realizable systems. Although advances in SV have drawn attention toward deployable systems, performance tends to degrade under uncontrolled conditions. A data collection protocol is designed for text-independent SV, with student attendance as the application, to create a database in a real-world scenario. i-vector based speaker modeling is used for evaluation, and the results deviate substantially from those obtained on a standard database. This underscores the importance of real-world databases for robust SV studies. Further studies on speaker categorization, speaker confidence, and model update showcase their significance for systems in practice. The database created in this work is available as part of a multi-style speaker recognition database.
Title: Investigating Text-independent Speaker Verification from Practically Realizable System Perspective
Published in: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
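The i-vector evaluation described above typically scores a test utterance against an enrolled speaker model with cosine similarity and compares the score to a decision threshold. A minimal sketch of that scoring step, with made-up 4-dimensional vectors and an arbitrary threshold (real i-vectors have a few hundred dimensions, and thresholds are tuned on development data):

```python
import math

def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test i-vector."""
    dot = sum(e * t for e, t in zip(enroll, test))
    ne = math.sqrt(sum(e * e for e in enroll))
    nt = math.sqrt(sum(t * t for t in test))
    return dot / (ne * nt)

def verify(enroll, test, threshold=0.5):
    """Accept the claimed identity when the score clears the threshold."""
    return cosine_score(enroll, test) >= threshold

# Toy 4-dimensional i-vectors (illustrative values only).
claimed = [0.9, 0.1, 0.3, 0.2]
same_speaker = [0.8, 0.15, 0.25, 0.3]
impostor = [-0.2, 0.9, -0.4, 0.1]
```

In practice the scoring back-end also applies channel compensation (e.g. LDA or PLDA) before this comparison; the cosine score above is only the final decision step.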
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659464
Kui Ye, Jing Dong, Wei Wang, Bo Peng, T. Tan
To advance the state of the art in image forensics, a new formulation of splicing localization is proposed: given a pair of a query (probe) image and a potential donor image, obtain the masks for both images if a region of the donor was spliced into the probe. The earlier Deep Matching and Validation Network (DMVN) addresses this problem with an end-to-end learning based solution. Inheriting its deep dense matching layer, we propose the Feature Pyramid Deep Matching and Localization Network (FPLN), whose contributions are threefold. First, instead of using a single feature map as in DMVN, FPLN utilizes a pyramid of feature maps at different resolutions with respect to the input image to achieve better localization, especially for small objects. Second, we add a fusion layer that merges all features after the deep dense matching layer, which not only exploits the correlation among those features but also collapses the two pathways of DMVN into a single, simpler pathway. Last, we employ focal loss to address class imbalance, as the foreground area is usually much smaller than the background. Experiments demonstrate the superior performance of the proposed method in detection accuracy and in localizing small tampered regions.
Title: Feature Pyramid Deep Matching and Localization Network for Image Forensics
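The focal loss mentioned above down-weights well-classified examples so that the scarce foreground (spliced) pixels dominate training. A minimal binary version following the standard formulation FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the gamma and alpha values below are the common defaults from the focal loss literature, not necessarily the ones used in this paper:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.
    p: predicted probability of the foreground class; y: label in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct background pixel contributes almost nothing,
# while a missed foreground pixel keeps a large loss.
easy = focal_loss(0.05, 0, gamma=2.0)   # correct background prediction
hard = focal_loss(0.05, 1, gamma=2.0)   # missed foreground prediction
```

With gamma = 0 and alpha = 1 the expression reduces to the ordinary cross-entropy, which is a quick sanity check on an implementation.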
This paper describes statistical multichannel speech enhancement based on a deep generative model of speech spectra. Recently, deep neural networks (DNNs) have been widely used for converting noisy speech spectra to clean speech spectra or for estimating time-frequency masks. Such a supervised approach, however, requires a sufficient amount of training data (pairs of noisy and clean speech data) and often fails in unseen noisy environments. This motivates a blind source separation method called multichannel nonnegative matrix factorization (MNMF), which can jointly estimate low-rank source spectra and spatial covariances on the fly. However, the assumption of low-rankness does not hold for speech spectra. To solve these problems, we propose a semi-supervised method based on an extension of MNMF that consists of a deep generative model for speech spectra and a standard low-rank model for noise spectra. The speech model can be trained in advance with auto-encoding variational Bayes (AEVB) using only clean speech data, and it serves as a prior on clean speech spectra for speech enhancement. Given a noisy speech spectrogram, we estimate the posterior of the clean speech spectra while estimating the noise model on the fly. This adaptive estimation is achieved with Gibbs sampling in a unified Bayesian framework. The experimental results show the potential of the proposed method.
Title: Bayesian Multichannel Speech Enhancement with a Deep Speech Prior
Authors: Kouhei Sekiguchi, Yoshiaki Bando, Kazuyoshi Yoshii, Tatsuya Kawahara
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659591
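The low-rank noise model above is standard NMF. The paper estimates it in a Bayesian way with Gibbs sampling; the sketch below instead uses the classic point-estimate variant (Lee-Seung multiplicative updates for Euclidean distance) only to illustrate the low-rank decomposition idea, not the paper's inference:

```python
import numpy as np

def nmf(V, rank=2, n_iter=500, seed=0):
    """Factor a nonnegative spectrogram V (freq x time) as W @ H using
    Lee-Seung multiplicative updates for the Euclidean objective."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 1e-3
    H = rng.random((rank, T)) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)   # update spectral bases
    return W, H

# A synthetic rank-2 "noise spectrogram" should be fit almost exactly.
rng = np.random.default_rng(1)
V = rng.random((16, 2)) @ rng.random((2, 30))
W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The multiplicative form keeps W and H nonnegative throughout, which is exactly the property that makes the factors interpretable as spectral templates and activations.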
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659665
Meng-Zhen Li, Xiao-Lei Zhang
Speaker clustering is an important problem in speech processing, e.g. for speaker diarization, yet its behavior in adverse acoustic environments lacks comprehensive study. To address this gap, we investigate its components separately. A speaker clustering system has three components: a feature extraction front-end, a dimensionality reduction algorithm, and a clustering back-end. In this paper, we use the standard Gaussian mixture model based universal background model (GMM-UBM) as a front-end to extract high-dimensional supervectors, and compare three dimensionality reduction algorithms as well as two clustering algorithms. The three dimensionality reduction algorithms are principal component analysis (PCA), spectral clustering (SC), and the multilayer bootstrap network (MBN). The two clustering algorithms are k-means and agglomerative hierarchical clustering (AHC). We conducted extensive experiments with both in-domain and out-of-domain settings on noisy versions of the NIST 2006 speaker recognition evaluation (SRE) and NIST 2008 SRE corpora. Experimental results in various noisy environments show that (i) the MBN based systems perform best in most cases, while the SC based systems outperform both the PCA based systems and the original supervector based systems; and (ii) AHC is more robust than k-means.
Title: An Investigation of Speaker Clustering Algorithms in Adverse Acoustic Environments
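Of the two clustering back-ends compared above, k-means is the simpler. A self-contained sketch on toy two-dimensional "supervectors" (real supervectors are far higher-dimensional, and a deterministic farthest-point initialization stands in here for the usual random restarts):

```python
def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(points, n_iter=20):
    """Two-cluster k-means; centers start at the first point and the
    point farthest from it, then alternate assign/update steps."""
    centers = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    assign = []
    for _ in range(n_iter):
        assign = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                  for p in points]
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign

# Two well-separated "speakers": noisy copies of two prototype vectors.
spk_a = [[1.0 + d, 1.0 - d] for d in (0.0, 0.1, -0.1)]
spk_b = [[5.0 + d, 5.0 - d] for d in (0.0, 0.1, -0.1)]
labels = kmeans2(spk_a + spk_b)
```

AHC, the paper's more robust alternative, instead merges the closest pair of clusters repeatedly and never depends on an initialization, which is one intuition for its robustness in noise.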
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659711
Ya-Ju Yu, Jhih-Kai Wang
Narrowband Internet of Things (NB-IoT) is a new narrowband radio technology in fifth-generation (5G) networks. In NB-IoT cellular networks, to provide low-power wide-area coverage, several different resource allocation units can be allocated by a base station to NB-IoT devices for uplink transmissions. Traditional resource allocation algorithms that do not consider these multiple resource allocation units are not appropriate for NB-IoT networks, and we observe that simply adopting the same resource unit for every device results in wasted radio resources. Therefore, this paper investigates the uplink resource allocation problem considering the new radio frame structure and the multiple resource units of NB-IoT networks. The objective is to minimize the radio resources used while letting each device transmit its data. We propose an algorithm that determines a suitable resource unit and allocates radio resources for each device to solve the target problem. Compared with a baseline, the simulation results show the efficacy of the proposed algorithm and provide useful insights into resource allocation design for NB-IoT systems.
Title: Uplink Resource Allocation for Narrowband Internet of Things (NB-IoT) Cellular Networks
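The core choice in the problem above is which resource-unit type to allocate to each device. A toy greedy selector over a hypothetical unit catalogue: the tone counts echo NB-IoT's single-tone and multi-tone uplink formats, but the durations and payload capacities here are invented purely for illustration, and the paper's actual algorithm is more elaborate than this sketch:

```python
# Hypothetical resource-unit catalogue: (name, subcarriers, duration in ms,
# payload bits carried per unit). Real NB-IoT values depend on the MCS and
# the 3GPP numerology; these numbers are illustrative only.
RESOURCE_UNITS = [
    ("1-tone", 1, 8, 100),
    ("3-tone", 3, 4, 120),
    ("6-tone", 6, 2, 130),
    ("12-tone", 12, 1, 136),
]

def pick_unit(payload_bits):
    """Choose the unit type minimising total time-frequency resource
    (subcarriers x ms x number of units) for a device's payload."""
    best = None
    for name, tones, ms, bits in RESOURCE_UNITS:
        units = -(-payload_bits // bits)       # ceiling division
        cost = units * tones * ms
        if best is None or cost < best[1]:
            best = (name, cost, units)
    return best
```

Even this toy shows the observation from the abstract: the best unit depends on the payload (a 130-bit payload wastes resources on 1-tone units but fits a single larger unit), so fixing one unit type for all devices is suboptimal.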
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659648
Yuki Nakahira, K. Kawamoto
Generative adversarial networks (GANs) have been successfully applied to generating high-quality natural images and have been extended to the generation of RGB videos and 3D volume data. In this paper we consider the task of generating RGB-D videos, which is less extensively studied and still challenging. We explore deep GAN architectures suitable for the task and develop four GAN architectures based on existing video-based GANs. With a facial expression database, we experimentally find that an extended version of the motion and content decomposed GAN, known as MoCoGAN, provides the highest-quality RGB-D videos. We discuss several applications of our GAN to content creation and data augmentation, as well as its potential applications in behavioral experiments.
Title: Generative adversarial networks for generating RGB-D videos
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659597
Jiana Li, Xin Liao, Rongbing Hu, Xuchong Liu
As image modification and tampering, especially multiple editing operations, prevail in today's world, identifying the authenticity and credibility of digital images becomes increasingly important. Recently, two editing operations, upsampling and mean filtering, have attracted growing attention. While many existing image forensics techniques identify the existence and order of specific operations in a processing chain, few detection methods address the order of upsampling and mean filtering. Building on strongly indicative analysis in different domains of the DFTs of images' p-maps, this paper presents a newly designed method that uses features to determine the order of the upsampling and mean filtering operations. Specifically, our goal is to use two features, the symmetry-based PSNR and the fourth-order energy fitting curve, to characterize operation chains in the DFTs of images' p-maps. We calculate the variance of the fitting curve and examine how the fingerprints change under different operating intensities to ensure that these two features apply broadly to operation detection. The features are fed to an SVM, which effectively discriminates among five combinations of upsampling and mean filtering. Representative experiments verify the effectiveness of the proposed method.
Title: Detectability of the Image Operation Order: Upsampling and Mean Filtering
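The detectability studied above rests on the fact that upsampling and mean filtering do not commute, so each order leaves a different fingerprint. The paper extracts its features from DFTs of p-maps; the 1-D toy below only demonstrates the non-commutativity and the tell-tale sample-pair correlation that nearest-neighbour upsampling leaves when it is applied last:

```python
def upsample(x, factor=2):
    """Nearest-neighbour upsampling: repeat each sample `factor` times."""
    return [v for v in x for _ in range(factor)]

def mean_filter(x, w=3):
    """Simple moving average with edge clamping."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - w // 2), min(n, i + w // 2 + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

sig = [0.0, 1.0, 0.0, 1.0, 2.0, 1.0]
chain_a = mean_filter(upsample(sig))   # upsample, then mean filter
chain_b = upsample(mean_filter(sig))   # mean filter, then upsample
```

In chain_b every pair of adjacent samples is identical (the upsampler's correlation survives untouched), whereas in chain_a the filter smears that structure; it is this kind of order-dependent residue that p-map DFT features make measurable in 2-D.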
This paper proposes a generative approach to constructing high-quality speech synthesis from noisy speech. Studio-quality recorded speech is required to construct high-quality speech synthesis, but most existing speech was recorded in noisy environments. A common way to use noisy speech for training speech synthesis models is to reduce the noise before vocoder-based parameterization. However, such multi-step processing causes an accumulation of spectral distortion. Meanwhile, statistical parametric speech synthesis (SPSS) without vocoders, which directly generates spectral parameters or waveforms, has recently been proposed. Vocoder-free SPSS enables us to train speech synthesis models that account for the noise addition process commonly used in signal processing research. In the proposed approach, newly introduced noise generation models trained with a generative adversarial training algorithm randomly generate noise spectra. The speech synthesis models are trained so that the sum of their output and the randomly generated noise approaches the spectra of the noisy speech. Because the noise generation model parameters fit the spectrum of the observed noise, the proposed method alleviates the spectral distortion found in the conventional method. Experimental results demonstrate that the proposed method outperforms the conventional method in terms of synthetic speech quality.
Title: Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech
Authors: Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, H. Saruwatari
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659691
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659780
Linkai Huang, Linna Zhou, Yunbiao Guo
Nowadays, network security is valued by more and more people. The network covert channel, an important area of network security, has been put forward in recent years. On the one hand, covert channels provide a new and safe communication environment for network communication; on the other hand, they are exploited by malicious actors to spread viruses, trojans, and so on. Research on covert channels is therefore particularly important. According to resource attributes, network covert channels can be divided into two categories: storage-based channels and timestamp-based channels. This paper focuses on storage-based covert channels and proposes a new detection method based on a clustering algorithm. The values of the various parts of the data-flow packets are cluster-analyzed, the clustering results are displayed graphically, and from these results it is determined whether a covert channel is present in the packets. The experimental results show that the detection technique offers high accuracy, a simple algorithm, and intuitive result images, achieving the desired results.
Title: Detecting Technology of Network Storage Covert Channel Based on OPTICS Algorithm
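OPTICS, the clustering algorithm named in the title above, orders points by density using each point's core distance (the distance to its MinPts-th nearest neighbour). A pared-down fragment of that idea on one-dimensional header-field values: normal counter-like traffic stays dense, while covert payload bytes written into the field stand out as low-density points. The field values are invented for illustration, and the full OPTICS reachability ordering is omitted:

```python
def core_distance(values, i, min_pts=2):
    """Distance from values[i] to its min_pts-th nearest neighbour,
    the quantity OPTICS uses to rank points by local density."""
    d = sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)
    return d[min_pts - 1]

def density_outliers(values, eps=10, min_pts=2):
    """Flag values whose neighbourhood is sparser than eps."""
    return [v for i, v in enumerate(values)
            if core_distance(values, i, min_pts) > eps]

# Hypothetical IP-ID header values: a normal incrementing counter versus a
# storage covert channel overwriting the field with arbitrary payload bytes.
normal = [1000, 1001, 1002, 1003, 1004, 1005]
covert = [1000, 1001, 1002, 9137, 22045, 41772]
```

The full algorithm additionally computes reachability distances and an ordering whose valleys correspond to clusters; plotting that ordering is what gives the "intuitive result images" the abstract mentions.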
Pub Date : 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659720
Mamoru Sugawara, Kazunori Uruma, S. Hangai
This paper proposes a new colorization algorithm that converts a grayscale image to a color image using a few colored pixels. To obtain a properly colorized image, graph signal processing and an image segmentation technique are introduced. After a graph is constructed on the given grayscale image, a graph signal recovery technique based on the graph Fourier transform recovers key color information. The whole image is then colorized by Levin's algorithm. Experimental results on five images show a 1.82 dB to 10.2 dB improvement in PSNR. The proposed algorithm also reduces color fading compared with Levin's algorithm.
Title: Colorization Algorithm based on Image Segmentation and Graph Signal Processing
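The propagation step used above (Levin's algorithm over an image graph) spreads sparse seed chrominance to the remaining pixels, with edge weights given by grayscale affinity. A toy one-dimensional version, solved by Jacobi iteration instead of Levin's sparse linear solve, shows the propagation behaviour; the seed positions, seed values, and sigma are made up for the example:

```python
import math

def colorize_scanline(gray, seeds, sigma=0.1, n_iter=500):
    """Propagate chrominance from a few seeded pixels along a scanline,
    weighting neighbours by grayscale similarity (the affinity idea
    Levin's algorithm applies over a full 2-D image graph)."""
    n = len(gray)
    u = [seeds.get(i, 0.0) for i in range(n)]
    for _ in range(n_iter):
        new = u[:]
        for i in range(n):
            if i in seeds:
                continue                      # seeds stay fixed
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
            w = [math.exp(-(gray[i] - gray[j]) ** 2 / (2 * sigma ** 2))
                 for j in nbrs]
            new[i] = sum(wj * u[j] for wj, j in zip(w, nbrs)) / sum(w)
        u = new
    return u

# Flat-gray scanline, one chrominance seed at each end: the solution is a
# smooth ramp between the two seeded values.
gray = [0.5] * 9
chroma = colorize_scanline(gray, {0: 1.0, 8: -1.0})
```

On a uniform-intensity region all weights are equal and the result is harmonic interpolation between the seeds; a grayscale edge would shrink the corresponding weight and let color change sharply there, which is what prevents bleeding across object boundaries.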