Pub Date: 2017-03-11; DOI: 10.1109/ICASSP.2017.7952530
Tomáš Denemark, J. Fridrich
It is widely recognized that incorporating side-information at the sender can significantly improve steganographic security in practice. Currently, most side-informed schemes for digital images utilize a high quality “precover” image that is subsequently processed and then jointly quantized and embedded with a secret. In this paper, we investigate an alternative form of side-information: two JPEG images of the same scene. The second JPEG image is used to determine the preferred polarity of embedding changes and to modulate their costs. Tests on real imagery show a very significant improvement in empirical security with respect to steganography utilizing a single JPEG image.
Title: Steganography with two JPEGs of the same scene
Venue: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
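The abstract above says the second JPEG sets the preferred polarity of each embedding change and modulates its cost. A minimal sketch of that idea follows; the function name, the rule "prefer the direction in which the second JPEG's coefficient differs", and the `modulation` factor are illustrative assumptions, not the authors' actual cost function.

```python
def side_informed_costs(first, second, base_costs, modulation=0.5):
    """Return (cost_plus, cost_minus), one entry per DCT coefficient.

    first, second: quantized DCT coefficients of the two JPEGs (same length).
    base_costs: symmetric cost of changing each coefficient by +1 or -1.
    modulation: hypothetical factor in (0, 1] that discounts the preferred direction.
    """
    cost_plus, cost_minus = [], []
    for c1, c2, rho in zip(first, second, base_costs):
        if c2 > c1:
            # Second exposure is larger: a +1 change is assumed less detectable.
            cost_plus.append(modulation * rho)
            cost_minus.append(rho)
        elif c2 < c1:
            # Second exposure is smaller: a -1 change is assumed less detectable.
            cost_plus.append(rho)
            cost_minus.append(modulation * rho)
        else:
            # No side-information for this coefficient: keep symmetric costs.
            cost_plus.append(rho)
            cost_minus.append(rho)
    return cost_plus, cost_minus
```

The resulting asymmetric costs would then be passed to a ternary embedding coder; that step is outside this sketch.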
Pub Date: 2017-03-11; DOI: 10.1109/ICASSP.2017.7953318
Samuel Pinilla, Camilo Noriega, H. Arguello
X-ray crystallography is an experimental technique to estimate the 3D atomic positions of the elements present in a crystal. This technique constructs the 3D structure from the phase of diffracted and patterned X-rays (DPX). Multiple intensity DPX measurements are acquired to solve the phase retrieval problem. The feasibility of implementing this technique depends on solving the phase retrieval problem using expensive multi-valued patterns and the Truncated Wirtinger Flow algorithm. This paper presents a Stochastic Truncated Wirtinger Flow (STWF) algorithm that solves the phase retrieval problem from DPX measurements acquired with low-cost boolean block-unblock coded apertures. Several simulations demonstrate the convergence of the STWF algorithm and identify the optimal parameters of the boolean coded apertures. The results indicate that, given the DPX measurements, the quality of the phase images reconstructed with STWF reached up to 24.63 dB of PSNR.
Title: Stochastic Truncated Wirtinger Flow Algorithm for phase retrieval using boolean coded apertures
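To make the measurement setting in the abstract above concrete, here is a small sketch of intensity-only measurements taken through boolean (block/unblock) masks. It assumes a 1D signal and a plain discrete Fourier transform as the diffraction model; both are simplifications chosen for illustration, and the STWF solver itself is not shown.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def coded_intensities(x, masks):
    """Squared magnitudes of the DFT of x after each boolean mask is applied.

    masks: list of 0/1 sequences (hypothetical block-unblock coded apertures).
    Only intensities are returned, so the phase of each measurement is lost.
    """
    return [[abs(v) ** 2 for v in dft([m * xi for m, xi in zip(mask, x)])]
            for mask in masks]
```

Multiplying `x` by any unit-modulus constant leaves these intensities unchanged, which is why phase retrieval can recover a signal only up to a global phase.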
Pub Date: 2017-03-11; DOI: 10.1109/ICASSP.2017.7953197
Weixin Zhu, Wu Guo, Guoping Hu
This paper addresses speaker diarization in noisy conditions. A regression-based DNN is first used to map the noisy acoustic features to clean features, and then consensus clustering of the original and mapped features is used to fuse the diarization results. The experiments are conducted on the IFLY-DIAR-II database, which is a Chinese talk show database with various noise types, such as music, applause and laughter. Compared to the baseline system using PLP features, a 21.26% relative DER improvement can be achieved using the proposed algorithm.
Title: Feature mapping for speaker diarization in noisy conditions
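The "relative DER improvement" quoted in the abstract above is a ratio of error rates, not a difference in percentage points. The helper below shows the usual arithmetic; the abstract does not report the absolute DER values, so any inputs are placeholders.

```python
def relative_der_improvement(baseline_der, proposed_der):
    """Relative improvement in percent: 100 * (baseline - proposed) / baseline.

    Both arguments are diarization error rates in the same unit (e.g. percent).
    """
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return 100.0 * (baseline_der - proposed_der) / baseline_der
```

For instance, a hypothetical drop from 25.0% to 20.0% DER is a 5-point absolute gain but a relative improvement of about 20%.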
Pub Date: 2017-03-11; DOI: 10.1109/ICASSP.2017.7952508
Yuto Kobayashi, Keita Takahashi, T. Fujii
A new type of light field display called a tensor display was investigated. Although this display consists of only a few light attenuating layers located in front of a backlight, many views can be emitted in different directions simultaneously without sacrificing the resolution of each view. The transmittance pattern of each layer is calculated from a light field, namely, a set of dense multi-view images (typically dozens) that are to be observed from different directions. However, preparing such images is often cumbersome for real objects. We propose a method that does not require multi-view images as the input; instead, a focal stack composed of only a few differently focused images is directly transformed into the layer patterns. Our method greatly reduces the data acquisition cost while also maintaining the quality of the output light field. We validated the method with experiments using synthetic light field datasets and a focal stack acquired by an ordinary camera.
Title: From focal stacks to tensor display: A method for light field visualization without multi-view images
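The abstract above describes a display built from a few light-attenuating layers in front of a backlight. A rough sketch of why such a stack emits different views in different directions: each ray is attenuated by the product of the transmittances it crosses. The 1D two-layer geometry, the integer `shifts` that stand in for viewing directions, and the function name are assumptions for illustration; the paper's optimization from a focal stack is not reproduced.

```python
def emitted_light_field(front, rear, shifts, backlight=1.0):
    """Return field[d][i], the intensity of the ray that crosses front[i]
    and rear[i + shifts[d]] (transmittances in [0, 1]).

    Rays whose rear-layer index falls outside the layer are treated as blocked.
    """
    field = []
    for d in shifts:
        view = []
        for i in range(len(front)):
            j = i + d
            if 0 <= j < len(rear):
                view.append(backlight * front[i] * rear[j])
            else:
                view.append(0.0)
        field.append(view)
    return field
```

Computing the layer patterns amounts to inverting this model: finding `front` and `rear` whose products best match a target light field.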
Pub Date: 2017-03-11; DOI: 10.1109/ICASSP.2017.7952365
Shunyao Li, Tejaswi Nanjundaswamy, K. Rose
Conventional pixel-domain block matching temporal (inter) prediction is suboptimal, since it ignores the underlying spatial correlation. Hence in our recent research we proposed transform domain temporal prediction (TDTP), wherein spatially decorrelated transform coefficients are individually predicted. Later we proposed extended block TDTP (EB-TDTP), which fully exploits spatial correlation around reference block boundaries. However, the transform domain temporal correlation exploited by (EB-)TDTP interferes with the frequency response of sub-pixel interpolation filters. Thus, in this paper, we propose to replace the standard sub-pixel interpolation with filters that are jointly designed with EB-TDTP based on statistics of the data, for either separable or non-separable interpolation structures. We also employ a two-loop asymptotic closed-loop (ACL) approach for statistically stable off-line design. Experiments show that our framework can achieve up to 1 dB gain in PSNR over HEVC.
Title: Jointly optimized transform domain temporal prediction and sub-pixel interpolation
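The abstract above states that TDTP predicts each transform coefficient individually. One way to picture this, as a sketch only, is to transform the reference block, scale every coefficient by its own factor, and transform back. The 1D orthonormal DCT, the per-coefficient factors `rho`, and the function names are assumptions for illustration and do not reproduce the paper's extended-block design or interpolation filters.

```python
import math

def dct(block):
    """Orthonormal DCT-II of a 1D block."""
    n = len(block)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(block[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
                * math.cos(math.pi * (t + 0.5) * k / n) for k in range(n))
            for t in range(n)]

def predict_block(reference, rho):
    """Scale each DCT coefficient of the reference block by its own factor.

    rho: hypothetical per-frequency temporal correlation factors in [0, 1].
    """
    return idct([r * c for r, c in zip(rho, dct(reference))])
```

With all factors equal to 1 this reduces to ordinary block copying; smaller factors at high frequencies would damp the components that are assumed to correlate poorly over time.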
Pub Date: 2017-03-10; DOI: 10.1109/ICASSP.2017.7952471
Maria Oliver, Roberto P. Palomares, C. Ballester, G. Haro
We propose a new variational method for the completion of moving shapes through binary video inpainting that works by smoothly recovering the objects into an inpainting hole. We solve it by a simple dynamic shape analysis algorithm based on threshold dynamics. The model takes into account the optical flow and motion occlusions. The resulting inpainting algorithm diffuses the available information across space and along the visible trajectories of the pixels in time. We show its performance with examples from the Sintel dataset, which contains complex object motion and occlusions.
Title: Spatio-temporal binary video inpainting via threshold dynamics
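Threshold dynamics, mentioned in the abstract above, alternates a diffusion step with a thresholding step. The sketch below applies that alternation to a 1D binary signal with a hole, keeping known samples fixed. It is purely spatial: the optical flow and occlusion handling described in the abstract are not modeled, and the three-tap kernel and function name are illustrative choices.

```python
def threshold_dynamics_inpaint(u, known, iterations=10):
    """Fill unknown samples of a binary signal by repeated blur-then-threshold.

    u: initial 0/1 values (unknown samples may hold any initial guess).
    known: booleans; True marks samples that must keep their value.
    """
    u = [float(v) for v in u]
    n = len(u)
    for _ in range(iterations):
        # Diffusion step: small smoothing kernel with replicated borders.
        blurred = [0.25 * u[max(i - 1, 0)] + 0.5 * u[i] + 0.25 * u[min(i + 1, n - 1)]
                   for i in range(n)]
        # Threshold step, applied only inside the hole.
        u = [u[i] if known[i] else (1.0 if blurred[i] >= 0.5 else 0.0)
             for i in range(n)]
    return [int(v) for v in u]
```

In the video setting, the diffusion step would additionally average along motion trajectories, which is where the optical flow enters.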
Pub Date: 2017-03-10; DOI: 10.1109/ICASSP.2017.7952950
Milind Rao, T. Javidi, Yonina C. Eldar, A. Goldsmith
We consider the problem of estimating the covariance matrix and the transition matrix of vector autoregressive (VAR) processes from partial measurements. This model encompasses settings where there are limitations in the data acquisition of the underlying measurement systems so that data is lost or corrupted by noise. An estimator for the covariance matrix of the observations is first presented. More refined estimators that factor in structural constraints on the covariance matrix, such as sparsity, bandedness, sparsity of the inverse, and low rank, are then introduced; these are particularly useful in the high-dimensional regime. These estimates are then used to perform system identification by estimating the state transition matrix with or without further structural assumptions. Non-asymptotic guarantees are presented for all estimators.
Title: Estimation in autoregressive processes with partial observations
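The pipeline in the abstract above (estimate covariances from partial data, then identify the transition matrix) can be illustrated in the scalar case. The sketch assumes an AR(1) process whose samples are each observed independently with a known probability `p` and zero-filled otherwise; that missing-data model, the bias corrections by `p` and `p**2`, and all function names are assumptions for illustration rather than the paper's estimators.

```python
import random

def simulate_ar1(a, n, seed=0):
    """Generate x[t] = a * x[t-1] + unit-variance Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def mask_samples(x, p, seed=1):
    """Keep each sample with probability p; replace missing samples by 0."""
    rng = random.Random(seed)
    return [v if rng.random() < p else 0.0 for v in x]

def estimate_ar1(z, p):
    """Estimate a from zero-filled data z with observation probability p.

    E[z_t^2] = p * gamma0 and E[z_{t+1} z_t] = p^2 * gamma1, so both sample
    moments are rescaled before forming the Yule-Walker ratio gamma1 / gamma0.
    """
    n = len(z)
    gamma0 = sum(v * v for v in z) / (p * n)
    gamma1 = sum(z[t + 1] * z[t] for t in range(n - 1)) / (p * p * (n - 1))
    return gamma1 / gamma0
```

In the vector case the same ratio becomes a matrix product of the lag-1 covariance with the inverse of the lag-0 covariance, which is where structural constraints such as sparsity can be imposed.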
Pub Date: 2017-03-10; DOI: 10.1109/ICASSP.2017.7952287
M. Salman, Yuhui Du, V. Calhoun
Numerous studies have shown that brain functional connectivity patterns can be time-varying over periods of tens of seconds. It is important to capture inherent non-stationary connectivity states for a better understanding of the influence of disease on brain connectivity. K-means has been widely used to extract the connectivity states from dynamic functional connectivity. However, K-means is dependent on initialization and can be exponentially slow in converging due to extensive noise in dynamic functional connectivity. In this work, we propose to use an affinity propagation clustering method to estimate the connectivity states. By applying K-means and the new method separately, we analyzed dynamic functional connectivity of 82 healthy controls and 82 schizophrenia patients, and then explored group differences between schizophrenia patients and healthy controls in the identified connectivity states. Both methods revealed that group differences mainly lay in visual, sensorimotor and frontal cortices. However, the new approach found more meaningful group differences than K-means. Our findings suggest that the method is promising for exploring biomarkers of mental disorders.
Title: Identifying FMRI dynamic connectivity states using affinity propagation clustering method: Application to schizophrenia
Pub Date: 2017-03-09; DOI: 10.1109/ICASSP.2017.7952924
Amir Daneshmand, Ying Sun, G. Scutari, F. Facchinei
The paper studies a general class of distributed dictionary learning (DL) problems where the learning task is distributed over a multi-agent network with (possibly) time-varying (non-symmetric) connectivity. This setting is relevant, for instance, in scenarios where massive amounts of data are not collocated but collected/stored in different spatial locations. We develop a unified distributed algorithmic framework for this class of non-convex problems and establish its asymptotic convergence. The new method hinges on Successive Convex Approximation (SCA) techniques while leveraging a novel broadcast protocol to disseminate information and distribute the computation over the network; implementing the protocol requires neither double-stochasticity of the consensus matrices nor knowledge of the graph sequence. To the best of our knowledge, this is the first distributed scheme with provable convergence for DL (and more generally bi-convex) problems over (time-varying) digraphs.
Title: D2L: Decentralized dictionary learning over dynamic networks
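The abstract above stresses that the broadcast protocol does not need doubly stochastic consensus matrices. The abstract does not spell the protocol out, so the sketch below substitutes a standard technique with the same property, push-sum averaging, in which each node tracks a value and a weight and only column-stochastic mixing is required. The function name and the fixed (not time-varying) graph are simplifications.

```python
def push_sum(values, mixing, iterations=100):
    """Average consensus over a directed graph via push-sum.

    mixing[i][j] is the share of node j's mass sent to node i; every column
    must sum to 1, but rows need not. Returns each node's running estimate.
    """
    n = len(values)
    x = list(values)   # running weighted sums
    w = [1.0] * n      # running weights that correct the imbalance
    for _ in range(iterations):
        x = [sum(mixing[i][j] * x[j] for j in range(n)) for i in range(n)]
        w = [sum(mixing[i][j] * w[j] for j in range(n)) for i in range(n)]
    return [xi / wi for xi, wi in zip(x, w)]
```

On a strongly connected digraph with self-loops, every ratio `x[i] / w[i]` approaches the average of the initial values even though plain linear mixing with the same matrix would not.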
Pub Date: 2017-03-09; DOI: 10.1109/ICASSP.2017.7952956
François D. Côté, I. Psaromiligkos, W. Gross
Despite the importance of distributed learning, few fully distributed support vector machines exist. In this paper, we not only provide a fully distributed nonlinear SVM but also propose the first distributed constrained-form SVM. In the fully distributed context, a dataset is distributed among networked agents that cannot divulge their data, let alone centralize the data, and can only communicate with their neighbors in the network. Our strategy is based on two algorithms: the Douglas-Rachford algorithm and the projection-gradient method. We validate our approach by demonstrating through simulations that it can train a classifier that agrees closely with the centralized solution.
Title: A distributed constrained-form support vector machine
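The Douglas-Rachford algorithm named in the abstract above splits a problem into two parts that are each handled by a proximal step. The toy below applies it to a consensus problem: each agent holds a private target and all agents must agree on one value. The quadratic local costs, the exact averaging used as the consensus projection, and the function name are assumptions for illustration; this is not the paper's SVM formulation, and a truly distributed version would replace the global average with neighbor-only communication.

```python
def douglas_rachford_consensus(targets, iterations=100):
    """Minimize sum_i 0.5 * (x_i - targets[i])**2 subject to all x_i equal.

    f is the separable quadratic (prox computed locally by each agent) and
    g is the indicator of the consensus set (prox is the average).
    """
    n = len(targets)
    z = [0.0] * n
    x = list(z)
    for _ in range(iterations):
        # Local step: prox of 0.5 * (x - a)^2 with unit step size.
        x = [(zi + ai) / 2.0 for zi, ai in zip(z, targets)]
        # Consensus step: project the reflected point onto {x_1 = ... = x_n}.
        mean = sum(2.0 * xi - zi for xi, zi in zip(x, z)) / n
        # Douglas-Rachford update of the auxiliary variable.
        z = [zi + mean - xi for zi, xi in zip(z, x)]
    return x
```

The local iterates converge to the mean of the targets, which is the centralized solution of this toy problem.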