Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288581
Shu-Hsien Wang, Chih-yu Hsu, Y. Hong
We propose a reservation-based channel access policy for multi-channel cognitive radio networks. To enhance the throughput of secondary users (SUs), SUs select channels opportunistically according to both their local channel state information (CSI) and their spectrum sensing outcomes. SUs then compete for the right to transmit on the chosen channel by sending reservation packets to the access point sequentially, ordered by their local CSI. We further devise a threshold on channel gains so that only SUs with sufficiently high channel gains can reserve channels, which limits the interference from SUs to the licensed network. A channel-aware splitting algorithm schedules the SU with the highest channel gain to transmit at each time instant. In simulations, the proposed policy outperforms policies that consider only CSI or only sensing outcomes.
{"title":"Channel and sensing aware channel access policy for multi-channel cognitive radio networks","authors":"Shu-Hsien Wang, Chih-yu Hsu, Y. Hong","doi":"10.1109/ICASSP.2012.6288581","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288581","url":null,"abstract":"We propose a reservation-based channel access policy for multi-channel cognitive radio networks. To enhance the throughput of secondary users (SUs), SUs are allowed to select channels opportunistically according to both the local channel state information (CSI) and the spectrum sensing outcomes. SUs will then compete for the right of transmission on the chosen channel by emitting reservation packets to the access point sequentially according to their local CSI. We further devise a proper threshold on channel gains such that only the SUs whose channel gains are sufficiently high can reserve channels and the interference from SUs to the licensed network can be limited. A channel aware splitting algorithm is adopted to schedule the SU with the highest channel gain to transmit at each time instant. From simulations, the proposed channel access policy outperforms the policies that take into consideration only CSI or sensing outcomes.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"20 1","pages":"3141-3144"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81492998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
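The selection rule described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' protocol: the function names, the data layout, and the single shared threshold are assumptions, and the distributed splitting algorithm is replaced here by a plain centralized "highest gain wins" rule at the access point.

```python
def select_channel(gains, sensed_idle, threshold):
    """Pick the idle-sensed channel with the highest gain, if it clears the threshold.

    gains[c] is this SU's channel gain on channel c (hypothetical units);
    sensed_idle[c] is its local sensing outcome. Returns None if no channel qualifies.
    """
    eligible = [c for c in range(len(gains))
                if sensed_idle[c] and gains[c] >= threshold]
    return max(eligible, key=lambda c: gains[c]) if eligible else None


def schedule(users, threshold):
    """Return {channel: winning SU}, where the SU with the highest gain wins.

    Stand-in for the splitting algorithm: a centralized argmax per channel.
    """
    winners = {}
    for name, gains, sensed_idle in users:
        c = select_channel(gains, sensed_idle, threshold)
        if c is None:
            continue  # gain too low or nothing sensed idle: stay silent
        if c not in winners or gains[c] > winners[c][1]:
            winners[c] = (name, gains[c])
    return {c: w[0] for c, w in winners.items()}


users = [
    ("su1", [0.9, 0.4], [True, True]),
    ("su2", [1.2, 0.3], [True, False]),
    ("su3", [0.2, 0.8], [False, True]),
]
assignment = schedule(users, threshold=0.5)
```

With these made-up gains, su1 and su2 both target channel 0 and su2 wins it, while su3 takes channel 1.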
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288876
Xiong Xiao, Chng Eng Siong, Haizhou Li
In this paper, we propose a framework for joint normalization of spectral and temporal statistics of speech features for robust speech recognition. Current feature normalization approaches normalize the spectral and temporal aspects of feature statistics separately to overcome noise and reverberation. As a result, the interaction between spectral normalization (e.g. mean and variance normalization, MVN) and temporal normalization (e.g. temporal structure normalization, TSN) is ignored. We propose a joint spectral and temporal normalization (JSTN) framework that normalizes these two aspects of feature statistics simultaneously. In JSTN, feature trajectories are filtered by linear filters whose coefficients are optimized by maximizing a likelihood-based objective function. Experimental results on the Aurora-5 benchmark task show that JSTN consistently outperforms the cascade of MVN and TSN on test data corrupted by both additive noise and reverberation, which validates our proposal. Specifically, JSTN reduces the average word error rate by 8-9% relative to the cascade of MVN and TSN for both artificial and real noisy data.
{"title":"Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech","authors":"Xiong Xiao, Chng Eng Siong, Haizhou Li","doi":"10.1109/ICASSP.2012.6288876","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288876","url":null,"abstract":"In this paper, we propose a framework for joint normalization of spectral and temporal statistics of speech features for robust speech recognition. Current feature normalization approaches normalize the spectral and temporal aspects of feature statistics separately to overcome noise and reverberation. As a result, the interaction between the spectral normalization (e.g. mean and variance normalization, MVN) and temporal normalization (e.g. temporal structure normalization, TSN) is ignored. We propose a joint spectral and temporal normalization (JSTN) framework to simultaneously normalize these two aspects of feature statistics. In JSTN, feature trajectories are filtered by linear filters and the filters' coefficients are optimized by maximizing a likelihood-based objective function. Experimental results on Aurora-5 benchmark task show that JSTN consistently out-performs the cascade of MVN and TSN on test data corrupted by both additive noise and reverberation, which validates our proposal. Specifically, JSTN reduces average word error rate by 8-9% relatively over the cascade of MVN and TSN for both artificial and real noisy data.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"26 1","pages":"4325-4328"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81593817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
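For orientation, the two building blocks named in this abstract, per-trajectory mean and variance normalization and linear filtering of a feature trajectory, can be sketched as below. This is not JSTN itself: in JSTN the filter coefficients come from maximizing a likelihood-based objective, whereas the coefficients here are fixed placeholders, and the function names are assumptions.

```python
from statistics import fmean, pstdev


def mvn(trajectory, eps=1e-12):
    """Mean and variance normalization of one feature trajectory over time."""
    mu = fmean(trajectory)
    sigma = pstdev(trajectory, mu)  # population standard deviation
    return [(x - mu) / (sigma + eps) for x in trajectory]


def fir_filter(trajectory, coeffs):
    """Causal FIR filtering of a trajectory, assuming zero initial state."""
    out = []
    for t in range(len(trajectory)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if t - k >= 0:
                acc += h * trajectory[t - k]
        out.append(acc)
    return out


normalized = mvn([1.0, 2.0, 3.0, 4.0])
# Placeholder two-tap smoother; JSTN would optimize these coefficients instead.
smoothed = fir_filter(normalized, [0.5, 0.5])
```

Applying `mvn` and then a temporal filter in sequence corresponds to the cascade style of processing that the abstract contrasts with joint optimization.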
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288450
Liyang Rui, K. C. Ho
The nonlinear nature of the source localization problem introduces bias into the location estimate. This bias can play a significant role in limiting localization and tracking performance when multiple measurements at different instants are available. This paper analyzes the bias of the source location estimate obtained by the maximum likelihood estimator, where the positioning measurements can be time of arrival (TOA), time difference of arrival (TDOA), or angle of arrival (AOA). The effect of bias on the mean-square localization error is examined, and the amounts of bias introduced by the three types of measurements are contrasted.
{"title":"Bias analysis of source localization using the maximum likelihood estimator","authors":"Liyang Rui, K. C. Ho","doi":"10.1109/ICASSP.2012.6288450","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288450","url":null,"abstract":"The nonlinear nature of the source localization problem creates bias to a location estimate. The bias could play a significant role in limiting the performance of localization and tracking when multiple measurements at different instants are available. This paper performs bias analysis of the source location estimate obtained by the maximum likelihood estimator, where the positioning measurements can be TOA, TDOA, or AOA. The effect of bias to the mean-square localization error is examined and the amounts of bias introduced by the three types of measurements are contrasted.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"45 1","pages":"2605-2608"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81944244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288470
David M. Cohen, Douglas L. Jones, S. Narayanan
Applications such as long-term environmental monitoring and large-scale surveillance demand reliable performance from sensor nodes while operating within strict energy constraints. There is often not enough power for sensors to make measurements all of the time. In these cases, one must decide when to run each sensor. To this end, we develop a one-step optimal sensor-scheduling algorithm based on expected-utility maximization. “Utility” is an application-specific measure of the benefit from a given sensor measurement. In sensing environments that can be modeled using a hidden Markov model, selecting the appropriate combination of sensors at each time instant enables maximization of the expected utility while operating within an energy budget. For some budgets, the utility-based algorithm shows more than 300% utility gains over a constant duty-cycle scheme designed to consume the same amount of energy. These benefits are dependent on the energy budget.
{"title":"Expected-utility-based sensor selection for state estimation","authors":"David M. Cohen, Douglas L. Jones, S. Narayanan","doi":"10.1109/ICASSP.2012.6288470","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288470","url":null,"abstract":"Applications such as long-term environmental monitoring and large-scale surveillance demand reliable performance from sensor nodes while operating within strict energy constraints. There is often not enough power for sensors to make measurements all of the time. In these cases, one must decide when to run each sensor. To this end, we develop a one-step optimal sensor-scheduling algorithm based on expected-utility maximization. “Utility” is an application-specific measure of the benefit from a given sensor measurement. In sensing environments that can be modeled using a hidden Markov model, selecting the appropriate combination of sensors at each time instant enables maximization of the expected utility while operating within an energy budget. For some budgets, the utility-based algorithm shows more than 300% utility gains over a constant duty-cycle scheme designed to consume the same amount of energy. These benefits are dependent on the energy budget.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"2685-2688"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84344154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
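A one-step expected-utility selection of the kind this abstract describes can be sketched as follows. Several things here are assumptions rather than details from the paper: utility is taken to be the probability of a correct MAP state decision, sensors are assumed conditionally independent given the hidden state, the energy budget is applied per time step, and all names and numbers are hypothetical.

```python
from itertools import combinations, product


def predict(belief, trans):
    """One-step HMM prediction: prior over the next state."""
    n = len(belief)
    return [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]


def expected_utility(prior, sensors):
    """Expected probability of a correct MAP decision after reading `sensors`.

    Each sensor is a table lik[state][observation]. With no sensors the
    result is simply max(prior).
    """
    total = 0.0
    for obs in product(*[range(len(lik[0])) for lik in sensors]):
        joint = []
        for x, p in enumerate(prior):
            for lik, o in zip(sensors, obs):
                p *= lik[x][o]
            joint.append(p)
        total += max(joint)  # best achievable decision for this observation
    return total


def select_sensors(prior, likelihoods, costs, budget):
    """Exhaustively pick the affordable sensor subset with the highest utility."""
    best = ((), expected_utility(prior, []))
    ids = range(len(likelihoods))
    for r in range(1, len(likelihoods) + 1):
        for subset in combinations(ids, r):
            if sum(costs[i] for i in subset) > budget:
                continue
            u = expected_utility(prior, [likelihoods[i] for i in subset])
            if u > best[1] + 1e-12:
                best = (subset, u)
    return best


prior = predict([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])
likelihoods = [
    [[0.9, 0.1], [0.1, 0.9]],  # sensor 0: accurate but costly
    [[0.6, 0.4], [0.4, 0.6]],  # sensor 1: weak but cheap
]
costs = [2.0, 1.0]
cheap_choice = select_sensors(prior, likelihoods, costs, budget=1.0)
rich_choice = select_sensors(prior, likelihoods, costs, budget=2.0)
```

The exhaustive search is exponential in the number of sensors, which is acceptable for a handful of sensors per node but would need a greedy or structured search beyond that.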
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6287989
Olivier Schwander, F. Nielsen
Gaussian mixture models are a widespread tool for modeling diverse and complex probability density functions. They can be estimated using Expectation-Maximization or Kernel Density Estimation. Expectation-Maximization leads to compact models but may be expensive to compute, whereas Kernel Density Estimation yields large models that are cheap to build. In this paper we present new methods to obtain high-quality models that are both compact and fast to compute. This is accomplished with clustering methods and centroid computation. The quality of the resulting mixtures is evaluated in terms of log-likelihood and Kullback-Leibler divergence using examples from a bioinformatics application.
{"title":"Model centroids for the simplification of Kernel Density estimators","authors":"Olivier Schwander, F. Nielsen","doi":"10.1109/ICASSP.2012.6287989","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6287989","url":null,"abstract":"Gaussian mixture models are a widespread tool for modeling various and complex probability density functions. They can be estimated using Expectation- Maximization or Kernel Density Estimation. Expectation- Maximization leads to compact models but may be expensive to compute whereas Kernel Density Estimation yields to large models which are cheap to build. In this paper we present new methods to get high-quality models that are both compact and fast to compute. This is accomplished with clustering methods and centroids computation. The quality of the resulting mixtures is evaluated in terms of log-likelihood and Kullback-Leibler divergence using examples from a bioinformatics application.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"737-740"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84397645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
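The general idea of simplifying a kernel density estimate by clustering its components and replacing each cluster with one representative can be sketched in one dimension as below. The choices here are assumptions, not the paper's method: clustering is plain k-means on component means, and the "centroid" of a cluster is a moment-matched Gaussian; the paper's model centroids may be defined differently.

```python
def kde_components(samples, bandwidth):
    """A Gaussian KDE viewed as a mixture: one (weight, mean, variance) per sample."""
    w = 1.0 / len(samples)
    return [(w, x, bandwidth ** 2) for x in samples]


def merge(cluster):
    """Moment-matched single Gaussian for a group of weighted Gaussians."""
    w = sum(c[0] for c in cluster)
    mu = sum(c[0] * c[1] for c in cluster) / w
    var = sum(c[0] * (c[2] + (c[1] - mu) ** 2) for c in cluster) / w
    return (w, mu, var)


def simplify(components, k, iters=20):
    """Reduce a mixture to at most k components by k-means on the means."""
    means = sorted(c[1] for c in components)
    # Spread the initial centers evenly over the sorted means.
    centers = [means[(2 * i + 1) * len(means) // (2 * k)] for i in range(k)]
    merged = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for c in components:
            j = min(range(len(centers)), key=lambda i: (c[1] - centers[i]) ** 2)
            clusters[j].append(c)
        merged = [merge(cl) for cl in clusters if cl]  # drop empty clusters
        centers = [m[1] for m in merged]
    return merged


kde = kde_components([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], bandwidth=0.3)
compact = simplify(kde, k=2)
```

Here a six-component KDE collapses to two components centered near 0.1 and 5.1, each carrying half of the total weight.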
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288758
Zhenlei Yan, Jie Zhou
The rapid growth of the user population in social networks poses a challenge to existing systems that recommend new friends with similar interests to a user. In this paper, we address this user recommendation problem by proposing a novel framework that combines users' tagging information with tensor factorization. This work makes two major contributions: (1) a tensor model is proposed to capture the potential associations among users, their interests, and their friends in social tagging systems; (2) a novel approach is proposed to recommend new friends based on this model. Experiments on a real-world dataset crawled from Last.fm show that the proposed method outperforms other state-of-the-art approaches.
{"title":"User recommendation with tensor factorization in social networks","authors":"Zhenlei Yan, Jie Zhou","doi":"10.1109/ICASSP.2012.6288758","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288758","url":null,"abstract":"The rapid growth of population in social networks has posed a challenge to existing systems for recommending to a user new friends having similar interests. In this paper, we address this user recommendation problem in social networks by proposing a novel framework which utilizes users' tagging information with tensor factorization. This work brings two major contributions: (1) A tensor model is proposed to capture the potential association among user, user's interests and friends in social tagging systems; (2) A novel approach is proposed to recommend new friends based on this model. The experiments on a real-world dataset crawled from Last.fm show that the proposed method outperforms other state-of-the-art approaches.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"32 1","pages":"3853-3856"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84992023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288276
Toshihisa Tanaka, Y. Washizawa, A. Kuh
Adaptive online algorithms for simultaneously extracting nonlinear eigenvectors of kernel principal component analysis (KPCA) are developed. KPCA needs all observed samples to represent its basis functions, and requires solving an eigenvalue problem whose size equals the number of samples. This paper reformulates KPCA and derives an expression in Euclidean space to which an algorithm for tracking generalized eigenvectors is applicable. The developed algorithms are of least mean squares (LMS) type and recursive least squares (RLS) type. A numerical example is presented to support the analysis.
{"title":"Adaptive kernel principal components tracking","authors":"Toshihisa Tanaka, Y. Washizawa, A. Kuh","doi":"10.1109/ICASSP.2012.6288276","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288276","url":null,"abstract":"Adaptive online algorithms for simultaneously extracting nonlinear eigenvectors of kernel principal component analysis (KPCA) are developed. KPCA needs all the observed samples to represent basis functions, and the same scale of eigenvalue problem as the number of samples should be solved. This paper reformulates KPCA and deduces an expression in the Euclidean space, where an algorithm for tracking generalized eigenvectors is applicable. The developed algorithm here is least mean squares (LMS)-type and recursive least squares (RLS)-type. Numerical example is then illustrated to support the analysis.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"22 1","pages":"1905-1908"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85021729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288099
Zhenyu Liu, Dongsheng Wang, Junwei Zhou, T. Ikenaga
Rate-distortion optimization (RDO) plays a vital role in the current hybrid video codec H.264/AVC. The RDO algorithm of the H.264/AVC reference software assumes that the transformed residues are memoryless variables. However, our experiments reveal that, for some sequences, strong temporal correlations exist in the prediction residues. This paper extends the Lagrangian optimization technique by modeling the transformed residues as a first-order Markov source and calibrating the distortion model with a piecewise approximation function. The proposed algorithms adjust the Lagrangian multiplier dynamically to improve the overall coding quality. Comprehensive experiments show that, compared with the JM reference software, our optimizations achieve up to 1.875 dB coding gain. Moreover, our algorithms offer more robust coding performance and introduce less computational overhead than Laplace-distribution-based methods. Their inherently short processing latency makes it possible to combine them with rate control. Finally, the proposed approach is also useful for the emerging HEVC standard.
{"title":"Lagrangian multiplier optimization using correlations in residues","authors":"Zhenyu Liu, Dongsheng Wang, Junwei Zhou, T. Ikenaga","doi":"10.1109/ICASSP.2012.6288099","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288099","url":null,"abstract":"Rate distortion optimization (RDO) algorithm plays the vital role in the up to date hybrid video codec H.264/AVC. The RDO algorithm of H.264/AVC reference software is built up by assuming that the transformed residues are memoryless variables. However, our experiments reveal that, for some sequences, the strong temporal correlations exist in the prediction residues. This paper extends the Lagrangian optimization techniques by modeling the transformed residues as the first-order Markov source and calibrating the distortion model with the piecewise approximation function. The proposed algorithms adjust the Lagrangian multiplier dynamically to improve the overall coding quality. Comprehensive experiments testify that, as compared with the JM reference software, our optimizations can achieve up to 1.875dB coding gain. Moreover, our algorithms posses more robust coding performance and introduce less computational overhead than the Laplace distribution based methods. The inherent short process latency makes it possible to cooperate our algorithms with rate control operation. Last but not least, the proposed approach is also useful for the emerging standard, HEVC.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"2 1","pages":"1185-1188"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85233219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
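The role of the Lagrangian multiplier in the mode decision discussed above can be illustrated with the usual cost J = D + lambda * R. The sketch below uses the QP-to-lambda mapping commonly cited for the JM reference software, 0.85 * 2^((QP - 12) / 3), as a fixed baseline; the paper's dynamic adjustment is not reproduced, and the candidate modes, distortions, and rates are made-up numbers.

```python
def jm_lambda(qp):
    """Commonly cited JM mode-decision multiplier as a function of QP."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def best_mode(candidates, lam):
    """Pick the candidate minimizing the Lagrangian cost J = D + lam * R."""
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])


# Hypothetical candidates: distortion as SSD, rate in bits.
candidates = [
    {"name": "skip", "distortion": 900.0, "rate": 1.0},
    {"name": "inter16x16", "distortion": 400.0, "rate": 40.0},
    {"name": "intra4x4", "distortion": 250.0, "rate": 120.0},
]
low_qp_choice = best_mode(candidates, jm_lambda(12))["name"]
high_qp_choice = best_mode(candidates, jm_lambda(28))["name"]
```

A larger multiplier penalizes rate more heavily, so the decision shifts from the low-distortion intra mode at QP 12 to the cheap skip mode at QP 28; an adaptive scheme changes exactly this trade-off per block or frame.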
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288213
R. Yagi, Tomohito Kajimoto, T. Nishitani
A compact implementation of a foreground segmentation processor in a multi-resolution transform domain is proposed for HDTV signals. The architecture is designed to simplify system control through hardware streaming and to reduce the required memory capacity. It lets pixels flow through all functional units in order, including the multi-resolution spatial transform and temporal segmentation. The resulting architecture uses no memories other than I/O buffers. Therefore, neither memory modules nor complex address manipulation over the multiple global transforms and the spatial/temporal interface are required. The FPGA prototype chip dissipates 150 mW of power. This approach can be used for tablets and smartphones through an ASIC implementation, which will reduce the operating power to about one sixth.
{"title":"GMM foreground segmentation processor based on address free pixel streams","authors":"R. Yagi, Tomohito Kajimoto, T. Nishitani","doi":"10.1109/ICASSP.2012.6288213","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288213","url":null,"abstract":"A compact implementation of a foreground segmentation processor in a multi-resolution transform domain has been proposed for HDTV signals. The proposed architecture is designed to simplify system controls by the hardware streaming and to reduce required memory capacities. It enables flowing pixels through all functional units in order, including multi-resolution spatial transform and temporal segmentation. The resultant architecture does not use memories except I/O buffers. Therefore, memory modules as well as complex address manipulation over the multiple global transforms and spatial/temporal interface is not required. The FPGA prototype chip dissipates 150 mW of power. This approach can be used for tablets and smart-phone by an ASIC implementation which will reduce the operation power to about 1/6.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"88 1","pages":"1653-1656"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85591359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288090
Zhengguo Li, Chuohao Yeo, Y. H. Tan, S. Rahardja
The existing structural similarity (SSIM) index comprises one term for luminance comparison and another term for contrast and structure comparison. In this paper, the SSIM index is first improved by introducing three weighting factors into the second term so that it adapts to the local intensities of the two images being compared. The improved SSIM (iSSIM) index is then extended to two images with possibly different exposures. Experimental results show that the proposed indices are more robust to large intensity changes between two images of the same scene, and more sensitive to two images of different scenes, than the existing SSIM index.
{"title":"A local intensity adaptive structural similarity index","authors":"Zhengguo Li, Chuohao Yeo, Y. H. Tan, S. Rahardja","doi":"10.1109/ICASSP.2012.6288090","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288090","url":null,"abstract":"Existing structural similarity (SSIM) index comprises of one term on luminance comparison and the other term on contrast and structure comparison. In this paper, the SSIM index is first improved by introducing three weighting factors to the second term such that it is adaptive to local intensities of two images to be compared. The improved SSIM (iSSIM) index is further extended for two images with possibly different exposures. Experimental results show that the proposed indices are more robust to large intensity changes of two images from the same scene and more sensitive to two images from different scenes than the existing SSIM index.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"56 1","pages":"1149-1152"},"PeriodicalIF":0.0,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85681993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
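The two-term structure of the standard SSIM index mentioned in this abstract can be sketched for a single window as below. This shows only the baseline index with the usual constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2; the three weighting factors of the proposed iSSIM are not given in the abstract and are not reproduced. Using population statistics over a flat list of pixels, rather than a sliding Gaussian window, is a simplification.

```python
def ssim_terms(x, y, dynamic_range=255.0):
    """Return (luminance term, contrast-structure term) for two pixel windows."""
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    luminance = (2 * mx * my + c1) / (mx * mx + my * my + c1)
    contrast_structure = (2 * cxy + c2) / (vx + vy + c2)
    return luminance, contrast_structure


def ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM as the product of the two terms."""
    lum, cs = ssim_terms(x, y, dynamic_range)
    return lum * cs


patch = [10.0, 20.0, 30.0, 40.0]
brighter = [p + 50.0 for p in patch]  # same structure, shifted intensity
lum, cs = ssim_terms(patch, brighter)
```

For a pure intensity shift the contrast-structure term stays at 1 while the luminance term drops, which illustrates why large exposure changes between two images of the same scene lower the standard index.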