2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Articles


Channel and sensing aware channel access policy for multi-channel cognitive radio networks


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288581

Shu-Hsien Wang, Chih-yu Hsu, Y. Hong

We propose a reservation-based channel access policy for multi-channel cognitive radio networks. To enhance the throughput of secondary users (SUs), SUs are allowed to select channels opportunistically according to both the local channel state information (CSI) and the spectrum sensing outcomes. SUs then compete for the right to transmit on the chosen channel by sending reservation packets to the access point sequentially according to their local CSI. We further devise a threshold on channel gains such that only the SUs whose channel gains are sufficiently high can reserve channels, which limits the interference from SUs to the licensed network. A channel-aware splitting algorithm is adopted to schedule the SU with the highest channel gain to transmit at each time instant. Simulations show that the proposed channel access policy outperforms policies that take into consideration only CSI or only sensing outcomes.


Citations: 2

Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288876

Xiong Xiao, Chng Eng Siong, Haizhou Li

In this paper, we propose a framework for joint normalization of the spectral and temporal statistics of speech features for robust speech recognition. Current feature normalization approaches normalize the spectral and temporal aspects of feature statistics separately to overcome noise and reverberation. As a result, the interaction between spectral normalization (e.g. mean and variance normalization, MVN) and temporal normalization (e.g. temporal structure normalization, TSN) is ignored. We propose a joint spectral and temporal normalization (JSTN) framework to normalize these two aspects of feature statistics simultaneously. In JSTN, feature trajectories are filtered by linear filters whose coefficients are optimized by maximizing a likelihood-based objective function. Experimental results on the Aurora-5 benchmark task show that JSTN consistently outperforms the cascade of MVN and TSN on test data corrupted by both additive noise and reverberation, which validates our proposal. Specifically, JSTN achieves a relative reduction of 8-9% in average word error rate over the cascade of MVN and TSN for both artificial and real noisy data.
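For readers unfamiliar with the MVN baseline named in the abstract, the following is a minimal sketch of conventional per-utterance mean and variance normalization. It is not the JSTN method and does not cover TSN; the function name `mvn`, the list-of-frames layout, and the `eps` guard are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def mvn(features, eps=1e-8):
    """Per-utterance mean and variance normalization (illustrative sketch).

    `features` is a list of frames; each frame is a list of floats with one
    value per feature dimension. `eps` (an assumed guard) avoids division by
    zero for constant trajectories.
    """
    # Transpose so that each entry is the trajectory of one feature dimension.
    trajectories = list(zip(*features))
    normalized = []
    for traj in trajectories:
        mu = fmean(traj)
        sigma = pstdev(traj, mu)
        # Shift to zero mean and scale to (approximately) unit variance.
        normalized.append([(v - mu) / (sigma + eps) for v in traj])
    # Transpose back to frame-major order.
    return [list(frame) for frame in zip(*normalized)]


frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized_frames = mvn(frames)
```

Each feature dimension is normalized independently over time, which is why this step alone cannot account for interactions with a subsequent temporal filter.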



Citations: 10

Bias analysis of source localization using the maximum likelihood estimator


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288450

Liyang Rui, K. C. Ho

The nonlinear nature of the source localization problem introduces bias into the location estimate. This bias can play a significant role in limiting the performance of localization and tracking when multiple measurements at different instants are available. This paper performs a bias analysis of the source location estimate obtained by the maximum likelihood estimator, where the positioning measurements can be TOA, TDOA, or AOA. The effect of bias on the mean-square localization error is examined, and the amounts of bias introduced by the three types of measurements are contrasted.


Citations: 24

Expected-utility-based sensor selection for state estimation


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288470

David M. Cohen, Douglas L. Jones, S. Narayanan

Applications such as long-term environmental monitoring and large-scale surveillance demand reliable performance from sensor nodes while operating within strict energy constraints. There is often not enough power for sensors to make measurements all of the time. In these cases, one must decide when to run each sensor. To this end, we develop a one-step optimal sensor-scheduling algorithm based on expected-utility maximization. “Utility” is an application-specific measure of the benefit from a given sensor measurement. In sensing environments that can be modeled using a hidden Markov model, selecting the appropriate combination of sensors at each time instant enables maximization of the expected utility while operating within an energy budget. For some budgets, the utility-based algorithm shows more than 300% utility gains over a constant duty-cycle scheme designed to consume the same amount of energy. These benefits are dependent on the energy budget.


Citations: 2

Model centroids for the simplification of Kernel Density estimators


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6287989

Olivier Schwander, F. Nielsen

Gaussian mixture models are a widespread tool for modeling varied and complex probability density functions. They can be estimated using Expectation-Maximization or Kernel Density Estimation. Expectation-Maximization leads to compact models but may be expensive to compute, whereas Kernel Density Estimation yields large models which are cheap to build. In this paper we present new methods to obtain high-quality models that are both compact and fast to compute. This is accomplished with clustering methods and centroid computation. The quality of the resulting mixtures is evaluated in terms of log-likelihood and Kullback-Leibler divergence using examples from a bioinformatics application.
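To make concrete why a kernel density estimate is a large but cheap model, the sketch below treats a one-dimensional Gaussian KDE as a mixture with one equally weighted component per sample and evaluates its log-likelihood on some points. It does not implement the paper's clustering or centroid-based simplification; the function name, the 1-D setting, and the fixed `bandwidth` parameter are assumptions.

```python
import math


def kde_log_likelihood(samples, points, bandwidth):
    """Log-likelihood of `points` under a 1-D Gaussian KDE (illustrative).

    The KDE is a Gaussian mixture with len(samples) components, each centred
    on one sample, with weight 1/n and standard deviation `bandwidth`.
    """
    n = len(samples)
    # Common factor: mixture weight 1/n times the Gaussian normalizer.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for p in points:
        # Building the model costs nothing, but every evaluation sums over
        # all n components, which is what motivates simplification.
        density = norm * sum(
            math.exp(-0.5 * ((p - s) / bandwidth) ** 2) for s in samples
        )
        total += math.log(density)
    return total


near = kde_log_likelihood([0.0, 1.0], [0.5], 1.0)
far = kde_log_likelihood([0.0, 1.0], [5.0], 1.0)
```

A point lying between the samples (`near`) receives a higher log-likelihood than a distant one (`far`), and the evaluation cost grows linearly with the number of samples.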


Citations: 12

User recommendation with tensor factorization in social networks


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288758

Zhenlei Yan, Jie Zhou

The rapid growth of the user population in social networks has posed a challenge to existing systems that recommend new friends with similar interests to a user. In this paper, we address this user recommendation problem in social networks by proposing a novel framework which utilizes users' tagging information with tensor factorization. This work makes two major contributions: (1) a tensor model is proposed to capture the potential associations among users, their interests, and their friends in social tagging systems; (2) a novel approach is proposed to recommend new friends based on this model. Experiments on a real-world dataset crawled from Last.fm show that the proposed method outperforms other state-of-the-art approaches.


Citations: 10

Adaptive kernel principal components tracking


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288276

Toshihisa Tanaka, Y. Washizawa, A. Kuh

Adaptive online algorithms for simultaneously extracting nonlinear eigenvectors of kernel principal component analysis (KPCA) are developed. KPCA needs all the observed samples to represent its basis functions, and an eigenvalue problem whose size equals the number of samples must be solved. This paper reformulates KPCA and derives an expression in Euclidean space to which an algorithm for tracking generalized eigenvectors is applicable. The developed algorithms are of least mean squares (LMS) type and recursive least squares (RLS) type. A numerical example is then presented to support the analysis.


Citations: 8

Lagrangian multiplier optimization using correlations in residues


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288099

Zhenyu Liu, Dongsheng Wang, Junwei Zhou, T. Ikenaga

The rate distortion optimization (RDO) algorithm plays a vital role in the state-of-the-art hybrid video codec H.264/AVC. The RDO algorithm of the H.264/AVC reference software is built on the assumption that the transformed residues are memoryless variables. However, our experiments reveal that, for some sequences, strong temporal correlations exist in the prediction residues. This paper extends the Lagrangian optimization techniques by modeling the transformed residues as a first-order Markov source and calibrating the distortion model with a piecewise approximation function. The proposed algorithms adjust the Lagrangian multiplier dynamically to improve the overall coding quality. Comprehensive experiments show that, compared with the JM reference software, our optimizations achieve up to 1.875 dB of coding gain. Moreover, our algorithms possess more robust coding performance and introduce less computational overhead than the Laplace distribution based methods. The inherently short processing latency makes it possible to combine our algorithms with the rate control operation. Last but not least, the proposed approach is also useful for the emerging standard, HEVC.
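As background for the role of the Lagrangian multiplier, the sketch below shows the generic rate-distortion decision rule: among candidate coding modes, pick the one that minimizes J = D + λR. It does not reproduce the paper's Markov-source model or its multiplier update; the function `best_mode`, the mode names, and all distortion and rate numbers are hypothetical values chosen only to illustrate the trade-off.

```python
def best_mode(candidates, lam):
    """Return the candidate with the smallest Lagrangian cost J = D + lam * R."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])


# Hypothetical candidates: distortion and rate values are made up.
modes = [
    {"name": "skip", "distortion": 120.0, "rate": 1.0},
    {"name": "inter", "distortion": 40.0, "rate": 30.0},
    {"name": "intra", "distortion": 25.0, "rate": 90.0},
]

# A small multiplier favours low distortion; a large one favours low rate.
choice_low = best_mode(modes, 0.1)["name"]
choice_mid = best_mode(modes, 0.5)["name"]
choice_high = best_mode(modes, 5.0)["name"]
```

Because the selected mode changes with λ, adjusting the multiplier per sequence or per frame changes the encoder's decisions without altering the candidate set.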



Citations: 5

GMM foreground segmentation processor based on address free pixel streams


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288213

R. Yagi, Tomohito Kajimoto, T. Nishitani

A compact implementation of a foreground segmentation processor in a multi-resolution transform domain is proposed for HDTV signals. The architecture is designed to simplify system control through hardware streaming and to reduce the required memory capacity. It lets pixels flow through all functional units in order, including the multi-resolution spatial transform and temporal segmentation. The resulting architecture uses no memories other than I/O buffers. Therefore, neither memory modules nor complex address manipulation across the multiple global transforms and the spatial/temporal interface are required. The FPGA prototype chip dissipates 150 mW of power. The approach can be applied to tablets and smartphones through an ASIC implementation, which will reduce the operating power to about 1/6.


Citations: 3

A local intensity adaptive structural similarity index


Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288090

Zhengguo Li, Chuohao Yeo, Y. H. Tan, S. Rahardja

The existing structural similarity (SSIM) index comprises one term for luminance comparison and another term for contrast and structure comparison. In this paper, the SSIM index is first improved by introducing three weighting factors into the second term so that it adapts to the local intensities of the two images being compared. The improved SSIM (iSSIM) index is further extended to two images with possibly different exposures. Experimental results show that the proposed indices are more robust to large intensity changes between two images of the same scene and more sensitive to two images of different scenes than the existing SSIM index.
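The two-term structure described above can be seen in a minimal sketch of the conventional SSIM index, computed here over a whole patch given as a flat list of intensities rather than over sliding local windows. This is the existing SSIM only, not the proposed iSSIM or its weighting factors; the function name `ssim_global` and the flat-list input are assumptions, while `k1 = 0.01` and `k2 = 0.03` are the commonly used default constants.

```python
from statistics import fmean


def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Conventional SSIM over two equal-length intensity lists (len >= 2)."""
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    n = len(x)
    mx, my = fmean(x), fmean(y)
    # Unbiased sample variances and covariance.
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # First term: luminance comparison.
    luminance = (2.0 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    # Second term: combined contrast and structure comparison.
    contrast_structure = (2.0 * cxy + c2) / (vx + vy + c2)
    return luminance * contrast_structure


patch = [10.0, 50.0, 90.0, 130.0]
brighter = [v + 60.0 for v in patch]  # same structure, shifted intensity
score_same = ssim_global(patch, patch)
score_shifted = ssim_global(patch, brighter)
```

In this example the intensity shift leaves the second term at 1 and lowers only the luminance term, so a pure brightness change between two views of the same content already reduces the conventional score.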


Citations: 0
