2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100): latest articles

Adaptive filtering for non-Gaussian processes

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.861999

P. Kidmose

A new stochastic gradient robust filtering method, based on a non-linear amplitude transformation, is proposed. The method requires no a priori knowledge of the characteristics of the input signals, and it is insensitive to the signals' distribution and to their stationarity. A simulation study, applying both synthetic and real-world signals, shows that the proposed method has better overall robustness, in terms of modeling error, than state-of-the-art robust filtering methods. A remarkable property of the proposed method is that it can handle double-talk in the acoustic echo-cancellation problem.

Citations: 10
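
The abstract above does not say which amplitude transformation is used, so the following is only a minimal sketch of the general idea: a stochastic-gradient (LMS-style) filter whose update is driven by a saturating function of the error, so that large outliers have bounded influence. The `tanh` transform, the function name `robust_lms`, and all parameter values are assumptions for illustration, not the method of the paper.

```python
import math
import random

def robust_lms(x, d, taps=2, mu=0.05, scale=1.0):
    """LMS-style update driven by a saturating transform of the error.

    The tanh transform is an illustrative assumption; the abstract does not
    specify the non-linear amplitude transformation.
    """
    w = [0.0] * taps
    for n in range(taps - 1, len(x)):
        window = [x[n - k] for k in range(taps)]       # most recent samples first
        y = sum(wk * xk for wk, xk in zip(w, window))   # filter output
        e = d[n] - y                                    # modeling error
        g = scale * math.tanh(e / scale)                # bounded influence of large errors
        w = [wk + mu * g * xk for wk, xk in zip(w, window)]
    return w

# Hypothetical noise-free system identification problem.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
true_w = [0.5, -0.3]
d = [0.0] + [true_w[0] * x[n] + true_w[1] * x[n - 1] for n in range(1, len(x))]
w = robust_lms(x, d)
```

For small errors `tanh` is nearly linear, so the sketch behaves like ordinary LMS; for large errors the step is capped at roughly `mu * scale` per unit of input.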

Using SIMD instructions for fast likelihood calculation in LVCSR

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.861948

Stephan Kanthak, Kai Schütz, H. Ney

Most modern processor architectures provide SIMD (single instruction, multiple data) instructions to speed up algorithms based on vector or matrix operations. This paper describes the use of SIMD instructions to calculate Gaussian or Laplacian densities in a large-vocabulary speech recognition system. We present a simple, robust method based on scalar quantization of the mean and observation vector components that incurs no loss in recognition performance while speeding up the whole system's runtime by a factor of 3. Combining the approach with vector space partitioning techniques accelerated the overall system by a factor of over 7. The experiments show that the approach can also be applied to Viterbi training without any loss of accuracy. All experiments were conducted on a German, 10,000-word, spontaneous speech task using two architectures, namely Intel Pentium III and SUN UltraSPARC.

Citations: 52
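
To make the scalar-quantization idea in the abstract above concrete, here is a rough sketch in plain Python. It quantizes each component of the observation and mean vectors to an 8-bit code and scores a unit-scale Laplacian density by an integer sum of absolute differences, which is the kind of operation that packed SIMD instructions handle well. The quantization range, the unit-scale assumption, and the helper names (`quantize`, `l1_distance_int`, `best_density`) are hypothetical and are not taken from the paper.

```python
def quantize(vec, lo, hi, levels=256):
    """Map each float component to an integer code in [0, levels - 1]."""
    step = (hi - lo) / (levels - 1)
    return [min(levels - 1, max(0, round((v - lo) / step))) for v in vec]

def l1_distance_int(a, b):
    """Integer sum of absolute differences between two code vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_density(obs, means, lo=-4.0, hi=4.0):
    """Index of the mean with the smallest quantized L1 distance to obs.

    For Laplacian densities with a shared unit scale (an assumption here),
    the smallest L1 distance corresponds to the highest likelihood.
    """
    q_obs = quantize(obs, lo, hi)
    scores = [l1_distance_int(q_obs, quantize(m, lo, hi)) for m in means]
    return min(range(len(means)), key=scores.__getitem__)

means = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [-2.0, 0.5, 3.0]]
winner = best_density([0.9, 1.2, 0.8], means)
```

In a real recognizer the means would be quantized once offline, and the inner loop would be written with vector intrinsics rather than Python lists.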

Statistical analysis of a signal separation method based on second order statistics

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.861858

T. Gustafsson, U. Lindgren, H. Sahlin

This paper explores an existing method for signal separation, which is based on second order statistics. Here, statistical analysis of a generalized version of the original algorithm is given. The generalized method includes a weighting matrix, and a result of the statistical analysis is that the best possible weighting is found. The problem of initialization of the involved non-linear optimization is also discussed.

Citations: 6

Behavior of a Bayesian adaptation method for incremental enrollment in speaker verification

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.859180

C. Fredouille, J. Mariéthoz, C. Jaboulet, J. Hennebert, C. Mokbel, F. Bimbot

Classical adaptation approaches are generally used for speaker or environment adaptation of speech recognition systems. In this paper, we use such techniques for the incremental training of client models in a speaker verification system. The initial model is trained on a very limited amount of data and then progressively updated with access data, using a segmental-EM procedure. In supervised mode (i.e. when access utterances are certified), the incremental approach yields performance equivalent to the batch one. We also investigate the impact of various scenarios of impostor attacks during the incremental enrollment phase. All results are obtained with the Picassoft platform, the state-of-the-art speaker verification system developed in the PICASSO project.

Citations: 42

Blind maximum SINR receiver for the DS-CDMA downlink

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.860941

D. Slock, I. Ghauri

We address the problem of downlink interference rejection in a DS-CDMA system. Periodic orthogonal Walsh-Hadamard sequences spread different users' symbols, followed by scrambling by a symbol-aperiodic, base-station-specific overlay sequence. The point-to-point propagation channel from the cell site to a given mobile station is the same for all downlink signals (the desired user as well as the intracell interference). Orthogonality of the underlying Walsh-Hadamard sequences is destroyed by multipath propagation, resulting in multiuser interference if a coherent combiner (the RAKE receiver) is employed. In this paper, we propose a blind linear equalization algorithm which equalizes the common downlink channel, thus rendering the user signals orthogonal again. A simple code matched filter subsequently suffices to cancel the multiple access interference (MAI) from intracell users. It is shown that the receiver maximizes the signal-to-interference-plus-noise ratio (SINR) at its output.

Citations: 22

Some new results on the discrete bispectrum

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.861915

M. Colas, G. Gelle, G. Delaunay

This paper presents some new results concerning the bispectrum of sampled signals. We show that sampling a stationary signal at Fs = 2B usually implies a non-zero outer triangle (OT) domain in the bispectrum due to overlapping. Moreover, we point out that the bispectrum of processes (stationary or not) sampled at Fs > 3B is always zero in the OT (no overlapping). Finally, we propose an empirical method in which a non-zero OT indicates that the signal is truly non-stationary, and we propose to combine this approach with the Hinich stationarity test (Hinich 1990, and Hinich and Messer 1995).

Citations: 0
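
For readers unfamiliar with the quantity discussed in the abstract above, the sketch below computes the usual single-record discrete bispectrum, B(k, l) = X(k) X(l) X*(k + l), from a direct DFT. It illustrates only the definition, not the outer-triangle analysis or the stationarity test of the paper; the test signal and the function names are assumptions chosen for illustration.

```python
import cmath
import math

def dft(x):
    """Direct O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def bispectrum(x, k, l):
    """Single-record bispectrum B(k, l) = X(k) X(l) conj(X(k + l))."""
    X = dft(x)
    n = len(x)
    return X[k % n] * X[l % n] * X[(k + l) % n].conjugate()

# Three phase-coupled cosines at bins 3, 5 and 3 + 5 = 8 (hypothetical signal).
N = 32
x = [math.cos(2 * math.pi * 3 * t / N)
     + math.cos(2 * math.pi * 5 * t / N)
     + math.cos(2 * math.pi * 8 * t / N) for t in range(N)]

coupled = bispectrum(x, 3, 5)     # each DFT bin equals N/2, so this is (N/2)**3
uncoupled = bispectrum(x, 3, 4)   # bin 4 is empty, so this is numerically zero
```

A practical estimator would average this product over many segments and use an FFT instead of the direct transform.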

Code-length-based universal extraction for blind signal separation

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.860136

G. Orsak, S. Douglas

We propose a blind signal separation algorithm (CLUE) that uses the sum of the individual code lengths of the extracted signals as a measure of the separation performance. The new technique combines a widely-available universal data compression routine with any single-parameter search procedure. Unlike previous approaches, the proposed method is model-free and does not rely on the moment values of the signals for its separation performance. An example shows the algorithm's efficiency in separating mixtures of image, audio, and text data.

Citations: 8
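
The code-length criterion in the abstract above can be sketched as follows, under strong simplifying assumptions: two mixtures, an orthogonal (rotation) un-mixing matrix so that a single angle is the only parameter, `zlib` standing in for the "widely-available universal data compression routine", and a plain grid search standing in for the single-parameter search procedure. The names `code_length` and `separate`, the 64-level quantizer, and the grid size are hypothetical and are not taken from the paper.

```python
import math
import zlib

def code_length(signal, levels=64):
    """Compressed size in bytes of the signal after uniform quantization."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0                       # avoid division by zero
    data = bytes(int((v - lo) / span * (levels - 1)) for v in signal)
    return len(zlib.compress(data, 9))

def separate(x1, x2, steps=90):
    """Grid search over a rotation angle in [0, pi/2).

    Returns the angle whose two rotated outputs have the smallest total
    compressed length.
    """
    best_cost, best_theta = None, 0.0
    for k in range(steps):
        theta = k * (math.pi / 2) / steps
        c, s = math.cos(theta), math.sin(theta)
        y1 = [c * a + s * b for a, b in zip(x1, x2)]
        y2 = [-s * a + c * b for a, b in zip(x1, x2)]
        cost = code_length(y1) + code_length(y2)
        if best_cost is None or cost < best_cost:
            best_cost, best_theta = cost, theta
    return best_theta
```

Calling `separate(x1, x2)` on two equal-length mixtures returns the selected angle; whether that angle actually un-mixes the sources depends on the data and on how well the compressor captures their structure.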

Spectral modification for concatenative speech synthesis

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.859116

J. Wouters, Michael W. Macon

Concatenative synthesis can produce high-quality speech but is limited to the allophonic variations and voice types that were captured in the database. It would be desirable to modify speech units to remove formant discontinuities and to create new speaking styles, such as hypo- or hyper-articulated speech. Unfortunately, manipulating the spectral structure often leads to degraded speech quality. We investigate two speech modification strategies, one based on inverse filtering and the other on sinusoidal modeling, and we explain their merits and shortcomings for changing the spectral envelope in speech. We then propose a method which uses sinusoidal modeling and represents the complex sinusoidal amplitudes by an all-pole model. The all-pole model approximates the sinusoidal spectrum well, both in the amplitude and in the phase domain. We use the sinusoidal+all-pole model to control the spectral envelope in recorded speech. High-quality modified speech is generated from the model using sinusoidal synthesis. A perceptual test was conducted, which shows that the model was effective at changing vowel identities and was preferred over residual-excited LPC.

Citations: 16

An order based segmentation algorithm for low power implementation of digital filters

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.860094

A. Erdogan, T. Arslan

The paper presents a new algorithm for low power implementation of digital filters. The algorithm reduces power consumption through a two-phase strategy, which targets reducing the switched capacitance within the multiplier circuit. The first phase involves the segmentation of coefficients into more primitive components, which can in turn be processed through a single shift and a more primitive multiplication operation. The second phase exploits the correlation among the new set of coefficients at the coefficient input of the multiplier for a further reduction in switched capacitance. The paper describes the algorithm and its evaluation environment and provides results for a number of filter examples, demonstrating up to 65% reduction in power compared to conventional filtering.

Citations: 1

Fuzzy trellis vector quantization of images

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)

Pub Date : 2000-06-05 DOI: 10.1109/ICASSP.2000.859195

T. Haddad, A. Yongaçoğlu

This paper introduces a new codebook search algorithm for trellis vector quantization (TVQ) systems. The development of the new algorithm is based on the symbol-MAP channel decoding algorithm, which is modified for data compression to deliver soft distortion-related reliability information. Following a rate-distortion theoretic approach, the soft information is used to derive a codebook search algorithm that is capable of solving the problems associated with the LBG algorithm. The derived algorithm is fuzzy in the sense that it follows a soft association rule; however, it is deterministic in the descent towards the global minimum distortion point. Although the derivation is general, the algorithm is tested using gray-scale images, which provide a nonconvex square-error distortion surface. As shown in the simulation section, the new algorithm provides lower energy codebooks (~0.8 dB gain), while being significantly less sensitive to initial codebooks when using short training image sequences.

Citations: 1
