
ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing: Latest Articles

Frequency hopping patterns for simultaneous multiple-beam sonar imaging
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169477
P. Cassereau, J. Jaffe
This paper describes the design of frequency-hopped signals for a multi-beam imaging system. A frequency hopping pattern is a frequency-coded uniform pulse train. The signal is divided into M time intervals, with each interval assigned a different frequency chosen from a set of N frequencies. A set of N patterns composed of N-1 frequencies can be generated using first-order Reed-Solomon codewords. These patterns exhibit very good correlation properties. In a frequency-hopped multi-beam imaging system, each beam is associated with a pattern and transmits a coded waveform. All N beams can be transmitted simultaneously resulting in a high scan-rate, high resolution imaging device. Furthermore, in the presence of noise and medium spreading effects, a frequency-hopped imaging device performs better than conventional systems by showing better noise rejection and less sensitivity to spreading effects.
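The pattern construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: N is taken to be a prime p, the pattern length is M = N-1, and "first-order Reed-Solomon codeword" is read as the evaluations of the polynomial a + x at successive powers of a primitive element alpha of GF(p). The function names `hopping_patterns` and `cyclic_hits` are hypothetical and do not come from the paper.

```python
def hopping_patterns(p, alpha):
    """Return p hopping patterns of length p - 1 over the frequency indices 0..p-1.

    Assumes p is prime and alpha is a primitive element modulo p.
    Pattern a is the sequence (a + alpha**k) mod p for k = 0..p-2,
    so it visits every frequency index except a exactly once.
    """
    powers = [pow(alpha, k, p) for k in range(p - 1)]
    return [[(a + x) % p for x in powers] for a in range(p)]


def cyclic_hits(u, v, shift):
    """Count time slots where u and the cyclically shifted v use the same frequency."""
    n = len(u)
    return sum(1 for k in range(n) if u[k] == v[(k + shift) % n])


patterns = hopping_patterns(7, 3)  # 3 is a primitive element modulo 7
for row in patterns:
    print(row)
```

With this construction, any two distinct patterns (or one pattern and a nonzero cyclic shift of itself) coincide in at most one time slot, which is one way to read the "very good correlation properties" mentioned in the abstract.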
Citations: 9
An improved, highly parallel rank-one eigenvector update method with signal processing applications
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169500
R. DeGroat, R. Roberts
In this paper, we discuss rank-one eigenvector updating schemes that are appropriate for tracking time-varying, narrow-band signals in noise. We show that significant reductions in computation are achieved by updating the eigenvalue decomposition (EVD) of a reduced rank version of the data covariance matrix, and that reduced rank updating yields a lower threshold breakdown than full rank updating. We also show that previously published eigenvector updating algorithms [1], [10] suffer from a linear build-up of roundoff error, which becomes significant when large numbers of recursive updates are performed. We then show that exponential weighting together with pairwise Gram-Schmidt partial orthogonalization at each update virtually eliminates the build-up of error, making the rank-one update a useful numerical tool for recursive updating. Finally, we compare the frequency estimation performance of reduced rank weighted linear prediction and the LMS algorithm.
Citations: 4
Context-dependent phonetic Markov models for large vocabulary speech recognition
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169604
Anne-Marie Derouault
One approach to large vocabulary speech recognition is to build phonetic Markov models and to concatenate them to obtain word models. In previous work, we designed a recognizer based on 40 phonetic Markov machines, which accepts a 10,000-word vocabulary ([3]) and, more recently, a 200,000-word vocabulary ([5]). Since there is one machine per phoneme, these models obviously do not account for coarticulatory effects, which may lead to recognition errors. In this paper, we improve the phonetic models by using general principles about coarticulation effects on automatic phoneme recognition. We show that both the analysis of the errors made by the recognizer and linguistic facts about phonetic context influence suggest a method for choosing context-dependent models. This method limits the growth in the number of phonemes while still accounting for the most important coarticulation effects. We present our experiments with a system applying these principles to a set of models for French. With this new system including context-dependent machines, the phoneme recognition rate goes from 82.2% to 85.3%, and the word error rate with a 10,000-word dictionary decreases from 11.2% to 9.8%.
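As a rough illustration of building a word model by concatenation while using context-dependent units only where they are defined, consider the sketch below. The abstract does not say which contexts the authors selected, so keying on the right-hand neighbour, the `"#"` word-boundary symbol, the unit names, and the function name `word_model` are all assumptions made for this example.

```python
def word_model(phonemes, context_units):
    """Return the sequence of model names to concatenate for one word.

    context_units maps (phoneme, right_neighbour) to the name of a
    context-dependent model. Any phoneme without an entry falls back to
    its single context-independent model, which keeps the inventory small.
    """
    units = []
    for i, phoneme in enumerate(phonemes):
        # "#" marks the word boundary (hypothetical convention).
        right = phonemes[i + 1] if i + 1 < len(phonemes) else "#"
        units.append(context_units.get((phoneme, right), phoneme))
    return units


# Hypothetical inventory: only /a/ before /n/ gets its own model.
inventory = {("a", "n"): "a+n"}
print(word_model(["b", "a", "n"], inventory))
```

Only the contexts listed in the inventory add new models, which mirrors the stated goal of covering the most important coarticulation effects without a large increase in the number of units.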
Citations: 35
A stochastic segment model for phoneme-based continuous speech recognition
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169700
Salim Roukos, M. O. Dunham
Developing accurate and robust phonetic models for the different speech sounds is a major challenge for high performance continuous speech recognition. In this paper, we introduce a new approach, called the stochastic segment model, for modelling a variable-length phonetic segment X, an L-long sequence of feature vectors. The stochastic segment model consists of 1) time-warping the variable-length segment X into a fixed-length segment Y called a resampled segment, and 2) a joint density function of the parameters of the resampled segment Y, which in this work is assumed Gaussian. In this paper, we describe the stochastic segment model, the recognition algorithm, and the iterative training algorithm for estimating segment models from continuous speech. For speaker-dependent continuous speech recognition, the segment model reduces the word error rate by one third over a hidden Markov phonetic model.
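The first step of the model, warping a variable-length segment X into a fixed-length resampled segment Y, might look like the sketch below. Linear interpolation along the time axis is an assumption chosen for illustration, since the abstract does not specify the warping rule, and the function name `resample_segment` is hypothetical. The Gaussian density over Y is not shown.

```python
def resample_segment(frames, m):
    """Warp an L-long list of feature vectors into exactly m vectors.

    Output frame j is taken at fractional time j * (L - 1) / (m - 1) and
    linearly interpolated between the two nearest input frames.
    """
    length = len(frames)
    if length == 0 or m <= 0:
        raise ValueError("need at least one input frame and m >= 1")
    resampled = []
    for j in range(m):
        t = j * (length - 1) / (m - 1) if m > 1 else 0.0
        i = int(t)
        frac = t - i
        nxt = min(i + 1, length - 1)
        resampled.append(
            [(1.0 - frac) * a + frac * b for a, b in zip(frames[i], frames[nxt])]
        )
    return resampled


# A 3-frame segment of 1-dimensional features warped to 5 frames.
print(resample_segment([[0.0], [2.0], [4.0]], 5))
```

Once every segment has the same length m, its frames can be stacked into one vector of fixed dimension, which is what makes a single joint density per phoneme possible.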
Citations: 14
Exact recursive least squares algorithms for ARMA modeling
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169813
S. Prasad, S. Joshi
This paper presents an entirely new approach to the development of "exact" recursive least squares algorithms for ARMA filtering and modeling when the inputs (assumed here to be "white") are not observable. The approach is heavily based on the recently proposed "predictor-space" representation of ARMA processes [3] and the use of some new, more general projection operator update formulas, briefly summarized here.
Citations: 1
A fast QR/Frequency-domain RLS adaptive filter
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169610
J. Cioffi
There has been considerable recent interest in QR factorization for recursive solution to the least-squares adaptive-filtering problem, mainly because of the good numerical properties of QR factorizations. Early work by Gentleman and Kung (1981) and McWhirter (1983) has produced triangular systolic arrays of N^2/2 processors that solve the Recursive Least Squares (RLS) adaptive-filtering problem (where N is the size of the adaptive filter). Here, we introduce a more computationally efficient solution to the QR RLS problem that requires only O(N) computations per time update, when the input has the usual shift-invariant property. Thus, computation and implementation requirements are reduced by an order of magnitude. The new algorithms are based on a structure that is neither a transversal filter nor a lattice, but can be best characterized by a functionally equivalent set of parameters that represent the time-varying "least-squares frequency transforms" of the input sequences. Numerical stability can be ensured by implementing computations as 2 × 2 orthogonal (Givens) rotations.
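For readers unfamiliar with the 2 × 2 orthogonal (Givens) rotation mentioned at the end of the abstract, here is a minimal sketch of that single building block. It does not reproduce the fast O(N) algorithm of the paper, and the function names `givens` and `rotate` are chosen only for this example.

```python
import math


def givens(a, b):
    """Return (c, s) such that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0  # nothing to annihilate; use the identity rotation
    return a / r, b / r


def rotate(c, s, x, y):
    """Apply the rotation [[c, s], [-s, c]] to the pair (x, y)."""
    return c * x + s * y, -s * x + c * y


c, s = givens(3.0, 4.0)
print(rotate(c, s, 3.0, 4.0))  # second component is annihilated
```

Because c*c + s*s = 1, each rotation preserves the Euclidean norm of the data it touches, which is the usual explanation for the good numerical behaviour of QR-based updates.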
Citations: 28
Array signal processing with interconnected Neuron-like elements
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169330
R. Rastogi, P. Gupta, R. Kumaresan
Estimation of angles of arrival of plane waves from data observed at an array of sensors is performed with a network of interconnected, instantaneous, saturating non-linear elements called neurons. The networks use the observed data to decide which among a large number of hypothesized angles of arrival best fits the data. A possible stochastic-digital implementation of such a network is also indicated.
Citations: 45
Distance measure for speech recognition based on the smoothed group delay spectrum
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169476
F. Itakura, T. Umezaki
We present a novel spectral distance measure based on the smoothed LPC group delay spectrum which gives a stable recognition performance under variable frequency transfer characteristics and additive noise. The weight of the n-th cepstral coefficient in our measure is given by $W_n = n^s \exp(-n^2 / 2\tau^2)$, which can be adjusted by selecting proper values of $s$ and $\tau$. In order to optimize the parameters of this distance measure, extensive experiments are carried out in a speaker-dependent isolated word recognition system using a standard dynamic time warping technique. The input speech data used here is a set of 68 phonetically very similar Japanese city name pairs spoken by male speakers. The experimental results show that our distance measure gives a robust recognition rate in spite of the variation in frequency characteristics and signal-to-noise ratio (SNR). In noisy situations with a segmental SNR of 20 dB, the recognition rate was more than 13% higher than that obtained by using the standard Euclidean cepstral distance measure. Finally, it is shown that the optimum value of $s$ is approximately 1, and the optimum range of $\tau \Delta T$ is about 1 ms.
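The cepstral weighting in the abstract, W_n = n^s exp(-n^2 / (2 tau^2)), is easy to evaluate directly. The sketch below computes the weights and uses them in a weighted squared cepstral distance. The exact form of the distance is an assumption made for illustration, tau is expressed here in units of the cepstral index n rather than in milliseconds, and the function names are hypothetical.

```python
import math


def lifter_weights(n_coeffs, s, tau):
    """Weights W_n = n**s * exp(-n**2 / (2 * tau**2)) for n = 1..n_coeffs."""
    return [
        n ** s * math.exp(-(n * n) / (2.0 * tau * tau))
        for n in range(1, n_coeffs + 1)
    ]


def weighted_cepstral_distance(c1, c2, s, tau):
    """Sum of squared weighted differences between two cepstral vectors.

    c1[0] and c2[0] are taken to be the first cepstral coefficient (n = 1).
    """
    weights = lifter_weights(len(c1), s, tau)
    return sum((w * (a - b)) ** 2 for w, a, b in zip(weights, c1, c2))


print(lifter_weights(4, 1.0, 2.0))
print(weighted_cepstral_distance([1.0, 0.5], [0.8, 0.1], 1.0, 2.0))
```

With s = 1 the low-order coefficients are de-emphasised, and the Gaussian factor suppresses the high-order ones, so the weighting acts as a band-pass lifter on the cepstrum.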
Citations: 65
Signal representation and processing in the mixed time-frequency domain
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169487
K. Yu
This paper is concerned with mixed time-frequency signal processing using the Wigner Distribution Function (WDF). This approach is based upon the generation of the mixed time-frequency representation (MTFR) of a signal, processing of that representation in the mixed time-frequency domain, and obtaining a filtered output by an inverse operation or approximation procedure. Various signal processing operations are formulated. Validity conditions for the resulting MTFR are investigated. An inverse operation can be applied as an exact procedure if the output MTFR is a valid WDF; it can be regarded as an approximation procedure if the resulting MTFR is not admissible. Other approximations can be formulated. One scheme depends on the projection of the MTFR onto the space of valid WDFs. The other scheme depends on the modification of the filtering function in a minimal way such that the resulting MTFR is valid.
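The first stage of this approach, generating the time-frequency representation, can be illustrated with a direct evaluation of a discrete Wigner distribution at one time index. This is a sketch under assumptions: the discretisation, the truncation of the lag range at the ends of the record, and the function name `wigner_slice` are choices made for this example, and the filtering and inversion stages are not shown.

```python
import cmath


def wigner_slice(x, n, n_freq):
    """Discrete Wigner distribution of the sequence x at time index n.

    Returns n_freq complex values whose imaginary parts are zero up to
    rounding. Bin k corresponds to k / (2 * n_freq) cycles per sample,
    because the lag product x[n + m] * conj(x[n - m]) advances two
    samples per unit of lag m.
    """
    max_lag = min(n, len(x) - 1 - n)
    row = []
    for k in range(n_freq):
        acc = 0j
        for m in range(-max_lag, max_lag + 1):
            kernel = x[n + m] * x[n - m].conjugate()
            acc += kernel * cmath.exp(-2j * cmath.pi * k * m / n_freq)
        row.append(acc)
    return row


# Complex exponential at 0.125 cycles per sample; energy concentrates in one bin.
signal = [cmath.exp(2j * cmath.pi * 0.125 * i) for i in range(17)]
row = wigner_slice(signal, 8, 16)
print(max(range(16), key=lambda k: row[k].real))
```

Not every real-valued function of time and frequency is the Wigner distribution of some signal, which is why the abstract distinguishes valid from inadmissible MTFRs after filtering.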
Citations: 5
Signal modeling by exponential segments and application in voiced speech analysis
Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169572
S. Parthasarathy, D. Tufts
The analysis of signals that can be represented as a linear combination of exponentially damped sinusoids where the values of damping factors, frequencies, and the linear combination coefficients change at certain transition times is considered. These transitions represent the opening and closing of the glottis in the case of speech signals. Techniques are presented for the accurate estimation of the exponential parameters and the times of transition, from noise corrupted observations of the signal. The exponential parameters are obtained by improved linear prediction techniques using low-rank approximations, and further refined by an iterative least-squares technique with stability constraints imposed on the damping factors. Optimal estimates (in the least-squares sense) of the time of transition are presented. Our knowledge of the signal structure is used to obtain improved performance and also a computationally efficient estimation algorithm. Experiments with real, connected speech indicate that the speech waveforms can be accurately represented from a small number of parameters using the analysis presented here.
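The signal model itself, a sum of exponentially damped sinusoids whose parameters switch at transition times, can be made concrete with a small synthesis sketch. It covers only the forward model, not the estimation techniques of the paper. Restarting the time index at each transition and the parameter layout (amplitude, damping, frequency, phase) are assumptions made for this example, and the function name `exponential_segments` is hypothetical.

```python
import math


def exponential_segments(segments):
    """Synthesize a piecewise sum of exponentially damped sinusoids.

    segments is a list of (length, components) pairs. Each component is
    (amplitude, damping, frequency, phase), with frequency in cycles per
    sample and damping in nepers per sample. The local time index n
    restarts at 0 at every transition.
    """
    samples = []
    for length, components in segments:
        for n in range(length):
            samples.append(
                sum(
                    amp * math.exp(-damp * n) * math.cos(2.0 * math.pi * freq * n + phase)
                    for amp, damp, freq, phase in components
                )
            )
    return samples


# Two segments with different parameter sets (for example, glottis closed, then open).
signal = exponential_segments([
    (40, [(1.0, 0.02, 0.05, 0.0), (0.5, 0.05, 0.12, 0.0)]),
    (24, [(0.8, 0.10, 0.04, 0.0)]),
])
print(len(signal))
```

Each segment is described by four numbers per component plus its length, which is the sense in which a waveform can be represented by a small number of parameters.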
Citations: 5