ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing: Latest Articles

Modeling spectral speech transitions using temporal decomposition techniques

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169742

G. Ahlbom, F. Bimbot, G. Chollet

Atal [1] introduced a technique for decomposing speech into phone-length temporal events in terms of overlapping and interacting articulatory gestures. This paper reports on simplifications of this technique with applications to acoustic-phonetic synthesis. Spectral evolution is represented by time-indexed trajectories in the p-dimensional space of Log-Area Ratios y_{i} = ln((1+k_{i})/(1-k_{i})), where k_{i} are the reflection coefficients obtained from short-time stationary LPC analysis. The vocal tract configuration (spectral vector) associated with each interpolation function belongs to a finite set of articulatory targets (vector quantization code book). A set of speech segments ("polysons") has been encoded using this technique. It includes diphones, demi-syllables, and other units that are difficult to segment. Temporal decomposition using target spectra can break the complex encoding of these segments. In particular, coarticulation effects are analytically explained and modeled. It is demonstrated that these new tools provide an adequate environment in our search for better rules in acoustic speech synthesis.
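The Log-Area Ratio mapping quoted in the abstract can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the paper; the inverse mapping k_{i} = tanh(y_{i}/2) follows algebraically from the formula above.

```python
import math

def log_area_ratios(reflection_coeffs):
    """Map reflection coefficients k_i (each strictly between -1 and 1)
    to log-area ratios y_i = ln((1 + k_i) / (1 - k_i))."""
    lars = []
    for k in reflection_coeffs:
        if not -1.0 < k < 1.0:
            raise ValueError("reflection coefficient must lie strictly between -1 and 1")
        lars.append(math.log((1.0 + k) / (1.0 - k)))
    return lars

def reflection_from_lars(lars):
    """Inverse mapping: k_i = tanh(y_i / 2)."""
    return [math.tanh(y / 2.0) for y in lars]
```

Because the mapping is invertible, a trajectory in Log-Area Ratio space can be converted back to reflection coefficients for LPC synthesis.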

Citations: 22

A speaker-stress resistant HMM isolated word recognizer

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169551

D. Paul

Most current speech recognition systems are sensitive to variations in speaker style. This paper reports the result of an effort to make a Hidden Markov Model (HMM) Isolated Word Recognizer (IWR) tolerant to the speech changes caused by speaker stress. More than an order-of-magnitude reduction of the error rate was achieved for a 105-word simulated stress database, and a 0% error rate was achieved for the TI 20 isolated word database.

Citations: 59

Serial/Parallel architectures for area-efficient vector multiplication

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169690

Stewart Smith, P. Denyer

In digital signal processing, standard-part multiply/accumulators are often used to compute vector products. In the realm of custom VLSI, direct computation of vector products can result in area savings over classical multiply/accumulate methods. A methodology is presented for composing VLSI architectures for direct vector multiplication, based on three fundamental computational elements: the register, the data selector, and the carry-save add-shift (CSAS) computer. The CSAS computer is a linear array of gated carry-save adders which performs shifting accumulation of partial results. Two's complement serial/parallel carry-save accumulation provides performance, while the use of symmetric-coded distributed arithmetic eliminates redundant computation to save area.

Citations: 4

Non-linear adaptive signal processor

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169602

M. Lagunas, F. Vallverdú, M. Santamaria

This paper is a first attempt to formalize non-linear system design and to locate it in the context of related linear processing techniques. The paper opens with a summary of the relationship between linear objectives and classical adaptive algorithms in non-linear design problems, and points out the potential of random search techniques for opening up the different problems with non-linear objectives that they could handle. Next, the similarity between probability distribution functions and power spectral density in linear processing is shown, supported by an example of non-linear system design. Finally, some prospective work on the problem of adaptive companding design is reported.

Citations: 0

Suppression and detection of impulse type interference using adaptive median hybrid filters

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169749

A. Nieminen, P. Heinonen, Y. Neuvo

In this paper, we introduce a new type of nonlinear filter, the Adaptive Median Hybrid (AMH) filter, for the suppression and detection of short duration interferences. In the AMH filters, adaptive filter substructures are used to estimate the current signal value from the future and past signal values. The output of the overall filter is the median of the adaptive filter outputs and the current signal value. This kind of nonlinear filter structure is shown to adapt to and preserve rapid changes in signal characteristics well, while filtering out short duration interferences. By examining the difference between the original and filtered data, interferences can be detected. We introduce two types of AMH filters, the AMH filter with separate adaptive substructures (SAMH) and the AMH filter with coupled substructures (CAMH), which have different convergence properties and implementations. We use both synthetic and real data (speech and electroencephalogram (EEG)) to show the applicability of the proposed filters.
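The median-of-three structure described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' AMH filter: the adaptive substructures are replaced here by fixed moving averages over past and future samples, which gives a plain FIR median hybrid filter. The function names, the window length `m`, and the detection threshold are all assumptions made for the example.

```python
from statistics import median

def median_hybrid_filter(x, m=3):
    """Output the median of (mean of m past samples, current sample,
    mean of m future samples). Samples within m of either end are
    passed through unchanged."""
    y = list(x)
    for n in range(m, len(x) - m):
        past = sum(x[n - m:n]) / m            # backward estimate of x[n]
        future = sum(x[n + 1:n + 1 + m]) / m  # forward estimate of x[n]
        y[n] = median([past, x[n], future])
    return y

def detect_impulses(x, y, threshold):
    """Flag indices where the original and filtered signals differ by
    more than the threshold."""
    return [n for n, (a, b) in enumerate(zip(x, y)) if abs(a - b) > threshold]
```

With these fixed substructures, an isolated impulse is replaced by one of the two neighbourhood estimates, while a step edge passes through unchanged because the current sample agrees with one of the estimates.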

Citations: 14

Reconstructing a finite length sequence from several of its correlation lags

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169415

A. Steinhardt

In this paper we present an algorithm which answers the following question: Given a finite number of correlation lags, what is the shortest length sequence which could have produced these correlations? This question is equivalent to asking for the minimum order moving average (all-zero) model which can match a given set of correlations. The algorithm applies to both the case of uniform correlations and missing lag correlations. The algorithm involves quadratic programming coupled with a new representation of the boundary of correlations derived from finite sequences in terms of the spectral decomposition of a certain class of banded Toeplitz matrices.
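As background for the question posed above, the following minimal sketch computes the forward mapping only, from a finite real sequence to its correlation lags. It is not the reconstruction algorithm of the paper, and the function name is hypothetical. It shows why a sequence of length N can have non-zero correlations only at lags 0 to N-1, which is the link to a moving average (all-zero) model of order N-1.

```python
def correlation_lags(seq):
    """Return the deterministic autocorrelation r[lag] = sum_i seq[i] * seq[i + lag]
    for lag = 0 .. len(seq) - 1. All higher lags are zero for a finite sequence."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag)) for lag in range(n)]
```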

Citations: 2

A data-driven organization of the dynamic programming beam search for continuous speech recognition

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169844

H. Ney, D. Mergel, A. Noll, A. Paeseler

This paper describes a data-driven organization of the dynamic programming beam search for large vocabulary, continuous speech recognition. This organization can be viewed as an extension of the one-pass dynamic programming algorithm for connected word recognition. In continuous speech recognition we are faced with a huge search space, and search hypotheses have to be formed at the 10-ms level. The organization of the search presented has the following characteristics. Its computational cost is proportional only to the number of hypotheses actually generated and is independent of the overall size of the potential search space. There is no limit on the number of word hypotheses; there is only a limit on the overall number of hypotheses due to memory constraints. The implementation of the search has been studied and tested on a continuous speech database comprising 20672 words.
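The pruning step at the heart of a beam search can be sketched generically as below. This is an assumption-laden illustration, not the data-driven organization of the paper: the function name, the dictionary representation of hypotheses, and the additive cost threshold `beam_width` are all hypothetical.

```python
def prune_beam(hypotheses, beam_width):
    """Keep only hypotheses whose accumulated cost lies within beam_width
    of the best (lowest) cost at the current time frame.

    hypotheses: dict mapping a state identifier to its accumulated cost.
    """
    if not hypotheses:
        return {}
    best = min(hypotheses.values())
    return {state: cost for state, cost in hypotheses.items()
            if cost <= best + beam_width}
```

Because only the surviving hypotheses are stored and extended at the next frame, the work per frame in such a scheme depends on the number of active hypotheses rather than on the size of the full search space.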

Citations: 129

Bit reverse unscrambling for a radix-2^M FFT

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169492

C. Burrus

The traditional Cooley-Tukey and the prime factor FFT algorithms either produce the output in scrambled order or require the input data to be prescrambled. Several methods for scrambling and unscrambling the DFT are presented. The new result in this paper is the observation that the radix-4, radix-8, or any radix-2^m FFT can be modified to give the output in the same bit-reversed order as the radix-2 FFT.
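The bit-reversed order of a radix-2 FFT mentioned above can be illustrated with a minimal sketch of the standard bit-reversal permutation. This is not the modified radix-2^m method of the paper; the function names are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def unscramble(data):
    """Reorder a power-of-two-length list from bit-reversed to natural order.
    The permutation is its own inverse, so the same call also scrambles."""
    n = len(data)
    bits = n.bit_length() - 1
    if n == 0 or 1 << bits != n:
        raise ValueError("length must be a power of two")
    return [data[bit_reverse(i, bits)] for i in range(n)]
```

For a length-8 transform, natural index 1 (binary 001) is paired with index 4 (binary 100), index 3 (011) with index 6 (110), and so on.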

Citations: 6

Generalized linear inversion applied to seismic data in one and two dimensions

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169379

J. Justice, S. Dougherty

Generalized linear inversion (GLI) is a parameter estimation technique which shows great potential for use in solving inverse problems in many fields, including exploration seismology. We suggest a particular implementation of the procedure which may be used for simultaneous parameter estimation, and illustrate its use with the 1-D seismic deconvolution problem. The procedure is easily extended to the multidimensional case, and we illustrate this extension by computing depth and velocity structure in a flat-layer model using multiple offset data.

Citations: 0

On the automatic segmentation of speech signals

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

Pub Date : 1987-04-06 DOI: 10.1109/ICASSP.1987.1169628

T. Svendsen, F. Soong

For large vocabulary and continuous speech recognition, the sub-word-unit-based approach is a viable alternative to the whole-word-unit-based approach. For preparing a large inventory of subword units, automatic segmentation is preferable to manual segmentation, as it substantially reduces the work associated with the generation of templates and gives more consistent results. In this paper we discuss some methods for automatically segmenting speech into phonetic units. Three different approaches are described: one based on template matching, one based on detecting the spectral changes that occur at the boundaries between phonetic units, and one based on a constrained-clustering vector quantization approach. An evaluation of the performance of the automatic segmentation methods is given.

Citations: 161
