1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258): Latest Articles

A new method for blind source separation of nonstationary signals
Douglas L. Jones
Many algorithms for blind source separation have been introduced in the past few years, most of which assume statistically stationary sources. In many applications, such as separation of speech or fading communications signals, the sources are nonstationary. We present a new adaptive algorithm for blind source separation of nonstationary signals that relies only on the nonstationary nature of the sources to achieve separation. The algorithm is an efficient, online, stochastic gradient update based on minimizing the average squared cross-output-channel correlations along with the deviation from unity average energy in each output channel. Advantages of this algorithm over existing methods include increased computational efficiency, a simple online, adaptive implementation requiring only multiplications and additions, and the ability to blindly separate nonstationary sources regardless of their detailed statistical structure.
DOI: https://doi.org/10.1109/ICASSP.1999.761367 (published 1999-03-15)
Citations: 10
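The cost described in the abstract of "A new method for blind source separation of nonstationary signals" can be illustrated with a small batch sketch. This is not the paper's online update: it assumes the criterion is the block-averaged sum of squared cross-channel output correlations plus the squared deviation of each channel's average output energy from one, and it takes a list of per-block input covariance estimates (`covs`) as given. The function name, the 2x2 example, and the step size are all hypothetical.

```python
import numpy as np

def cost_and_grad(W, covs):
    """Assumed decorrelation cost and its gradient for an unmixing matrix W.

    covs: list of input covariance matrices, one per time block.
    Outputs are y = W x, so the block output covariance is W R W^T.
    """
    R_bar = sum(covs) / len(covs)          # average input covariance
    cross = 0.0
    grad = np.zeros_like(W)
    for R in covs:
        C = W @ R @ W.T                    # output covariance in this block
        off = C - np.diag(np.diag(C))      # cross-channel correlations only
        cross += np.sum(off ** 2)
        grad += 4.0 * off @ W @ R          # gradient of sum of squared off-diagonals
    cross /= len(covs)
    grad /= len(covs)
    # Deviation of each channel's average output energy from unity.
    energy_dev = np.diag(W @ R_bar @ W.T) - 1.0
    cost = cross + np.sum(energy_dev ** 2)
    grad += 4.0 * np.diag(energy_dev) @ W @ R_bar
    return cost, grad

# Two blocks whose source energies differ (nonstationary), already separated.
covs = [np.diag([0.5, 1.5]), np.diag([1.5, 0.5])]
W = np.array([[1.0, 0.5], [0.2, 1.0]])    # hypothetical starting point
cost_before, g = cost_and_grad(W, covs)
cost_after, _ = cost_and_grad(W - 1e-3 * g, covs)  # one small gradient step
```

The changing per-block energies are what make the cross-correlation constraints informative here; with identical block covariances the sketch would reduce to ordinary whitening.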
Two spatio-temporal decorrelation learning algorithms and their application to multichannel blind deconvolution
Seungjin Choi, A. Cichocki, S. Amari
We present and compare two different spatio-temporal decorrelation learning algorithms for updating the weights of a linear feedforward network with FIR synapses (MIMO FIR filter). Both the standard gradient and the natural gradient are employed to derive the spatio-temporal decorrelation algorithms. These two algorithms are applied to a multichannel blind deconvolution task and their performance is compared. A rigorous derivation of the algorithms and computer simulation results are presented.
DOI: https://doi.org/10.1109/ICASSP.1999.759932 (published 1999-03-15)
Citations: 12
A neural network for data association
M. Winter, G. Favier
This paper presents a new neural solution to the data association problem. This problem, also known as the multidimensional assignment problem, arises in data fusion systems such as radar and sonar target tracking and robotic vision. Since it leads to an NP-complete combinatorial optimization, the optimal solution cannot be reached in an acceptable calculation time, and the use of approximation methods like Lagrangian relaxation is necessary. In this paper, we propose an alternative approach based on a Hopfield neural model. We show that it converges to an interesting solution that respects the constraints of the association problem. Some simulation results are presented to illustrate the behaviour of the proposed neural solution for an artificial association problem.
DOI: https://doi.org/10.1109/ICASSP.1999.759921 (published 1999-03-15)
Citations: 13
Wireless MPEG-4 video on Texas Instruments DSP chips
M. Budagavi, W. Heinzelman, Jennifer Webb, R. Talluri
Technology has advanced in recent years to the point where multimedia communicators are beginning to emerge. These communicators are low-power, portable devices that can transmit and receive multimedia data through the wireless network. Due to the high computational complexity involved and the low-power constraint in wireless applications, these devices require processors that are both powerful and very power-efficient. In order to facilitate interoperability, it is important that these devices use standardized compression and communication algorithms. As a first step in implementing multimedia terminals, Texas Instruments (TI) has demonstrated real-time MPEG-4 video decoding (simple profile) on the TMS320C54x, TI's low-power, high-performance DSP chip. In addition, TI has outlined a system-level solution to transmitting video across wireless networks, including channel coding and communication protocols.
DOI: https://doi.org/10.1109/ICASSP.1999.758378 (published 1999-03-15)
Citations: 15
Subspace state space model identification for speech enhancement
É. Grivel, M. Gabrea, M. Najim
This paper deals with Kalman filter-based enhancement of a speech signal contaminated by white noise, using a single-microphone system. Such a problem can be stated as a realization issue in the framework of identification. For this purpose we propose to identify the state space model by using subspace non-iterative algorithms based on orthogonal projections. Unlike estimate-maximize (EM)-based algorithms, this approach provides, in a single iteration from noisy observations, the matrices related to the state space model and the covariance matrices that are necessary to perform Kalman filtering. In addition, unlike existing methods, no voice activity detector is required. Both methods proposed here are compared with classical approaches.
DOI: https://doi.org/10.1109/ICASSP.1999.759787 (published 1999-03-15)
Citations: 10
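The abstract of "Subspace state space model identification for speech enhancement" says the identification stage supplies the state space matrices and covariance matrices needed for Kalman filtering. As a rough illustration of how such quantities would then be used, the sketch below shows one textbook predict/update step of a Kalman filter. It does not implement the subspace identification itself, and the names `A`, `C`, `Q`, `R` (state transition, observation, process noise covariance, measurement noise covariance) are assumptions rather than the paper's notation.

```python
import numpy as np

def kalman_step(x, P, y, A, C, Q, R):
    """One standard Kalman filter step for x[k+1] = A x[k] + w, y[k] = C x[k] + v."""
    # Predict the state and its error covariance.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update with the noisy observation y.
    S = C @ P_pred @ C.T + R                # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new

# Scalar toy case with hypothetical values: prior 0 with variance 1,
# unit measurement noise, observation 2.
one = np.array([[1.0]])
x_new, P_new = kalman_step(np.array([0.0]), one, np.array([2.0]),
                           A=one, C=one, Q=np.array([[0.0]]), R=one)
```

In the toy case the prior and the measurement are equally uncertain, so the estimate lands halfway between them and the error variance is halved.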
An 8 kbit/s ACELP coder with improved background noise performance
R. Hagen, E. Ekudden
This paper describes an 8 kbit/s ACELP speech coder with high performance for both speech and non-speech signals such as background noise. While the traditional waveform matching LPAS structure employed in many existing speech coders provides high quality for speech signals, it has significant performance limitations for other signals such as background noise. The coder presented here employs a novel adaptive gain coding technique that uses energy matching in combination with a traditional waveform matching criterion, providing high quality for both speech and background noise. The coder has a basic structure similar to that of the 7.4 kbit/s D-AMPS EFR coder, with a 10th-order LPC, a high-resolution adaptive codebook and a 4-pulse algebraic codebook. The performance for speech signals is equivalent to or better than that of state-of-the-art 8 kbit/s coders, while for background noise conditions the performance is significantly improved.
DOI: https://doi.org/10.1109/ICASSP.1999.758053 (published 1999-03-15)
Citations: 4
Beam-augmented space-time adaptive processing
Y. Seliktar, Douglas B. Williams, E. J. Holder
Combined monostatic clutter (MSC) and terrain scattered interference (TSI) pose a difficult challenge for adaptive radar processing. Mitigation techniques exist for each interference alone but are insufficient for their combined effects. Current approaches separate the problem into two stages where TSI is suppressed first and then MSC. The problem with this cascade approach is that during the initial TSI suppression stage, the MSC becomes corrupted. In this paper an innovative technique is introduced for achieving a significant improvement in cancellation performance for both MSC and TSI, even when the jammer appears in the mainbeam. The majority of the interference rejection, both TSI and MSC, is accomplished with an MSC filter, with further TSI suppression accomplished via an additional tapped reference beam. Simultaneous optimization of the MSC filter weights and reference beam weights yields the desired processor. Performance results using Mountain-top data demonstrate the superiority of the proposed processor over existing processors.
DOI: https://doi.org/10.1109/ICASSP.1999.761356 (published 1999-03-15)
Citations: 3
Synthesis of array architectures for block matching motion estimation: design exploration using the tool DG2VHDL
J. Bonk, A. Stone, E. Manolakos
We present a design case study using DG2VHDL, a tool that bridges the gap between an abstract graphical description of a DSP algorithm and its concrete hardware description language (HDL) representation. DG2VHDL automatically translates a dependence graph (DG) into a synthesizable, behavioral VHDL entity that can be input to industrial-strength behavioral compilers for producing silicon implementations of the algorithm (FPGAs, ASICs). Full-search block matching motion estimation was selected for its current applications (MPEG, HDTV, video conferencing) as well as for the richness of literature and architectural exploration over the last decade. We not only demonstrate that the behavioral VHDL code produced automatically by the tool leads, after behavioral synthesis, to an efficient distributed memory and control modular array architecture, but also provide comparative statistics for several new FS-BMA architectures derived for real-time motion estimation.
DOI: https://doi.org/10.1109/ICASSP.1999.758301 (published 1999-03-15)
Citations: 5
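For readers unfamiliar with the algorithm that the DG2VHDL case study maps to hardware, full-search block matching can be sketched in software as below. This is a plain reference version, not the array architecture from the paper: it assumes the sum of absolute differences (SAD) as the matching criterion, and the function name, block size, and search range are hypothetical.

```python
import numpy as np

def full_search(cur, ref, top, left, block=4, search=3):
    """Exhaustively search ref for the block of cur at (top, left).

    Returns the motion vector (dy, dx) with the smallest SAD and that SAD.
    """
    target = cur[top:top + block, left:left + block].astype(np.int64)
    best_mv, best_sad = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad

# Synthetic check: the current frame is the reference shifted down 2, left 1,
# so the block at (8, 8) should be found at offset (-2, +1) in the reference.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(16, 16))
cur = np.roll(ref, shift=(2, -1), axis=(0, 1))
mv, sad = full_search(cur, ref, top=8, left=8)
```

Every candidate in the search window is evaluated independently, which is the regularity that makes the algorithm a natural fit for array architectures.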
Towards a robust/fast continuous speech recognition system using a voiced-unvoiced decision
D. O'Shaughnessy, H. Tolba
We show that the concept of voiced-unvoiced (V-U) classification of speech sounds can be incorporated not only in speech analysis or speech enhancement processes, but also in recognition processes. That is, the incorporation of such a classification in a continuous speech recognition (CSR) system not only improves its performance in low SNR environments, but also limits the time and memory needed to carry out the recognition process. The proposed V-U classification of the speech sounds has two principal functions: (1) it allows the enhancement of the voiced and unvoiced parts of speech separately; (2) it limits the Viterbi (1967) search space, and consequently the recognition process can be carried out in real time without degrading the performance of the system. We show via experiments that such a system outperforms the baseline HTK when a V-U decision is included in both front- and far-end of the HTK-based recognizer.
DOI: https://doi.org/10.1109/ICASSP.1999.758150 (published 1999-03-15)
Citations: 18
DSPS education: an industry leader's experiences and expectations
C. Hewes, P. Rajasekaran
Texas Instruments is the industry leader in providing digital signal processing solutions to a variety of system applications including wireless communications, modems, hard disk drives, and many others. In this paper, the key roles of university research and education are described. The relationship of TI to the university community is reviewed. TI's expectations from university programs are also outlined.
DOI: https://doi.org/10.1109/ICASSP.1999.758329 (published 1999-03-15)
Citations: 0