A new method for blind source separation of nonstationary signals
Douglas L. Jones
Published in: 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP99)
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.761367
Many algorithms for blind source separation have been introduced in the past few years, most of which assume statistically stationary sources. In many applications, such as separation of speech or fading communications signals, the sources are nonstationary. We present a new adaptive algorithm for blind source separation of nonstationary signals that relies only on the nonstationary nature of the sources to achieve separation. The algorithm is an efficient, online, stochastic gradient update based on minimizing the average squared cross-output-channel correlations along with the deviation from unity average energy in each output channel. Advantages of this algorithm over existing methods include increased computational efficiency, a simple online adaptive implementation requiring only multiplications and additions, and the ability to blindly separate nonstationary sources regardless of their detailed statistical structure.
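The cost described in the abstract — average squared cross-output-channel correlations plus deviation from unit output energy — can be written per data block as the squared Frobenius distance between the output correlation matrix and the identity. The following is a minimal batch sketch of that idea in NumPy, not the authors' online update: the block partitioning, step size, and all names are illustrative assumptions. Splitting the data into short blocks is what exploits nonstationarity, since each block contributes an independent correlation condition.

```python
import numpy as np

def nonstat_bss(X, n_blocks=8, n_iter=400, mu=0.01):
    """Gradient descent on sum_b ||C_b - I||_F^2, where C_b is the output
    correlation matrix of block b. Off-diagonal terms penalize cross-channel
    correlation; diagonal terms penalize deviation from unit average energy.
    Returns the unmixing matrix and the cost history."""
    m, _ = X.shape
    X = X / X.std(axis=1, keepdims=True)           # normalize for stability
    blocks = np.array_split(X, n_blocks, axis=1)
    W = np.eye(m)
    costs = []
    for _ in range(n_iter):
        G = np.zeros_like(W)
        J = 0.0
        for Xb in blocks:
            Yb = W @ Xb
            Cb = (Yb @ Yb.T) / Xb.shape[1]         # block output correlation
            E = Cb - np.eye(m)                     # cross-corr + energy error
            J += np.sum(E ** 2)
            G += E @ (Yb @ Xb.T) / Xb.shape[1]     # gradient of ||Cb - I||^2 / 4
        costs.append(J)
        W -= mu * G / n_blocks
    return W, costs
```

With sources whose powers vary differently across blocks (for instance, amplitude-modulated noise), minimizing this cost over all blocks simultaneously pushes the unmixing matrix toward the inverse of the mixing matrix up to permutation and scaling.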
Two spatio-temporal decorrelation learning algorithms and their application to multichannel blind deconvolution
Seungjin Choi, A. Cichocki, S. Amari
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.759932
We present and compare two different spatio-temporal decorrelation learning algorithms for updating the weights of a linear feedforward network with FIR synapses (a MIMO FIR filter). Both the standard gradient and the natural gradient are employed to derive the spatio-temporal decorrelation algorithms. The two algorithms are applied to the multichannel blind deconvolution task and their performance is compared. Rigorous derivations of the algorithms and computer simulation results are presented.
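For the instantaneous (zero-lag) special case, the natural-gradient decorrelation rule has the well-known form W ← W + μ(I − yyᵀ)W, whose stationary point is E[yyᵀ] = I. The sketch below illustrates only that spatial special case, not the paper's full spatio-temporal FIR/MIMO algorithms; the step size and names are assumptions.

```python
import numpy as np

def natgrad_decorrelate(X, mu=0.003):
    """Online natural-gradient spatial decorrelation: for each sample x,
    form y = W x and update W <- W + mu * (I - y y^T) W. At convergence
    the outputs satisfy E[y y^T] = I (spatially white)."""
    m = X.shape[0]
    W = np.eye(m)
    I = np.eye(m)
    for x in X.T:
        y = W @ x
        W += mu * (I - np.outer(y, y)) @ W
    return W
```

The natural gradient pre-multiplies the ordinary gradient direction by W, which makes the update equivariant: its convergence behavior does not depend on the conditioning of the mixing.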
A neural network for data association
M. Winter, G. Favier
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.759921
This paper presents a new neural solution to the data association problem. This problem, also known as the multidimensional assignment problem, arises in data fusion systems such as radar and sonar target tracking and robotic vision. Since it leads to an NP-complete combinatorial optimization problem, the optimal solution cannot be reached in acceptable computation time, and approximation methods such as Lagrangian relaxation are necessary. In this paper, we propose an alternative approach based on a Hopfield neural model. We show that it converges to a solution that respects the constraints of the association problem. Simulation results illustrate the behaviour of the proposed neural solution on an artificial association problem.
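The flavour of the Hopfield approach can be shown on the simplest two-dimensional assignment case: minimize an assignment cost under row/column sum-to-one constraints by gradient descent on a penalized energy with sigmoid-bounded neurons. This is a toy sketch, not the paper's multidimensional formulation; the penalty weight, gain, and iteration count are illustrative assumptions.

```python
import numpy as np

def hopfield_assign(cost, n_iter=4000, lr=0.02, penalty=4.0, gain=4.0):
    """Continuous Hopfield-style network for 2-D assignment.
    Energy: E = sum_ij c_ij v_ij + (penalty/2) * sum of squared row and
    column sum-to-one violations, with neurons v_ij = sigmoid(gain * u_ij).
    Descending E drives v toward a feasible low-cost assignment."""
    n = cost.shape[0]
    rng = np.random.default_rng(1)
    u = rng.normal(0.0, 0.01, size=(n, n))        # small random internal states
    for _ in range(n_iter):
        v = 1.0 / (1.0 + np.exp(-gain * u))       # activations in (0, 1)
        row_err = v.sum(axis=1, keepdims=True) - 1.0
        col_err = v.sum(axis=0, keepdims=True) - 1.0
        grad = cost + penalty * (row_err + col_err)   # dE/dv
        u -= lr * grad                            # descend the energy
    return 1.0 / (1.0 + np.exp(-gain * u))
```

The penalty terms play the role the abstract attributes to the constraints of the association problem: at convergence the activation matrix is approximately doubly stochastic, with the cost term breaking the tie in favour of the cheaper assignment.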
Wireless MPEG-4 video on Texas Instruments DSP chips
M. Budagavi, W. Heinzelman, Jennifer Webb, R. Talluri
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.758378
Technology has advanced in recent years to the point where multimedia communicators are beginning to emerge. These communicators are low-power, portable devices that can transmit and receive multimedia data over wireless networks. Because of the high computational complexity involved and the low-power constraint of wireless applications, these devices require processors that are powerful and at the same time very power-efficient. To facilitate interoperability, it is important that these devices use standardized compression and communication algorithms. As a first step toward implementing multimedia terminals, Texas Instruments (TI) has demonstrated real-time MPEG-4 video decoding (simple profile) on the TMS320C54x, TI's low-power, high-performance DSP chip. In addition, TI has outlined a system-level solution for transmitting video across wireless networks, including channel coding and communication protocols.
Subspace state space model identification for speech enhancement
É. Grivel, M. Gabrea, M. Najim
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.759787
This paper deals with Kalman filter-based enhancement of a speech signal contaminated by white noise, using a single-microphone system. The problem can be stated as a realization issue in the framework of identification. For this purpose we propose to identify the state-space model using non-iterative subspace algorithms based on orthogonal projections. Unlike estimate-maximize (EM)-based algorithms, this approach provides, in a single iteration from the noisy observations, the state-space model matrices and the covariance matrices needed to perform Kalman filtering. In addition, unlike existing methods, no voice activity detector is required. Both methods proposed here are compared with classical approaches.
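Once the state-space model is in hand, the enhancement step itself is a standard Kalman filter with the speech modeled as an AR process in companion form. The sketch below shows only that filtering step; the AR order, coefficients, and noise variances are stand-ins for the quantities the subspace identification would deliver.

```python
import numpy as np

def kalman_denoise(y, a, q, r):
    """Kalman smoothing-free filter for y_t = x_t + v_t, Var v = r, where
    x_t = a[0] x_{t-1} + ... + a[p-1] x_{t-p} + w_t, Var w = q.
    Returns the filtered estimate of the clean signal x."""
    p = len(a)
    # Companion-form state transition: state = [x_t, x_{t-1}, ..., x_{t-p+1}]
    F = np.zeros((p, p))
    F[0, :] = a
    F[1:, :-1] = np.eye(p - 1)
    H = np.zeros((1, p)); H[0, 0] = 1.0           # observe the first component
    Q = np.zeros((p, p)); Q[0, 0] = q             # process noise enters x_t only
    x = np.zeros(p)
    P = np.eye(p)
    out = []
    for yt in y:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update
        S = H @ P @ H.T + r                       # innovation variance
        K = (P @ H.T) / S                         # Kalman gain
        x = x + K[:, 0] * (yt - x[0])
        P = P - K @ H @ P
        out.append(x[0])
    return np.array(out)
```

With the model matrices estimated from the noisy observations, this filter trades off the AR prediction against each noisy sample in proportion to their variances, which is where the enhancement comes from.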
An 8 kbit/s ACELP coder with improved background noise performance
R. Hagen, E. Ekudden
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.758053
This paper describes an 8 kbit/s ACELP speech coder with high performance for both speech and non-speech signals such as background noise. While the traditional waveform-matching LPAS structure employed in many existing speech coders provides high quality for speech signals, it has significant performance limitations for, for example, background noise. The coder presented here employs a novel adaptive gain coding technique that uses energy matching in combination with a traditional waveform-matching criterion, providing high quality for both speech and background noise. The coder has a basic structure similar to that of the 7.4 kbit/s D-AMPS EFR coder, with a 10th-order LPC, a high-resolution adaptive codebook, and a four-pulse algebraic codebook. The performance for speech signals is equivalent to or better than that of state-of-the-art 8 kbit/s coders, while for background noise conditions the performance is significantly improved.
Beam-augmented space-time adaptive processing
Y. Seliktar, Douglas B. Williams, E. J. Holder
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.761356
Combined monostatic clutter (MSC) and terrain-scattered interference (TSI) pose a difficult challenge for adaptive radar processing. Mitigation techniques exist for each form of interference alone but are insufficient for their combined effects. Current approaches separate the problem into two stages in which TSI is suppressed first and then MSC. The problem with this cascade approach is that the MSC becomes corrupted during the initial TSI suppression stage. In this paper an innovative technique is introduced that achieves a significant improvement in cancellation performance for both MSC and TSI, even when the jammer appears in the mainbeam. The majority of the interference rejection, both TSI and MSC, is accomplished with an MSC filter, with further TSI suppression accomplished via an additional tapped reference beam. Simultaneous optimization of the MSC filter weights and the reference beam weights yields the desired processor. Performance results using Mountaintop data demonstrate the superiority of the proposed processor over existing processors.
Synthesis of array architectures for block matching motion estimation: design exploration using the tool DG2VHDL
J. Bonk, A. Stone, E. Manolakos
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.758301
We present a design case study using DG2VHDL, a tool that bridges the gap between an abstract graphical description of a DSP algorithm and its concrete hardware description language (HDL) representation. DG2VHDL automatically translates a dependence graph (DG) into a synthesizable, behavioral VHDL entity that can be input to industrial-strength behavioral compilers to produce silicon implementations of the algorithm (FPGAs, ASICs). Full-search block matching motion estimation was selected both for its current applications (MPEG, HDTV, video conferencing) and for the richness of the literature and architectural exploration over the last decade. We demonstrate that the behavioral VHDL code produced automatically by the tool leads, after behavioral synthesis, to an efficient distributed-memory-and-control modular array architecture, and we also provide comparative statistics for several new FS-BMA architectures derived for real-time motion estimation.
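Full-search block matching is the exhaustive reference algorithm behind the architectures explored above: each block of the current frame is compared, via the sum of absolute differences (SAD), against every candidate displacement within a search window of the reference frame. A plain software sketch follows (block size and search range are illustrative; the paper's contribution is the hardware mapping, not this algorithm itself):

```python
import numpy as np

def full_search_bm(ref, cur, block=8, search=4):
    """Exhaustive block-matching motion estimation: for each `block`-sized
    tile of `cur`, find the displacement (dy, dx) within +/-`search` whose
    corresponding tile of `ref` minimizes the SAD. Returns the motion field."""
    H, W = cur.shape
    mvs = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            blk = cur[by:by + block, bx:bx + block].astype(int)
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue                  # candidate outside the frame
                    cand = ref[y:y + block, x:x + block].astype(int)
                    sad = np.abs(cand - blk).sum()
                    if best is None or sad < best:
                        best = sad
                        mvs[by // block, bx // block] = (dy, dx)
    return mvs
```

The triple-nested regularity of this loop nest (blocks × displacements × pixels) is exactly what the dependence-graph representation captures and what makes the algorithm a natural candidate for systolic array synthesis.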
Towards a robust/fast continuous speech recognition system using a voiced-unvoiced decision
D. O'Shaughnessy, H. Tolba
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.758150
We show that voiced-unvoiced (V-U) classification of speech sounds can be incorporated not only in speech analysis or speech enhancement processes but can also be useful for recognition. That is, incorporating such a classification in a continuous speech recognition (CSR) system not only improves its performance in low-SNR environments but also reduces the time and memory needed to carry out the recognition process. The proposed V-U classification of speech sounds has two principal functions: (1) it allows the voiced and unvoiced parts of speech to be enhanced separately; (2) it limits the Viterbi (1967) search space, so that recognition can be carried out in real time without degrading the performance of the system. We show experimentally that such a system outperforms the baseline HTK when a V-U decision is included in both the front end and the far end of the HTK-based recognizer.
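A classical per-frame V-U decision of the kind the abstract builds on combines short-time energy with the zero-crossing rate: voiced frames are energetic with few zero crossings, unvoiced frames are noisy with many. A minimal sketch (the thresholds are illustrative assumptions and would be tuned on real speech, not the authors' classifier):

```python
import numpy as np

def vu_decision(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Classify one frame as voiced ('V'), unvoiced ('U'), or silence ('S')
    from short-time energy and zero-crossing rate (fraction of sample pairs
    whose signs differ). Voiced speech: high energy, low ZCR; unvoiced
    speech: moderate energy, high ZCR."""
    energy = np.mean(frame ** 2)
    if energy < energy_thresh:
        return 'S'
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return 'V' if zcr < zcr_thresh else 'U'
```

Running such a detector ahead of the recognizer is cheap (one multiply-accumulate pass per frame), which is consistent with the paper's goal of reducing recognition time rather than adding to it.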
DSPS education: an industry leader's experiences and expectations
C. Hewes, P. Rajasekaran
Pub Date: 1999-03-15 | DOI: 10.1109/ICASSP.1999.758329
Texas Instruments is the industry leader in providing digital signal processing solutions for a variety of system applications, including wireless communications, modems, hard disk drives, and many others. In this paper, the key roles of university research and education are described. The relationship of TI to the university community is reviewed. TI's expectations of university programs are also outlined.