Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.859155
Jirí Navrátil, Jan Kleindienst, Stephane H Maes
This paper describes an implementation of the conversational speech biometrics approach to personal authentication in the telephony environment. An application-independent module combines a natural-language-enabled component for verbal verification and identification with an acoustic speaker recognition engine for voice-print analysis. The two are tied together by a dedicated verification/identification protocol that lets the application adapt the session to the dialog development and to the security level of the query or command. The results validate the feasibility and advantages of integrating speaker and speech recognition technology. Users familiar with the system can log in with 2.7% or 3.2% false rejection and approximately $3\times10^{-11}$% or $10^{-6}$% false acceptance rates in about 40 s or 20 s, respectively. This makes speaker recognition deployable for high-security applications for the first time, even with today's technology, a claim that cannot be made for other speaker recognition technology. The system has a client-server architecture and is suitable for various applications and platforms.
Title: An instantiable speech biometrics module with natural language interface: implementation in the telephony environment
Published in: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.861174
A. Abdel-Samad, A. Tewfik
We present a novel m-ary tree hierarchical search strategy for stationary radar target localization in the presence of white Gaussian noise. This is done in the context of a discretized version of the problem of optimal beamforming, or radar transmit and receive pattern design. We assume that the target is equally likely to be in any one of M discrete cells and that we have L observations at our disposal. We recursively divide the search cells into m groups until each group contains a single cell, thus creating an m-ary search tree of depth $\log_m M$. We then allocate the available L observations among the tree levels in a manner that maximizes the probability of correctly locating the target. We compare the performance of the novel search strategy with that of previous techniques and demonstrate its superior performance.
Title: Hierarchical radar target localization
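The recursive grouping described in this abstract can be illustrated with a small sketch. It assumes that M is an exact power of m, and the function names `tree_depth` and `search_path` are hypothetical. The allocation of the L observations across tree levels, which is the paper's actual contribution, is not reproduced here.

```python
def tree_depth(num_cells, m):
    """Depth log_m(M) of the m-ary search tree (assumes M is an exact power of m)."""
    depth, size = 0, 1
    while size < num_cells:
        size *= m
        depth += 1
    if size != num_cells:
        raise ValueError("this sketch assumes M is an exact power of m")
    return depth


def search_path(target_cell, num_cells, m):
    """Index (0..m-1) of the group that contains the target at each tree level."""
    path = []
    group_size = num_cells
    offset = target_cell
    for _ in range(tree_depth(num_cells, m)):
        group_size //= m                    # each level splits a group into m parts
        path.append(offset // group_size)   # which of the m subgroups holds the target
        offset %= group_size                # position of the target inside that subgroup
    return path
```

With M = 27 cells and m = 3, the tree has depth 3, and a noise-free search for cell 14 would descend through subgroups 1, 1 and 2, which are the base-3 digits of 14.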
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.861153
T. Esmailian, P. Gulak, F. Kschischang
In-building power lines have been considered as a medium for high-speed data transmission in applications such as home networking and Internet access. The frequency selectivity and time variation of this medium, together with its high level of narrow-band and impulsive interference, make multi-carrier modulation, and especially its popular variant discrete multitone (DMT), an attractive modulation candidate for this application. This paper presents the results of our measurements of the high-frequency characteristics of ordinary in-building power lines, as well as simulation results of a DMT transceiver system in an in-building power line environment.
Title: A discrete multitone power line communications system
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.860168
P. Bonato, Z. Erim
Identifying the timing of the discharges of groups of muscle fibers (motor units) is of utmost importance both in research into the strategies the central nervous system employs to produce muscle force and in the clinical diagnosis of neuromuscular diseases. The process involves recognizing the unique shapes (action potentials) contributed by different motor units at random times throughout a muscle contraction. This paper addresses a specific aspect of the identification process: the decomposition of the compound signal when the action potentials of two or more motor units are superimposed. We propose a cross-time-frequency-based procedure to identify which two waveforms (out of a previously identified collection) are included in a superposition. The procedure also determines the relative delay of the two waveforms.
Title: Decomposition of superimposed waveforms using the cross time frequency transform
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.861979
D. Joachim, J. Deller
Optimal bounding ellipsoid (OBE) identification algorithms are noted for their simplicity and ability to leverage model error-bound knowledge for improved parameter convergence. However, the OBE convergence rate is dependent on the pointwise "tightness" of the model error-bound estimates. Since the least upper bound on the model error is often unknown, the convergence rate is compromised by the need to overestimate error-bounds lest the integrity of the process be violated by underestimation. We present an effective under-bounding safeguard against system model violations in OBE processing. Simulation examples in state estimation and speech processing demonstrate the efficacy of the under-bounding safeguard.
Title: Adaptive optimal bounded-ellipsoid identification with an error under-bounding safeguard: applications in state estimation and speech processing
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.859104
Hossein Najaf-Zadeh, P. Kabal
In this work we consider adaptive bit allocation for perceptual coding of narrowband audio signals at low rates (down to 8 kbit/s). Two different strategies are used to shape the audible noise spectrum. In one approach, the quantization noise spectrum is shaped in parallel with the masking threshold curve. This way the noise is equally audible in different frequency bands. The other approach generates a flat noise spectrum above the masking threshold. The noise power is not equally distributed over the frequency range, hence it is audible to various extents at different frequencies.
Title: Perceptual bit allocation for low rate coding of narrowband audio
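The difference between the two noise-shaping strategies in this abstract can be seen by comparing the noise-to-mask ratio (NMR) in each band. The sketch below is only an illustration: the per-band masking thresholds, the dB levels and the function names are hypothetical, and it does not perform the bit allocation itself.

```python
def parallel_noise(mask_db, offset_db):
    """Noise spectrum that runs parallel to the masking threshold (constant offset in dB)."""
    return [threshold + offset_db for threshold in mask_db]


def flat_noise(mask_db, level_db):
    """Flat noise spectrum: the same noise level in every band."""
    return [level_db for _ in mask_db]


def noise_to_mask(noise_db, mask_db):
    """Noise-to-mask ratio per band in dB; larger values mean more audible noise."""
    return [noise - threshold for noise, threshold in zip(noise_db, mask_db)]


# Hypothetical masking thresholds (dB) for three frequency bands.
mask = [30.0, 24.0, 18.0]
nmr_parallel = noise_to_mask(parallel_noise(mask, 6.0), mask)
nmr_flat = noise_to_mask(flat_noise(mask, 36.0), mask)
```

Under these assumed values the parallel strategy yields the same NMR in every band, so the noise is equally audible everywhere, whereas the flat strategy yields an NMR that grows as the masking threshold falls.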
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.860144
G. Rovithakis, M. Maniadakis, M. Zervakis
This paper addresses the use of neural network structures for feature extraction and classification. More precisely, a nonlinear filter based on higher order neural networks (HONN), whose weights are updated by stable learning laws, is used to extract the characteristic features of fluorescence spectra corresponding to human tissue samples in different states. The features are then classified with a multi-layer perceptron (MLP). The high success rates, together with the short time needed to analyze the signals, make our method very attractive for real-time applications.
Title: Artificial neural networks for feature extraction and classification of vascular tissue fluorescence spectrums
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.861879
V. Melnik, K. Egiazarian, I. Shmulevich, P. Kuosmanen
We propose to generate a library of median pyramidal decompositions organized as a tree. It is shown that the best decomposition on this tree can be chosen using the techniques developed in wavelet theory. Based on the tree of decompositions, a denoising algorithm is introduced. Numerical simulations have shown that the proposed algorithm is more effective for signal denoising than the method based on the traditional median pyramidal transform.
Title: A tree of median pyramidal decompositions with an application to signal denoising
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.861925
L. Ng, G. Burnett, J. Holzrichter, T. Gable
Low-power, radar-like EM sensors have made it possible to measure properties of the human speech production system in real time, without acoustic interference. This greatly enhances the quality and quantity of information available to many speech-related applications (see Holzrichter, Burnett, Ng, and Lea, J. Acoust. Soc. Am. 103 (1) 622 (1998)). By combining glottal EM sensor signals with acoustic signals, segments of voiced speech, unvoiced speech, and no speech can be reliably defined. Real-time de-noising filters can then be constructed to remove noise from the user's corresponding speech signal.
Title: Denoising of human speech using combined acoustic and EM sensor signal processing
Pub Date: 2000-06-05
DOI: 10.1109/ICASSP.2000.862055
A. Asif, José M. F. Moura
We investigate the properties of block matrices with block banded inverses to derive efficient matrix inversion algorithms for such matrices. In particular, we derive the following: (1) a recursive algorithm to invert a full matrix whose inverse is structured as a block tridiagonal matrix; (2) a recursive algorithm to compute the inverse of a structured block tridiagonal matrix. These algorithms are exact. Compared with direct inversion of the associated matrices, they reduce the computational complexity by two and one orders of magnitude, respectively. We apply these algorithms to develop a computationally efficient approximate implementation of the Kalman-Bucy filter (KBf) that we refer to as the local KBf. The computational effort of the local KBf is reduced by a factor of $I^2$ relative to the exact KBf while exhibiting near-optimal performance.
Title: Inversion of block matrices with block banded inverses: application to Kalman-Bucy filtering
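The structure exploited in this abstract, a full matrix whose inverse is banded, can be checked on a small scalar example (block size 1). The sketch below assumes a first-order Gauss-Markov covariance with entries $\rho^{|i-j|}$, whose inverse is known to be tridiagonal. The function names are hypothetical, and `invert` is a plain Gauss-Jordan baseline used only for verification, not the recursive algorithm derived in the paper.

```python
from fractions import Fraction


def markov_covariance(n, rho):
    """Full n x n matrix with entries rho**|i-j| (first-order Gauss-Markov covariance)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def invert(matrix):
    """Exact Gauss-Jordan inversion: a direct baseline, not the paper's recursion."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick a nonzero pivot (assumes the matrix is nonsingular) and normalize its row.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        # Eliminate the pivot column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def is_tridiagonal(matrix):
    """True when every entry more than one position off the diagonal is exactly zero."""
    n = len(matrix)
    return all(matrix[i][j] == 0
               for i in range(n) for j in range(n) if abs(i - j) > 1)


full = markov_covariance(5, Fraction(1, 2))
inverse = invert(full)
```

Exact rational arithmetic is used so that the zero pattern of the inverse can be tested without a floating-point tolerance: `is_tridiagonal(full)` is false while `is_tridiagonal(inverse)` is true.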