Title: Application of basis pursuit in spectrum estimation
Pub Date: 1998-05-15 | DOI: 10.1109/ICASSP.1998.681827
Venue: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181)
S. Chen, D. Donoho
We apply basis pursuit, an atomic decomposition technique, to spectrum estimation. Compared with several modern time series methods, our approach can greatly reduce the problem of power leakage; it is able to superresolve; moreover, it works well with noisy and unevenly sampled signals. We present experiments on bizarrely spaced radial velocity data from one of the newly discovered extrasolar planetary systems.
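As a rough illustration of the idea in this abstract, the sketch below estimates a sparse spectrum from unevenly sampled data. It is not the authors' implementation: it assumes a cosine/sine dictionary on a hypothetical frequency grid, uses the noise-tolerant variant (basis pursuit denoising), and solves it with plain ISTA (iterative soft thresholding) rather than a linear-programming solver. The names `bp_spectrum`, `lam`, and `n_iter` are illustrative.

```python
import numpy as np

def bp_spectrum(t, y, freqs, lam=0.05, n_iter=2000):
    """Sparse spectrum via basis pursuit denoising, solved with ISTA.

    Minimizes 0.5*||y - A c||^2 + lam*||c||_1, where the columns of A are
    cosine and sine atoms evaluated at the (possibly uneven) sample times t.
    """
    phase = 2.0 * np.pi * np.outer(t, freqs)
    A = np.hstack([np.cos(phase), np.sin(phase)])
    A /= np.linalg.norm(A, axis=0)             # unit-norm atoms
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = c - step * (A.T @ (A @ c - y))     # gradient step on the quadratic term
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    k = len(freqs)
    return np.hypot(c[:k], c[k:])              # amplitude per grid frequency

# Toy data: one sinusoid at 0.5 cycles per unit time, sampled at random times.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 20.0, 60))
y = np.cos(2.0 * np.pi * 0.5 * t)
freqs = np.linspace(0.05, 2.0, 40)
amp = bp_spectrum(t, y, freqs)
peak_freq = freqs[int(np.argmax(amp))]
```

The L1 penalty is what discourages leakage: energy is pushed onto as few atoms as possible instead of being smeared across neighbouring frequencies. The grid spacing and `lam` are tuning choices made up for this toy case.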
Title: Extraction of detailed image regions for content-based image retrieval
Pub Date: 1998-05-15 | DOI: 10.1109/ICASSP.1998.679690 | Venue: ICASSP '98
D. Androutsos, K. Plataniotis, A. Venetsanopoulos
We present a technique for coarsely extracting the regions of natural color images which contain directional detail, e.g., edges and texture, which we then use for image database indexing. As a measure of color activity, we use a perceptually modified distance measure based on the sum-of-angles criterion. We then apply histogram thresholding techniques to separate the image into smooth color regions and busy regions where edge, texture and color activity exist. Database indices are then created from the busy regions using the directional detail histogram technique, and retrieval is performed using these.
Title: Improving vocabulary independent HMM decoding results by using the dynamically expanding context
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.675394 | Venue: ICASSP '98
M. Kurimo
A method is presented to correct phoneme strings produced by a vocabulary independent speech recognizer. The method first extracts the N best matching result strings using mixture density hidden Markov models (HMMs) trained by neural networks. Then the strings are corrected by rules generated automatically by the dynamically expanding context (DEC). Finally, the corrected string candidates and the extra alternatives proposed by the DEC are ranked according to the likelihood score of the best HMM path that generates each string. The experiments show that N need not be very large and that the method is able to decrease recognition errors even on test data that shares no words with the training data of the speech recognizer.
Title: Improved robustness for speech recognition under noisy conditions using correlated parallel model combination
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.674490 | Venue: ICASSP '98
J. Hung, Jia-Lin Shen, Lin-Shan Lee
The parallel model combination (PMC) technique has been shown to achieve very good performance for speech recognition under noisy conditions. In this approach, the speech signal and the noise are assumed uncorrelated during modeling. A new correlated PMC is proposed by properly estimating and modeling the nonzero correlation between the speech signal and the noise. Preliminary experimental results show that this correlated PMC can provide significant improvements over the original PMC in terms of both the model differences and the recognition accuracies. An error rate reduction on the order of 14% can be achieved.
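The effect of dropping the uncorrelatedness assumption can be seen from the second-order statistics of a sum. The lines below are a generic identity for an additive combination in the linear spectral domain, not the paper's estimator; the symbols $g$ (a gain-matching factor) and $\Sigma_{sn}$ (speech-noise cross-covariance) are notation assumed here for illustration.

```latex
% Noisy observation in the linear spectral domain: y = s + g n
\mu_y = \mu_s + g\,\mu_n
% Uncorrelated speech and noise (the original PMC assumption):
\Sigma_y = \Sigma_s + g^{2}\,\Sigma_n
% Nonzero cross-covariance \Sigma_{sn} = \operatorname{Cov}(s, n):
\Sigma_y = \Sigma_s + g^{2}\,\Sigma_n + g\left(\Sigma_{sn} + \Sigma_{sn}^{\top}\right)
```

The mean is unaffected by correlation; only the covariance gains the cross terms, which vanish when $\Sigma_{sn} = 0$ and recover the uncorrelated case.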
Title: Topic adaptation for language modeling using unnormalized exponential models
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.675356 | Venue: ICASSP '98
Stanley F. Chen, K. Seymore, R. Rosenfeld
We present novel techniques for performing topic adaptation on an n-gram language model. Given training text labeled with topic information, we automatically identify the most relevant topics for new text. We adapt our language model toward these topics using an exponential model, by adjusting the probabilities in our model to agree with those found in the topical subset of the training data. For efficiency, we do not normalize the model; that is, we do not require that the "probabilities" in the language model sum to 1. With these techniques, we were able to achieve a modest reduction in speech recognition word-error rate in the broadcast news domain.
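To make the "unnormalized" point concrete, here is a minimal sketch that assumes the adapted score is the base probability multiplied by the exponential of the weights of the active topic features, with no division by a normalizer. The function `adapted_score`, the toy vocabulary, and the weight values are hypothetical; how the weights are fitted is not described in the abstract and is not shown.

```python
import math

def adapted_score(p_base, active_weights):
    """Scale a base n-gram probability by exp(sum of active feature weights).

    No normalizing constant Z(history) is computed, so the results are
    scores rather than true probabilities.
    """
    return p_base * math.exp(sum(active_weights))

# Toy next-word distribution for one history (values are made up).
base = {"stocks": 0.2, "rain": 0.3, "the": 0.5}
# A single topic feature that doubles the score of "stocks".
topic_weights = {"stocks": [math.log(2.0)]}

scores = {w: adapted_score(p, topic_weights.get(w, [])) for w, p in base.items()}
total = sum(scores.values())
```

In this toy case the scores sum to 1.2 rather than 1. Skipping the normalizer avoids a sum over the whole vocabulary for every history, which is the efficiency argument the abstract makes.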
Title: A comparison of a priori threshold setting procedures for speaker verification in the CAVE project
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.674383 | Venue: ICASSP '98
Jean-Benoît Pierrot, J. Lindberg, J. Koolwaaij, H. Hutter, D. Genoud, M. Blomberg, F. Bimbot
The issue of a priori threshold setting in speaker verification is a key problem for field applications. In the context of the Caller Verification in Banking and Telecommunications (CAVE) project, we compared several methods for estimating speaker-independent and speaker-dependent decision thresholds. Relevant parameters are estimated from development data only, i.e. without resorting to additional client data. The various approaches are tested on the Dutch SESP database.
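One common way to fix a speaker-independent threshold before deployment is to pick the score at which false rejections and false acceptances balance on pooled development data. The sketch below shows that idea only; the abstract does not say which procedures were compared, and the function name and toy scores are hypothetical.

```python
def eer_threshold(client_scores, impostor_scores):
    """Return the candidate threshold where the false-rejection rate and the
    false-acceptance rate are closest on development scores.

    A trial is accepted when its score is >= the threshold.
    """
    best_gap, best_th = None, None
    for th in sorted(set(client_scores) | set(impostor_scores)):
        frr = sum(s < th for s in client_scores) / len(client_scores)
        far = sum(s >= th for s in impostor_scores) / len(impostor_scores)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, best_th = gap, th
    return best_th

# Toy development scores (made up): higher means more client-like.
dev_clients = [2.0, 2.5, 3.0, 3.5]
dev_impostors = [0.0, 0.5, 1.0, 1.5]
threshold = eer_threshold(dev_clients, dev_impostors)
```

A speaker-dependent variant would estimate a separate threshold per client, for instance by shifting this value with statistics of that client's model; that step is left out here.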
Title: Improvements in children's speech recognition performance
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.674460 | Venue: ICASSP '98
S. Das, D. Nix, M. Picheny
There are several reasons why conventional speech recognition systems modeled on adult data fail to perform satisfactorily on children's speech input. For instance, children's vocal characteristics differ significantly from those of adults. In addition, their choices of vocabulary and sentence construction modalities usually do not conform to adult patterns. We describe comparative studies demonstrating the performance gain realized by adapting to children's acoustic and language model data to construct a children's speech recognition system.
Title: A new general distance measure for quantization of LSF and their transformed coefficients
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.674363 | Venue: ICASSP '98
H. Vu, László Lois
We have developed a new general distance measure that can be used not only in vector quantization (VQ) of the line spectrum frequency (LSF) parameters but also performs well in the LSF transformed domain. The new distance is based on the spectral sensitivity of the LSFs and their transformed coefficients. In addition, a fixed scaling factor is used to decrease the sensitivity of the spectral error at higher frequencies. Experimental results have shown that the proposed distance measure leads to VQ performance as good as or better than that of other methods in the field of LSF coding. The use of this distance as the weighting function of the LSFs' transformed parameters is also suggested.
Title: A hybrid equalizer merging the advantages of Baud spaced and fractionally spaced equalizers
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.679584 | Venue: ICASSP '98
Christian Lütkemeyer, Hans-Martin Blüthgen, T. Noll
A transversal equalizer with half Baud spaced taps in the center, extended with Baud spaced taps on both sides, is presented. This hybrid equalizer combines the benefits of Baud spaced equalizers (such as superior equalization of notches in the middle of the transmission band) and of fractionally spaced equalizers, which perform better when equalizing asymmetric notches in the slope of the transmission band, given the same number of coefficients. The hybrid equalizer offers a reduced sensitivity to sampling time changes and, like the fractionally spaced equalizer, the ability to model the matched filter in the receiver. The problem of tap-wandering, which is present in fractionally spaced equalizers, is reduced due to the reduced degree of freedom in the coefficient adjustment.
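The tap arrangement described in this abstract can be pictured as a list of delays on a half-symbol grid. The sketch below is an illustration under assumed conventions, not the authors' design: delays are counted in units of T/2, the input is sampled at twice the symbol rate, and the names `hybrid_tap_delays` and `equalize` are hypothetical.

```python
def hybrid_tap_delays(n_center, n_side):
    """Tap delays in units of half a symbol period (T/2).

    n_center taps (at least 1) are spaced T/2 apart in the middle;
    n_side taps spaced T apart are appended on each side.
    """
    center = list(range(n_center))                          # steps of T/2
    left = [-2 * k for k in range(n_side, 0, -1)]           # steps of T before the center
    right = [center[-1] + 2 * k for k in range(1, n_side + 1)]
    offset = -left[0] if left else 0                        # make the first delay zero
    return [d + offset for d in left + center + right]

def equalize(x_half, coeffs, delays, n):
    """Transversal filter output at half-symbol sample index n.

    Requires n >= max(delays) so that every tap reads a valid sample.
    """
    return sum(c * x_half[n - d] for c, d in zip(coeffs, delays))

delays = hybrid_tap_delays(n_center=3, n_side=2)
y10 = equalize(list(range(20)), [1] * len(delays), delays, n=10)
```

With three center taps and two side taps per side, the delays are 0, 2, 4, 5, 6, 8, 10: the same seven coefficients span a longer time window than a purely T/2-spaced filter would, while the fine spacing is kept only around the center.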
Title: Vector-sensor array processing for estimating angles and times of arrival of multipath communication signals
Pub Date: 1998-05-12 | DOI: 10.1109/ICASSP.1998.679576 | Venue: ICASSP '98
Peng-Huat Chua, C. See, A. Nehorai
We develop vector-sensor array processing to estimate the angles-of-arrival (AOAs) and time delays of multipath channels in the space-time-polarization domain. A MUSIC-type algorithm for joint angle and delay estimation with a vector-sensor array is derived. Potential applications include multipath channel estimation and mobile localization. Simulation results show that the space-time-polarization parameterization of the multipath channels results in improved accuracy and resolution performance.
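For readers unfamiliar with MUSIC, the sketch below shows the basic subspace search in its simplest setting: angle only, on a half-wavelength uniform linear array of scalar sensors. It is a deliberately reduced stand-in and does not implement the joint angle-delay, vector-sensor algorithm of the paper; the array geometry, noise level, and all names are assumptions made for the example.

```python
import numpy as np

def music_spectrum(X, n_sources, angles_deg):
    """MUSIC pseudospectrum for a half-wavelength uniform linear array.

    X has shape (n_sensors, n_snapshots) and holds complex snapshots.
    """
    m = X.shape[0]
    R = X @ X.conj().T / X.shape[1]              # sample covariance matrix
    _, vecs = np.linalg.eigh(R)                  # eigenvalues in ascending order
    En = vecs[:, : m - n_sources]                # noise-subspace eigenvectors
    theta = np.deg2rad(angles_deg)
    A = np.exp(-1j * np.pi * np.outer(np.arange(m), np.sin(theta)))  # steering vectors
    return 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)

# Toy scenario: one source at 20 degrees, 8 sensors, 200 snapshots.
rng = np.random.default_rng(1)
m, n = 8, 200
a = np.exp(-1j * np.pi * np.arange(m) * np.sin(np.deg2rad(20.0)))
s = rng.standard_normal(n) + 1j * rng.standard_normal(n)
noise = 0.1 * (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
X = np.outer(a, s) + noise
grid = np.arange(-90.0, 91.0, 1.0)
doa_estimate = grid[np.argmax(music_spectrum(X, 1, grid))]
```

A joint estimator of the kind the abstract describes would replace the steering vector with one that also depends on delay and polarization and would search over the larger parameter grid; the noise-subspace projection step stays the same.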