Fast VQ of multi-tap pitch predictor coefficients
Published in: ICSP '98. 1998 Fourth International Conference on Signal Processing (Cat. No.98TH8344)
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770278
Zeng Zhihua, Xiao Zimei
Pitch predictors are important in achieving high-quality speech for linear prediction-based analysis-by-synthesis (LPAS) coders. Multi-tap pitch predictors with vector quantization (VQ) of the predictor coefficients have been adopted in a growing number of LPAS coders because they can provide high prediction gain. Higher-tap pitch predictors (more than three taps) perform better but are seldom used because vector quantizing the pitch predictor coefficients is computationally expensive. This paper proposes a new VQ scheme for multi-tap pitch predictors, the likelihood error criterion-based vector quantization (LEC-VQ). Experiments show that the proposed VQ method can efficiently reduce the computational complexity while maintaining high speech quality.
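To make the cost that motivates this work concrete, the following is a minimal sketch of a conventional exhaustive codebook search for a multi-tap pitch predictor. It is not the LEC-VQ scheme, whose details the abstract does not give. The function names, the toy codebook, and the assumption that the pitch lag exceeds the frame length plus half the tap count are all illustrative.

```python
def predict(history, frame_len, lag, taps):
    """Multi-tap pitch prediction: each predicted sample is a weighted sum of
    past samples centred one pitch lag back (assumes lag >= frame_len + len(taps) // 2)."""
    half = len(taps) // 2
    start = len(history)  # index of the first sample being predicted
    out = []
    for n in range(frame_len):
        centre = start + n - lag
        out.append(sum(b * history[centre + k - half] for k, b in enumerate(taps)))
    return out


def search_codebook(target, history, lag, codebook):
    """Exhaustive VQ search: try every coefficient vector and keep the one
    with the smallest squared prediction error."""
    best_index, best_error = -1, float("inf")
    for i, taps in enumerate(codebook):
        pred = predict(history, len(target), lag, taps)
        err = sum((t - p) ** 2 for t, p in zip(target, pred))
        if err < best_error:
            best_index, best_error = i, err
    return best_index, best_error


# Toy data (hypothetical): a signal that repeats exactly every 8 samples.
period = [0.0, 1.0, 0.5, -0.5, -1.0, -0.25, 0.25, 0.75]
history = period * 2
target = period[:4]
codebook = [(0.5, 0.0, 0.5), (0.0, 1.0, 0.0), (0.25, 0.5, 0.25)]

best_index, best_error = search_codebook(target, history, 8, codebook)
```

The search cost grows with the codebook size times the number of taps times the frame length, which is the burden a faster VQ criterion would aim to cut.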
A novel lip localization method based on shiftable wavelets transform
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770790
Xu Yanjun, Du Limin, Hou Ziqiang
Visual feature extraction is one of the most important techniques in audiovisual bimodal speech recognition, and remains a very challenging area in image understanding. A shiftable multiscale transform is introduced into the construction of an active shape model. It uses pyramidal data to describe the structure of an image; this description is invariant to illumination and perspective variability and thus substantially improves the robustness of the model. A segmental downhill simplex method is also put forward to improve the minimization procedure of lip localization. It employs a "coarse-to-fine" strategy to speed up convergence and improve the robustness of lip localization. Experiments support the validity of the new method and show better robustness and higher efficiency.
Hierarchical object-oriented video segmentation and representation algorithm
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770771
Jianping Fan, G. Fujita, Jun Yu, Koji Miyanohana, T. Onoye, N. Ishiura, Lide Wu, I. Shirakawa
In this paper, a novel object-oriented hierarchical video segmentation and representation algorithm is proposed based on a four-component video model, where the local variance contrast and the frame difference contrast are selected for generating the 2D spatiotemporal entropy. The extracted object is first represented coarsely by a group of 4×4 blocks; intra-block edge extraction on edge blocks and a joint spatiotemporal similarity test among neighboring blocks are then performed to determine meaningful real objects. The proposed hierarchical segmentation algorithm may be very useful for MPEG-4 applications. A novel fast algorithm is also introduced to reduce the search burden. Moreover, because the algorithm is unsupervised, it makes automatic image and video segmentation possible.
An initialization method for multi-type prototype fuzzy clustering
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770834
G. Xinbo, Xue Zhong, Li Jie, Xie Weixin
Fuzzy clustering is an important branch of unsupervised classification, and has been widely used in pattern recognition and image processing. However, most existing fuzzy clustering algorithms are sensitive to initialization and depend strongly on the number of clusters, which limits their applications. Moreover, in multi-type prototype fuzzy clustering these algorithms also need to know the type and number of prototypes in advance. To overcome these limitations, this paper proposes a method for acquiring a priori knowledge about the clustering prototypes, which obtains better performance in initializing multi-type prototype fuzzy clustering.
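To show where initialization and the cluster count enter, here is a minimal sketch of the standard fuzzy c-means iteration on one-dimensional data. It illustrates the kind of algorithm the abstract calls sensitive to initialization, not the proposed initialization method. The function name, the fuzzifier `m = 2`, the small `eps` guard, and the toy data are assumptions.

```python
def fuzzy_c_means(points, initial_centres, m=2.0, iterations=50, eps=1e-12):
    """Standard fuzzy c-means on scalar data.

    The caller must supply the initial centres, and therefore the number of
    clusters; this is the dependence on prior knowledge noted above.
    """
    centres = list(initial_centres)
    for _ in range(iterations):
        # Membership update: inversely related to squared distance,
        # normalised so that each point's memberships sum to one.
        memberships = []
        for x in points:
            d2 = [(x - c) ** 2 + eps for c in centres]  # eps avoids division by zero
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Centre update: mean of the points weighted by membership**m.
        for i in range(len(centres)):
            weights = [u[i] ** m for u in memberships]
            centres[i] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centres


# Toy data (hypothetical): two well-separated groups.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centres = fuzzy_c_means(data, initial_centres=[0.0, 5.0])
```

Both the starting centres and their count are free inputs here, so a poor choice can change the result; a prototype-acquisition step would replace the hand-picked `initial_centres`.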
A novel technique for quasar recognition
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770152
H. Zhou, M.L. Luo
We present a Hough transform (HT)-based technique for quasar recognition. The main purpose of quasar recognition is to identify redshifts. There are two main approaches to calculating redshifts: a direct one that first recognizes emission peaks, and a statistical one. We apply the HT as a statistical way to compute redshifts. In addition, we employ two post-processing techniques for the HT: one uses a linear kernel to recover the weakened true peak in the accumulator array; the other is a voting technique to eliminate sharp false peaks. Moreover, we use a weighted HT to further improve the recognition results. Experimental results are satisfactory.
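One plausible reading of an HT-style redshift estimate is accumulator voting over candidate redshifts, sketched below. It relies on the relation between observed and rest wavelength, observed = rest × (1 + z). The abstract does not give the paper's parametrization, kernel, or weighting, so the function name, the bin width, and the approximate rest wavelengths are all assumptions.

```python
# Approximate rest wavelengths in angstroms of some strong quasar emission
# lines (Ly-alpha, C IV, C III], Mg II); illustrative values only.
REST_LINES = [1216.0, 1549.0, 1909.0, 2798.0]


def vote_redshift(observed_peaks, rest_lines=REST_LINES, z_max=5.0, bin_width=0.01):
    """Every (observed peak, rest line) pair votes for the redshift that would
    map that rest line onto that peak; the fullest bin wins."""
    n_bins = int(round(z_max / bin_width)) + 1
    accumulator = [0] * n_bins
    for obs in observed_peaks:
        for rest in rest_lines:
            z = obs / rest - 1.0
            if 0.0 <= z <= z_max:
                accumulator[int(round(z / bin_width))] += 1
    best_bin = max(range(n_bins), key=accumulator.__getitem__)
    return best_bin * bin_width, accumulator[best_bin]


# Synthetic spectrum peaks (hypothetical): four lines shifted by z = 2,
# plus one spurious peak at 5000 angstroms.
peaks = [rest * 3.0 for rest in REST_LINES] + [5000.0]
z_estimate, votes = vote_redshift(peaks)
```

Wrong pairings scatter their votes across many bins while the true redshift collects one vote per matched line, which is why the accumulator peak is informative without identifying individual lines first.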
On the computation of wavelet series transform
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770214
Yu Yue, Zhou Jian, Wang Yiliang, L. Fengting, Ge Chenghui
Because the discrete wavelet transform (DWT) can be computed efficiently with a fast algorithm, the DWT is often used to approximate the continuous wavelet transform (CWT) and the wavelet series transform (WST). Approximation accuracy is considered an open problem in wavelet theory. In this paper, we first identify three factors that affect the approximation accuracy. Based on sampling theory for wavelet subspaces, two kinds of prefilters are given: one can exactly compute the WST for any signal in the wavelet subspace, and the other can effectively approximate the true WST. Finally, numerical examples are given to show that our algorithms are effective.
Including a parametric model in Tikhonov-Miller image restoration
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770766
V. Barakat, R. Goutte, R. Prost
We propose a new Tikhonov-Miller restoration method where an a priori model of the solution is included. In sharp contrast with the classical method, this approach allows local information to be incorporated. The main difficulty is to express this local information in a model. We show that this new method can lead to better results than the usual Tikhonov-Miller approach, if a parametric a priori model is used.
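For orientation, a common way to write Tikhonov-Miller regularization with and without a prior model is given below. The abstract does not state the functional used in the paper, so this is a sketch under assumed notation: $y$ is the observed image, $H$ the blur operator, $C$ the regularization operator, $\alpha$ the regularization parameter, and $m$ the a priori model of the solution.

```latex
% Classical Tikhonov-Miller: penalise the filtered solution itself
\Phi_{\mathrm{TM}}(x) = \lVert y - Hx \rVert^2 + \alpha \lVert Cx \rVert^2,
\qquad
\hat{x}_{\mathrm{TM}} = \left(H^{T}H + \alpha\, C^{T}C\right)^{-1} H^{T} y .

% With an a priori model m: penalise the deviation from the model
\Phi_{m}(x) = \lVert y - Hx \rVert^2 + \alpha \lVert C(x - m) \rVert^2,
\qquad
\hat{x}_{m} = \left(H^{T}H + \alpha\, C^{T}C\right)^{-1}
              \left(H^{T} y + \alpha\, C^{T}C\, m\right)
            = m + \left(H^{T}H + \alpha\, C^{T}C\right)^{-1} H^{T}\left(y - Hm\right).
```

Setting $m = 0$ recovers the classical estimate. In the last form the data only correct the model through the residual $y - Hm$, so local structure encoded in $m$ carries over to the restored image.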
Speaker identification using hidden Markov models
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770285
M. Inman, D. Danforth, S. Hangai, K. Sato
In this study, we show that the use of hidden Markov models (HMMs) significantly enhances the success rate of speaker identification over time. The segment boundary information derived from HMMs provides a means of normalizing the formant patterns obtained from a digital cochlear filter, which we also describe. The use of the digital cochlear filter and HMMs in our study was motivated by two well-known problems in speech recognition generally, i.e. phonetic tempo variability and variability over temporal units of a given length, typically days. We show how these problems can be minimized to achieve more robust speaker identification.
A maximum entropy algorithm based on the aperiodic model of deconvolution for image restoration
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770791
Shi Dong-cheng, Han Liqiang, W. Hongzhi
We describe the aperiodic matrix model of deconvolution. Its kernel matrix is of high order, so the matrix cannot be computed directly in a practical image restoration problem, but it can be handled by the DFT. We present a maximum entropy algorithm for image restoration based on the aperiodic matrix model, the FFT algorithm and the conjugate gradient algorithm (CGA). In experiments, its total computational burden and memory requirement are moderate, and it can run on a common PC.
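The remark that the large aperiodic kernel matrix can be handled by the DFT can be illustrated in one dimension: zero-padding both sequences turns the circular convolution computed by the DFT into the aperiodic (linear) one, so the matrix never has to be formed. The sketch below uses a direct O(N²) DFT for readability where an FFT would be used in practice; the function names are assumptions and this is not the paper's restoration algorithm.

```python
import cmath


def dft(x, inverse=False):
    """Direct discrete Fourier transform (forward, or inverse with 1/N scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def aperiodic_convolve(signal, kernel):
    """Linear convolution via the DFT: pad both inputs to the full output
    length so that the circular wrap-around falls on zeros."""
    size = len(signal) + len(kernel) - 1
    a = list(signal) + [0.0] * (size - len(signal))
    b = list(kernel) + [0.0] * (size - len(kernel))
    product = [p * q for p, q in zip(dft(a), dft(b))]
    return [v.real for v in dft(product, inverse=True)]


result = aperiodic_convolve([1.0, 2.0, 3.0], [1.0, 1.0])
```

Applying the blur operator inside an iterative solver such as conjugate gradients then costs a few transforms per iteration instead of a multiplication by the full kernel matrix.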
The upper limit of the stability of delay-type cellular neural networks
Pub Date: 1998-10-12 DOI: 10.1109/ICOSP.1998.770859
Zhuang Daming, Ma Shao-han, Jiang Mingyan
This paper gives an improved sufficient condition for ensuring the stability of delayed cellular neural networks (DCNN) by analyzing a corresponding Lyapunov function. This result improves on the limit given by Civalleri and gives an optional upper limit of the stability of DCNN.