Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480318
D. Florêncio, R. Schafer
Sampling and reconstruction are usually analyzed within the framework of linear signal processing. Powerful tools such as the Fourier transform and optimum linear filter design techniques allow a very precise analysis of the process. In particular, an optimum linear filter of any length can be derived in most situations. Many of these tools are not available for non-linear systems, and it is usually difficult to find an optimum non-linear system under any criterion. The authors analyze the possibility of using non-linear filtering in the interpolation of subsampled images. They show that a very simple (5×5) non-linear reconstruction filter outperforms (for the images analyzed) linear filters of up to 256×256, including optimum (separable) Wiener filters of any size.
Title: Post-sampling aliasing control for natural images. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480103
M. Potkonjak, J. Rabaey
We introduce the algorithm selection problem for power minimization. After using a case study to demonstrate the high impact of this synthesis task on the power consumption of the final implementation, we study its computational complexity. We present an efficient, optimization-intensive algorithm for power minimization using algorithm selection. We apply the marginal utility-based algorithm for algorithm selection to three DSP examples. A table illustrates the effectiveness of power optimization using algorithm selection on one audio application (LMS DCT transform-domain filter) and two video applications (NTSC formatter and DPCM coder). On several DSP examples, a reduction in power of more than an order of magnitude is demonstrated.
Title: Power minimization in DSP application specific systems using algorithm selection. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.479923
E. Salari, Sheng Lin
A new motion-compensated predictive coding scheme based on object region segmentation is proposed for image sequence coding at low bit rates. The motion-compensated prediction involves segmentation, motion detection, and motion estimation for moving objects. Segmentation is carried out on the reconstructed images in both the encoder and the decoder, which eliminates the need to transmit region shape information. Motion vector prediction is also performed in both the encoder and the decoder, leading to a significant reduction in the overhead for motion information. Motion-compensated prediction errors are transformed using the discrete cosine transform (DCT), and the coefficients are quantized and entropy coded as recommended by the CCITT. Computer simulation shows that the proposed coding algorithm significantly reduces block artifacts, a dominant distortion associated with conventional block matching algorithms at low bit rates.
Title: Segmentation based coding algorithm for low bit-rate video. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480485
J. Mau, J. Valot, Damien Minaud
We present a solution for the construction of orthogonal time-varying filter banks without transient filters. The idea is the following: the filter banks used in the time-varying decomposition are not arbitrary but are linked together, all being derived from a single initial orthogonal filter bank. With this new technique, the perfect reconstruction (PR) property is always guaranteed, even if we switch abruptly from one filter bank to another without using transient filters. Starting from an initial M-band orthogonal filter bank that performs a regular M-band frequency splitting, we explain how to derive various mutually orthogonal filter banks with almost arbitrary time/frequency resolution, including ones that perform irregular frequency splitting as, for example, in a wavelet decomposition.
Title: Time-varying orthogonal filter banks without transient filters. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480082
S. Golden, B. Friedlander
In this paper, we approximate arbitrary complex signals by modeling both the logarithm of the amplitude and the phase of the complex signal as finite-order polynomials in time. We refer to a signal of this type as an exponential polynomial signal (EPS). We propose an algorithm to estimate any desired coefficient of this signal model. We also show how the mean-squared error of the estimate can be determined using a first-order perturbation analysis. A Monte Carlo simulation is used to verify the perturbation analysis. The performance of the algorithm is illustrated by comparing the mean-squared error of the estimate with the Cramér-Rao bound for a particular example.
Title: Estimation and statistical analysis for exponential polynomial signals. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
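The exponential polynomial signal model in the abstract above can be sketched numerically. The code below generates a noise-free EPS and recovers its coefficients with an ordinary least-squares polynomial fit to the log-amplitude and the unwrapped phase. This baseline fit is an assumption made for illustration and is not the estimator proposed in the paper; the function names, coefficient values, and time grid are likewise hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def eps_signal(t, amp_coeffs, phase_coeffs):
    """Exponential polynomial signal: log-amplitude and phase are polynomials in t.

    Coefficients are given in ascending order (constant term first).
    """
    log_amp = P.polyval(t, amp_coeffs)
    phase = P.polyval(t, phase_coeffs)
    return np.exp(log_amp + 1j * phase)


def fit_eps(t, s, amp_order, phase_order):
    """Baseline least-squares fit (illustrative only, not the paper's algorithm)."""
    log_amp = np.log(np.abs(s))
    # Unwrapping is only valid while the phase changes by less than pi per sample.
    phase = np.unwrap(np.angle(s))
    a_hat = P.polyfit(t, log_amp, amp_order)
    b_hat = P.polyfit(t, phase, phase_order)
    return a_hat, b_hat


# Hypothetical example values, chosen so the phase step per sample stays small.
t = np.arange(256) / 256.0
a_true = np.array([0.0, -1.0, 0.5])    # log-amplitude polynomial
b_true = np.array([0.3, 20.0, 15.0])   # phase polynomial (radians)

s = eps_signal(t, a_true, b_true)
a_hat, b_hat = fit_eps(t, s, amp_order=2, phase_order=2)
```

In the noise-free case the fit returns the generating coefficients up to rounding error. With additive noise, phase unwrapping can fail at low signal-to-noise ratios, which is one reason a dedicated estimator may be preferable.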
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480490
Chungyong Lee, Douglas B. Williams
An iterative method for reducing noise in contaminated chaotic signals is proposed. This method estimates the deviation of the observed signal from the nearest noise-free signal satisfying the system dynamics in order to obtain a noise-reduced (or enhanced) signal. To calculate the deviation, we minimize a cost function composed of two parts: one that measures how close the enhanced signal is to the observed signal, and another that contains constraints fitting the dynamics of the system. This method has a simple structure and is flexible in the choice of the parts of the cost function. The proposed method is compared with Farmer's method, which is known to perform well at mild signal-to-noise ratios but has a more complex structure.
Title: A noise reduction method for chaotic signals. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
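The two-part cost function described in the abstract above can be illustrated with a small sketch. The code assumes a squared-error fidelity term, a squared one-step prediction residual as the dynamics term, a logistic map as the known system, and plain gradient descent with step halving as the minimizer. All of these choices, and the weight `lam`, are assumptions for illustration; the abstract does not specify the actual terms or the iteration used by the authors.

```python
import numpy as np


def logistic(x, r=4.0):
    """Assumed system dynamics: the logistic map x[n+1] = r * x[n] * (1 - x[n])."""
    return r * x * (1.0 - x)


def cost(x, y, lam):
    """Fidelity to the observation y plus lam times the dynamics mismatch."""
    fidelity = np.sum((x - y) ** 2)
    dynamics = np.sum((x[1:] - logistic(x[:-1])) ** 2)
    return fidelity + lam * dynamics


def grad(x, y, lam, r=4.0):
    """Gradient of cost() with respect to every sample of x."""
    g = 2.0 * (x - y)
    resid = x[1:] - logistic(x[:-1])          # resid[n] = x[n+1] - f(x[n])
    g[1:] += 2.0 * lam * resid                # derivative through x[n+1]
    g[:-1] -= 2.0 * lam * resid * (r - 2.0 * r * x[:-1])  # through f(x[n])
    return g


def denoise(y, lam=1.0, iters=200, step=0.01):
    """Gradient descent that only accepts steps which lower the cost."""
    x = y.copy()
    c = cost(x, y, lam)
    for _ in range(iters):
        trial = x - step * grad(x, y, lam)
        c_trial = cost(trial, y, lam)
        if c_trial < c:
            x, c = trial, c_trial
        else:
            step *= 0.5                       # step too large: shrink and retry
    return x


# Hypothetical test signal: a logistic-map orbit with additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.empty(200)
clean[0] = 0.3
for n in range(199):
    clean[n + 1] = logistic(clean[n])
noisy = clean + 0.01 * rng.standard_normal(200)

enhanced = denoise(noisy)
```

By construction the cost never increases from one accepted iterate to the next. Whether the enhanced signal ends up closer to the clean orbit depends on the weight, the noise level, and the map, and this sketch makes no claim about that.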
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.479538
S. Hayakawa, F. Itakura
In our previous studies, we showed the effectiveness of using information in the higher frequency band for speaker recognition. However, the energy spectrum of speech in the higher frequency band is weak, except for some fricative sounds. It is therefore important to investigate the speaker-specific information in that region under noisy conditions. In this study, we examine the influence of additive noise on the performance of speaker recognition using the higher frequency band. Experimental results show that high performance is obtained in the wideband case under many typical noisy conditions. They also show that the higher frequency band is more robust to noise than the lower one. For that reason, the higher frequency band gives good performance even when its SNR is worse than that of the lower band.
Title: The influence of noise on the speaker recognition performance using the higher frequency band. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480052
Wen-Shiung Chen, E. Yang, Zhen Zhang
In this paper, a variant of the address vector quantization (ADVQ) algorithm for image compression using lossless conditional entropy coding is presented. The approach is motivated by Shannon's basic entropy result that conditional entropy is no greater than joint entropy.
Title: A variant of address vector quantization for image compression using lossless conditional entropy coding. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
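The entropy relation cited in the abstract above follows from the chain rule H(X,Y) = H(Y) + H(X|Y): since H(Y) is non-negative for discrete variables, H(X|Y) cannot exceed H(X,Y). The short check below evaluates both quantities for a small joint distribution. The probability table is a made-up example and has nothing to do with the codebook addresses used in the paper.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a discrete distribution (zero cells are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


# Hypothetical joint distribution P(X, Y): rows index X, columns index Y.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])

h_joint = entropy(joint)              # H(X, Y)
h_y = entropy(joint.sum(axis=0))      # H(Y), from the column marginal
h_x_given_y = h_joint - h_y           # H(X | Y) by the chain rule
```

A coder that conditions each symbol on already-transmitted context can therefore, in principle, spend fewer bits on that symbol than a coder that has to describe the symbol and its context jointly.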
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.480567
M. Khansari, E. Dubois
We show how the Padé table can be used to develop a new lattice structure for general two-channel bi-orthogonal perfect reconstruction (PR) filter banks. This is achieved through a characterization of all two-channel bi-orthogonal PR filter banks. The parameter space found using this method is unique for each filter bank. As in any other lattice structure, the PR property is achieved structurally, and quantization of the lattice parameters does not affect this property. Furthermore, we demonstrate that, for a given filter, the set of all complementary filters can be uniquely specified by two parameters, namely the end-to-end delay of the system and a scalar quantity.
Title: Lattice structure for two-band perfect reconstruction filter banks using Padé approximation. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
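The claim that perfect reconstruction is structural, and therefore survives parameter quantization, can be illustrated with a generic two-channel ladder (lifting) structure. This is a swapped-in example and not the Padé-based lattice of the paper: the predict/update form, the circular boundary handling, and the parameter values `p` and `u` are all assumptions chosen to keep the sketch short.

```python
import numpy as np


def analysis(x, p, u):
    """Split x into even/odd phases, then apply one predict and one update step."""
    even, odd = x[0::2], x[1::2]
    d = odd - p * (even + np.roll(even, -1))   # predict step -> detail band
    s = even + u * (d + np.roll(d, 1))         # update step  -> coarse band
    return s, d


def synthesis(s, d, p, u):
    """Undo the ladder steps in reverse order with the same parameters."""
    even = s - u * (d + np.roll(d, 1))
    odd = d + p * (even + np.roll(even, -1))
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x


def quantize(value, bits):
    """Round a parameter to a fixed number of fractional bits."""
    scale = 2 ** bits
    return np.round(value * scale) / scale


rng = np.random.default_rng(1)
x = rng.standard_normal(64)                    # even length is assumed

# Arbitrary parameters, coarsely quantized to 4 fractional bits.
p, u = quantize(0.4837, 4), quantize(0.2619, 4)

s, d = analysis(x, p, u)
x_rec = synthesis(s, d, p, u)
```

Each synthesis step subtracts exactly what the matching analysis step added, so reconstruction holds for any values of `p` and `u`, quantized or not. Quantization changes the frequency responses of the filters but not the PR property, which is the behavior the abstract attributes to lattice structures.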
Pub Date: 1995-05-09; DOI: 10.1109/ICASSP.1995.479682
H. Yehia, M. Honda, F. Itakura
A method for determining the vocal-tract cross-sectional area function from acoustical measurements at the lips is analyzed. Within the framework described by Sondhi and Gopinath (1971) and implemented by Sondhi and Resnick (1983), a sensitivity analysis of the vocal-tract area function derived from the impedance or reflectance at the lips is performed. It indicates that, in the ideal case, the area function is not heavily affected by random distortions of the impulse response at the lips. Simulations and real measurements show that the method works relatively well, except for regions behind narrow constrictions. In this case, a high-energy excitation pulse and fine sampling proved to be important. The excitation used is a time-stretched pulse, which delivers high energy without requiring a high-power sound generator.
Title: Acoustic measurements of the vocal-tract area function: sensitivity analysis and experiments. Published in: 1995 International Conference on Acoustics, Speech, and Signal Processing.
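The benefit of a time-stretched pulse mentioned in the abstract above can be sketched as follows. The code assumes one common definition, a flat-magnitude spectrum with a phase that grows quadratically with frequency; the abstract does not state which definition the authors used, and the length `n` and stretch parameter `m` are hypothetical values.

```python
import numpy as np


def time_stretched_pulse(n, m):
    """Real pulse of length n with unit-magnitude spectrum and quadratic phase.

    n must be even and m an integer so that the Nyquist bin stays real.
    """
    k = np.arange(n // 2 + 1)
    half_spectrum = np.exp(1j * 4.0 * np.pi * m * k**2 / n**2)
    return np.fft.irfft(half_spectrum, n=n)


n, m = 1024, 256
tsp = time_stretched_pulse(n, m)

# Reference: a unit impulse, which has the same flat magnitude spectrum.
impulse = np.zeros(n)
impulse[0] = 1.0

tsp_energy = np.sum(tsp ** 2)
impulse_energy = np.sum(impulse ** 2)
tsp_peak = np.max(np.abs(tsp))
impulse_peak = np.max(np.abs(impulse))
```

Both signals carry the same energy and excite all frequencies equally, but the stretched pulse spreads that energy over many samples, so its peak amplitude is far lower. Equivalently, for a fixed peak level that the sound source can deliver, the stretched pulse injects much more energy than an impulse.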