Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625602
S. Levine, Tony S. Verma, Julius O. Smith
We describe an improved method of generating more accurate sinusoidal parameters (amplitude, frequency, phase) from a wideband polyphonic audio source in a multiresolution, non-aliased fashion. This significantly improves upon previous work on sinusoidal modeling that assumes a single-pitched monophonic source, such as speech or an individual musical instrument. In addition to a more general analysis, we can now perform high-quality transformations such as time-stretching and pitch-shifting on polyphonic audio with ease.
Title: Alias-free, multiresolution sinusoidal modeling for polyphonic, wideband audio
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
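As a rough illustration of what "sinusoidal parameters" means in the abstract above, the following minimal sketch estimates the amplitude, frequency and phase of the strongest sinusoid in a single frame. It is a plain single-resolution analysis (Hann window, direct DFT, parabolic interpolation of the log-magnitude peak), not the multiresolution, alias-free scheme the paper describes; the function name `estimate_peak` and all parameter choices are assumptions made for this example.

```python
import cmath
import math


def estimate_peak(frame, sample_rate):
    """Return (amplitude, frequency_hz, phase) of the strongest sinusoid in frame."""
    n = len(frame)
    # Periodic Hann window; its gain at DC is sum(window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    windowed = [x * w for x, w in zip(frame, window)]
    # Direct DFT of the positive-frequency bins (O(n^2), acceptable for a sketch).
    spectrum = [
        sum(windowed[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        for k in range(n // 2)
    ]
    mag = [abs(c) for c in spectrum]
    k = max(range(1, n // 2 - 1), key=lambda b: mag[b])
    # Parabolic interpolation of the log-magnitude around the peak bin.
    lo, mid, hi = math.log(mag[k - 1]), math.log(mag[k]), math.log(mag[k + 1])
    offset = 0.5 * (lo - hi) / (lo - 2 * mid + hi)
    frequency = (k + offset) * sample_rate / n
    amplitude = 2 * math.exp(mid - 0.25 * (lo - hi) * offset) / sum(window)
    # The window is centred at n/2, which adds pi*offset to the phase at bin k.
    raw = cmath.phase(spectrum[k]) - math.pi * offset
    phase = math.atan2(math.sin(raw), math.cos(raw))  # wrap to (-pi, pi]
    return amplitude, frequency, phase
```

The returned phase is referenced to the first sample of the frame. A fixed frame length trades time resolution against frequency resolution, which is the limitation a multiresolution analysis is meant to relax.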
Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625584
I.L.D.M. Merks, M. M. Boone, A. Berkhout
This paper describes the design and implementation of a binaural directional hearing aid. This hearing aid consists of a microphone array of five directional microphones integrated into the front of a pair of spectacles. The signals of the microphones are processed with the aid of double beamforming into a left-ear and a right-ear signal. The directivity pattern of the left-ear signal has its main lobe at a small angle to the left, and the directivity pattern of the right-ear signal has its main lobe at a small angle to the right. These different main lobes cause an interaural level difference (ILD). In natural conditions, an ILD enables the human auditory brain to localize sound sources and to significantly improve speech intelligibility in noise. A computer simulation and an implementation in analogue electronics show that the main lobes of the left-ear and right-ear signals realize sufficient ILD at high frequencies to enable an effective localization of sound sources.
Title: Design of a broadside array for a binaural hearing aid
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
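The link between two differently steered main lobes and an ILD, as described in the abstract above, can be sketched with a simple delay-and-sum model. This is only an illustration under stated assumptions: the microphones are treated as omnidirectional (the paper uses directional ones), and the spacing of 3.5 cm, the steering angle of 10 degrees and the function names are all hypothetical. Angles are measured from straight ahead, positive to the right.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def beam_gain(source_deg, steer_deg, freq_hz, n_mics=5, spacing_m=0.035):
    """Magnitude response of a delay-and-sum beam steered to steer_deg."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    mismatch = math.sin(math.radians(source_deg)) - math.sin(math.radians(steer_deg))
    total = 0j
    for m in range(n_mics):
        x = (m - (n_mics - 1) / 2) * spacing_m  # microphone position on the frame
        total += cmath.exp(1j * k * x * mismatch)
    return abs(total) / n_mics


def ild_db(source_deg, freq_hz, steer_deg=10.0):
    """Level difference (right beam over left beam) in dB for a plane-wave source."""
    left = beam_gain(source_deg, -steer_deg, freq_hz)
    right = beam_gain(source_deg, +steer_deg, freq_hz)
    return 20 * math.log10(right / left)
```

In this toy model a source straight ahead gives no level difference, a source to the right gives a positive ILD, and the ILD grows with frequency because the beams become narrower relative to the steering offset.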
Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625620
F. Baumgarte
In a variety of applications, the processing of arbitrary sound signals requires models of loudness perception or auditory masking that are more accurate than the psychoacoustical models known so far. In general, perceptual models can reach higher accuracy only by making special assumptions about signal characteristics. The presented human ear model overcomes these restrictions because it models the sound processing in the ear physiologically, an approach that is valid independently of the signal characteristics. The results shown indicate that psychoacoustic observations of loudness and masking are closely matched. Additionally, the basilar membrane motion in the inner ear is obtained as an intermediate quantity in accordance with physiological measurements, supporting the hypotheses about outer hair cell operation in the inner ear.
Title: A physiological ear model for specific loudness and masking
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625587
S. Quackenbush, J. Johnston
Advanced Audio Coding (AAC), part of ISO/MPEG-2, was issued as an international standard in April 1997. It supports single- or multiple-channel audio programs and delivers excellent audio quality at or below 64 kbps/channel by exploiting the compression capabilities of a high-resolution filterbank, backward-adaptive prediction, joint channel coding, nonlinear quantizers and noiseless (Huffman) coding. This paper describes the flexible Huffman coding algorithm used in AAC and discusses the compression provided by this component of the standard.
Title: Noiseless coding of quantized spectral components in MPEG-2 Advanced Audio Coding
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
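For readers unfamiliar with the noiseless coding stage named in the abstract above, here is a minimal sketch of textbook Huffman code construction applied to a list of quantized values. It builds a code from the observed symbol counts; it does not reproduce the codebooks or the coding scheme of the AAC standard, and the function name `huffman_code` is an assumption made for this example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to its prefix-free bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)  # two least frequent subtrees
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t0.items()}
        merged.update({s: "1" + code for s, code in t1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]
```

Frequent values (typically small quantized coefficients) receive short codewords, so the total bit count drops without any loss of information, which is why this stage is called noiseless.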
Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625614
A. Wang, J.O. Smith
Infinite impulse response (IIR) recursive linear digital filters are widely used because of their low computational cost and low storage overhead requirements. Finite impulse response (FIR) filters, on the other hand, make it possible to implement linear-phase linear digital filters, which have constant group delay across all frequencies. The tradeoff is that to achieve similar magnitude transfer functions, FIR filters usually require much larger filter orders than their IIR counterparts. We describe an algorithm for the efficient implementation of certain classes of FIR filters. We introduce an extension of the truncated IIR (TIIR) algorithm which allows the truncation of arbitrary IIR filter tails. Our algorithm makes it possible to implement polynomial impulse responses. Additionally, we present an analysis of the effects of limited numerical precision and provide guidelines for designing systems with acceptable noise tolerance.
Title: Some properties of tail-canceling IIR filters
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
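The tail-canceling idea in the abstract above can be illustrated with the simplest possible case, a one-pole filter whose impulse response a^n is cut off after a fixed number of samples. The sketch below is an assumption-laden toy (stable pole, double precision, hypothetical function names), not the extended algorithm of the paper: the recursion produces the infinite response and a delayed, scaled copy of the input subtracts the tail.

```python
def tiir_one_pole(x, a, length):
    """FIR response h[k] = a**k, 0 <= k < length, computed recursively."""
    tail_gain = a ** length
    y, prev = [], 0.0
    for n, xn in enumerate(x):
        delayed = x[n - length] if n >= length else 0.0
        # One-pole recursion minus the tail that starts `length` samples later.
        prev = a * prev + xn - tail_gain * delayed
        y.append(prev)
    return y


def fir_direct(x, a, length):
    """Reference: the same truncated response by direct convolution."""
    h = [a ** k for k in range(length)]
    return [
        sum(h[k] * x[n - k] for k in range(min(length, n + 1)))
        for n in range(len(x))
    ]
```

The recursive form needs two multiplications per output sample regardless of `length`, whereas direct convolution needs `length` of them; the price is sensitivity to rounding errors, which is the concern the abstract raises about limited numerical precision.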
Pub Date : 1997-10-19 DOI: 10.1109/ASPAA.1997.625582
M. Siqueira, A. Alwan, R. Speece
Acoustic feedback is a problem in hearing aids that contain a substantial amount of gain, hearing aids that are used in conjunction with vented or open molds, and in-the-ear hearing aids. Acoustic feedback is annoying and reduces the maximum usable gain of hearing-aid devices. This paper studies analytically the steady-state convergence behavior of LMS-based adaptive algorithms when operating in continuous adaptation to reduce acoustic feedback. A bias is found in the adaptive filter's estimate of the hearing-aid feedback path. A method for reducing this bias and producing an improved estimate of the feedback path is analyzed. It is shown that by the use of delays in the forward path of the hearing aid plant, it is possible to reduce the bias considerably.
Title: Steady-state analysis of continuous adaptation systems in hearing aids
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
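The bias and its dependence on the forward-path delay, as summarized in the abstract above, can be reproduced qualitatively in a toy closed-loop simulation. Everything specific here is an assumption made for illustration: the 4-tap feedback path, the forward gain of 0.5, the AR(1) model of the incoming sound, the step size and the function name `feedback_path_bias`. The paper's analysis is analytical and is not reproduced by this code.

```python
import math
import random
from collections import deque


def feedback_path_bias(delay, n_samples=60000, gain=0.5, mu=0.002, seed=0):
    """Return the norm of (time-averaged LMS estimate - true feedback path)."""
    rng = random.Random(seed)
    true_path = [0.2, -0.1, 0.05, 0.02]  # hypothetical feedback path
    taps = len(true_path)
    w = [0.0] * taps
    # past_e[j] holds e[n-1-j], the feedback-compensated input signal.
    past_e = deque([0.0] * (delay + taps), maxlen=delay + taps)
    s = 0.0
    w_sum = [0.0] * taps
    for n in range(n_samples):
        s = 0.7 * s + rng.gauss(0.0, 1.0)  # coloured incoming sound
        # Loudspeaker samples u[n-k] = gain * e[n-k-delay] (forward path).
        u = [gain * past_e[delay - 1 + k] for k in range(taps)]
        e = s + sum((f - wk) * uk for f, wk, uk in zip(true_path, w, u))
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]  # LMS update
        past_e.appendleft(e)
        if n >= n_samples // 2:  # average the second half of the run
            w_sum = [a + b for a, b in zip(w_sum, w)]
    w_avg = [a / (n_samples - n_samples // 2) for a in w_sum]
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(w_avg, true_path)))
```

Comparing `feedback_path_bias(1)` with `feedback_path_bias(32)` shows a much larger estimation error for the short delay in this setup: with a long delay the loudspeaker signal is nearly uncorrelated with the current incoming sound, so the adaptive filter no longer mistakes the signal's own correlation for feedback.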
Pub Date : 1997-10-01 DOI: 10.1109/ASPAA.1997.625586
S. Voran
We describe six algorithms for bit-allocation in audio coding. Each algorithm stems from the minimization of a different perceptually-motivated objective function. Three of these objective functions are extensions of existing ones, and three are new. Closed-form bit-allocation equations result in five cases, and an iterative approach is required in the sixth.
Title: Perception-based bit-allocation algorithms for audio coding
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
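To show what a closed-form bit-allocation equation looks like, the sketch below implements the classical weighted mean-squared-error allocation. It is not one of the six algorithms of the paper: it assumes the high-resolution distortion model D_k = w_k * var_k * 2^(-2 b_k), minimizes the sum of D_k subject to a fixed total number of bits, and allows fractional or negative allocations. The function name and the weights are hypothetical.

```python
import math


def allocate_bits(variances, weights, total_bits):
    """Closed-form allocation that equalizes the weighted distortion per band."""
    half_logs = [0.5 * math.log2(w * v) for v, w in zip(variances, weights)]
    mean_half_log = sum(half_logs) / len(half_logs)
    average = total_bits / len(half_logs)
    # Bands whose weighted variance exceeds the geometric mean get extra bits.
    return [average + h - mean_half_log for h in half_logs]
```

For example, `allocate_bits([16.0, 4.0, 1.0, 1.0], [1.0] * 4, 8)` assigns more bits to the high-variance bands while keeping the total at 8. A perceptual weight below 1 for a band (for instance one that is strongly masked) moves bits away from it.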
Pub Date : 1997-09-16 DOI: 10.1109/ASPAA.1997.625607
S. Godsill, C. H. Tan
This paper is concerned with the removal of low frequency transient noise from old gramophone recordings and film sound tracks. Low frequency transients occur as a result of large breakages or discontinuities in the recorded medium which excite a long-term resonance in the playback apparatus. We present a signal separation-based approach to this problem. Audio signals and noise transients are modelled as autoregressive (AR) processes which are additively superimposed to give the observed waveform. A maximum a posteriori method is presented for separation of the two processes. A modification of this scheme allows for modelling of the large discontinuity at the start of each noise transient and successful restorations are demonstrated. A more practical scheme is then developed which uses a Kalman filter to implement the separation. In order to avoid low frequency distortions to the audio signal, the excitation variance of the noise transient model is tapered exponentially to zero away from the discontinuity. The method is fully automated and more practical to implement than existing schemes for removal of such defects. Results indicate a high level of performance.
Title: Removal of low frequency transient noise from old recordings using model-based signal separation techniques
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
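The Kalman-filter separation of two additively superimposed AR processes mentioned in the abstract above can be sketched for the simplest case of two first-order models. This is a toy under stated assumptions: AR(1) models with known coefficients, no measurement noise, a constant excitation variance (the paper tapers it away from the discontinuity), and a hypothetical function name `separate_ar1`.

```python
def separate_ar1(y, a_audio, q_audio, a_noise, q_noise):
    """Kalman filter for y[n] = audio[n] + noise[n]; returns the noise estimates."""
    x1 = x2 = 0.0  # state estimates: audio, low-frequency noise
    # Start from the stationary variances of the two AR(1) processes.
    p11 = q_audio / (1 - a_audio ** 2)
    p22 = q_noise / (1 - a_noise ** 2)
    p12 = 0.0
    estimates = []
    for yn in y:
        # Predict: both states evolve independently (diagonal transition matrix).
        x1, x2 = a_audio * x1, a_noise * x2
        p11 = a_audio * a_audio * p11 + q_audio
        p12 = a_audio * a_noise * p12
        p22 = a_noise * a_noise * p22 + q_noise
        # Update with observation matrix H = [1, 1] and zero measurement noise.
        s = p11 + 2 * p12 + p22
        k1, k2 = (p11 + p12) / s, (p12 + p22) / s
        innovation = yn - (x1 + x2)
        x1 += k1 * innovation
        x2 += k2 * innovation
        p11, p12, p22 = p11 - k1 * k1 * s, p12 - k1 * k2 * s, p22 - k2 * k2 * s
        estimates.append(x2)
    return estimates
```

Subtracting the returned estimate from `y` gives the restored audio estimate. The separation works because the two models occupy different spectral regions: a coefficient close to 1 describes a slowly varying, low-frequency component, while a small coefficient describes a broadband one.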
Pub Date : 1996-10-01 DOI: 10.1109/ASPAA.1997.625628
G. Elko
The spatial correlation function between directional microphones is useful in the design and analysis of the performance of these microphones in actual acoustic noise fields. These correlation functions are well known for omnidirectional receivers, but not well known for directional receivers. This paper investigates the spatial correlation functions for Nth-order differential microphones in spherically isotropic noise fields. The results are used to calculate the amount of achievable cancellation from an adaptive noise cancellation application using combinations of differential microphones to remove unwanted noise from a desired signal. The results are also useful in determining signal-to-noise ratio gains from arbitrarily positioned differential microphone elements in microphone array applications.
Title: Adaptive noise cancellation with directional microphones
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
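For the omnidirectional case that the abstract above calls well known, the spatial coherence between two receivers in a spherically isotropic noise field is sin(kd)/(kd), with k the wavenumber and d the spacing. The sketch below evaluates it and converts a coherence value into a cancellation figure using the standard bound for an ideal unconstrained Wiener canceller, where the residual noise power is (1 - coherence^2) times the input noise power. The results for differential microphones derived in the paper are not reproduced, and the function names and the speed of sound are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def diffuse_coherence(freq_hz, spacing_m):
    """Coherence of two omnidirectional receivers in spherically isotropic noise."""
    kd = 2 * math.pi * freq_hz * spacing_m / SPEED_OF_SOUND
    return 1.0 if kd == 0.0 else math.sin(kd) / kd


def max_cancellation_db(coherence):
    """Noise reduction of an ideal canceller; requires |coherence| < 1."""
    return -10 * math.log10(1 - coherence ** 2)
```

With a spacing of 2 cm the coherence at 100 Hz is very close to 1, so more than 30 dB of cancellation is possible in principle, whereas the coherence first reaches zero where the spacing equals half a wavelength and no cancellation is achievable there.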
Pub Date : 1900-01-01 DOI: 10.1109/ASPAA.1997.625635
K. Eneman, M. Moonen
For many years now, subband and frequency-domain adaptive filtering techniques have been proposed for the cancellation of long acoustic echoes. Classical LMS-based algorithms are less attractive as their computational load is higher and their convergence behaviour for coloured far-end inputs is worse. We specify three realization conditions for DFT modulated subband schemes. Standard subband adaptive filters cannot fulfil all conditions. We show that frequency-domain based algorithms can be considered as a special case of subband adaptive filtering and that the realization conditions can be fulfilled in this case.
Title: Filter bank constraints for subband and frequency-domain adaptive filters
Published in: Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
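The remark in the abstract above that classical LMS converges worse for coloured inputs can be illustrated with a two-tap toy experiment. The sketch below identifies a hypothetical echo path with plain time-domain LMS, once with white input and once with a strongly correlated AR(1) input of the same power. The path `[1.0, -1.0]` is deliberately chosen to lie along the slowest mode of the correlated input, and all names and parameter values are assumptions; no subband or frequency-domain scheme is implemented here.

```python
import math
import random


def lms_misalignment(rho, n_iter=1000, mu=0.005, seed=0):
    """Relative coefficient error of a 2-tap LMS filter after n_iter updates."""
    rng = random.Random(seed)
    h = [1.0, -1.0]  # hypothetical echo path
    w = [0.0, 0.0]
    x_prev = x = 0.0
    for _ in range(n_iter):
        # Unit-power AR(1) input; rho = 0 gives white noise.
        x_prev, x = x, rho * x + math.sqrt(1 - rho * rho) * rng.gauss(0.0, 1.0)
        u = [x, x_prev]
        e = sum((hk - wk) * uk for hk, wk, uk in zip(h, w, u))  # noise-free error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
    err = math.sqrt(sum((hk - wk) ** 2 for hk, wk in zip(h, w)))
    return err / math.sqrt(sum(hk * hk for hk in h))
```

For a two-tap filter the input correlation matrix has eigenvalues 1 + rho and 1 - rho, so with rho = 0.9 the slowest mode adapts roughly ten times more slowly than with white input; `lms_misalignment(0.9)` therefore remains far above `lms_misalignment(0.0)` after the same number of updates.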