Title: A Real Time Perceptual Threshold Simulator
Authors: J. Herre, E. Eberlein, K. Brandenburg
DOI: 10.1109/ASPAA.1991.634110
Published in: Final Program and Paper Summaries, 1991 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics

Introduction: Low bit rate coding of high quality digital audio uses perceptual criteria to shape the quantization noise; [1] is an example of such an algorithm. Modelling of the hearing process is necessary to gain knowledge about the required noise shaping. Such models are used to estimate the actual hearing threshold of the human ear and in this way determine the error limit that must not be exceeded for transparent coding of the signal. Traditional perceptual models consider masking effects, which state that under certain circumstances small signals cannot be detected by the listener in the presence of a large signal, i.e. they have been "masked". The masking depends on the signal's spectral characteristics and its structure in time. The dependencies of some parameters are still research topics. One example is the local predictability of a signal, also known as 'tonality' [2], which has a strong influence on the masking ability of a signal. This paper presents a useful tool for psychoacoustic research: the Real Time Perceptual Threshold Simulator.

Title: Narrowband Sound Localization Related To Acoustical Cues
Authors: J. C. Middlebrooks
DOI: 10.1109/ASPAA.1991.634101

When presented with narrowband sound sources, human subjects make characteristic errors in localization that are largely restricted to the vertical dimension. The current study attempts to account for this behavior in terms of the directional characteristics of the head and external ears. A model is described that effectively predicts the errors in narrowband localization and that can be applied to localization of more general types of sounds.

Title: Non-Subtractive Dither
Authors: S. Lipshitz, R. Wannamaker, J. Vanderkooy, J. N. Wright
DOI: 10.1109/ASPAA.1991.634129

A mathematical investigation of quantizing systems using non-subtractive dither is presented. It is shown that with a suitably chosen dither probability density function (pdf), certain moments of the total error can be made signal-independent and the error signal rendered white, but that statistical independence of the error and the input signal is not achievable. Some of these results are known but appear to be unpublished. The earliest references to many of these results are contained in manuscripts by one of the authors (J. N. Wright) [1], but they were later discovered independently by Stockham and Brinton [2, 3], Lipshitz and Vanderkooy [4], and Gray [5]. In view of many widespread misunderstandings regarding non-subtractive dither, it seems that formal presentation of these results is long overdue.

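The central claim of the abstract can be checked numerically. The sketch below is an illustration, not the authors' derivation: it quantizes a DC input swept across one quantizer step and measures the second moment of the total error. With no dither the second moment varies with the input; with rectangular-pdf (RPDF) dither it still does; with triangular-pdf (TPDF) dither of two-LSB peak-to-peak width it becomes essentially constant, i.e. signal-independent.

```python
import numpy as np

rng = np.random.default_rng(0)
delta = 1.0  # quantizer step size

def quantize(x):
    # mid-tread uniform quantizer with step delta
    return delta * np.round(x / delta)

n = 200_000
dithers = {
    "none": lambda: 0.0,
    "RPDF": lambda: rng.uniform(-delta / 2, delta / 2, n),
    "TPDF": lambda: (rng.uniform(-delta / 2, delta / 2, n)
                     + rng.uniform(-delta / 2, delta / 2, n)),
}

# Sweep a DC input across one quantizer step and measure the second
# moment of the total error e = Q(x + d) - x; signal-independence means
# the second moment should not change as x moves.
inputs = np.linspace(0.0, delta, 9)
spread = {}
for name, d in dithers.items():
    m2 = [np.mean((quantize(x + d()) - x) ** 2) for x in inputs]
    spread[name] = max(m2) - min(m2)
    print(f"{name}: second-moment spread across inputs = {spread[name]:.4f}")
```

The TPDF spread shrinks to sampling noise, while the undithered and RPDF cases retain a strong dependence on the input level.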
Title: New technics based on the wavelet transform for the restoration of old recordings
Authors: J. Valière, S. Montrésor, J. Allard, M. Baudry
DOI: 10.1109/ASPAA.1991.634140

Digital techniques used for the restoration of old recordings are presented in this paper. Three different flaws can be present in old recordings: harmonic distortion, impulsive noise, and background noise. Only the cancellation of the impulsive noise and the reduction of the background noise are considered. In order to cancel the impulsive noise, the corrupted samples are replaced by interpolated samples; an interpolator that uses the information located near the impulsive noise is required. In this paper, two different methods of interpolation are compared. For the reduction of the background noise, we have used a method worked out by Ephraim and Malah that does not create musical noise. In order to improve the filtering of transients, the signal is first decomposed into several frequency channels.

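The paper compares two interpolation methods that it does not name here. As a minimal sketch of the general idea, replacing click-corrupted samples by values interpolated from the surrounding good samples, here is a crude linear interpolator (real restorers exploit far more of the local signal structure):

```python
import numpy as np

def repair_clicks(x, bad_mask):
    # Replace samples flagged in bad_mask by interpolating from the
    # surrounding good samples.
    x = x.astype(float).copy()
    idx = np.arange(len(x))
    good = ~bad_mask
    x[bad_mask] = np.interp(idx[bad_mask], idx[good], x[good])
    return x

# A sine corrupted by an impulsive click over a few samples.
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 40)
bad = np.zeros(len(t), dtype=bool)
bad[100:104] = True
noisy = clean.copy()
noisy[bad] += 5.0  # the click

restored = repair_clicks(noisy, bad)
print("max error on repaired span:", np.max(np.abs(restored[bad] - clean[bad])))
```

Even this simple scheme removes the click; AR-model-based interpolators of the kind compared in the paper would also track the signal's local spectrum across the gap.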
Title: Digital Representation of Perceptual Criteria
Authors: J. Flanagan
DOI: 10.1109/ASPAA.1991.634087

Information signals are typically intended for human consumption. Human perception therefore contributes directly to fidelity criteria for digital representation. As computational capabilities increase and costs diminish, coding algorithms are able to incorporate more of the constraints that characterize perception. The incentive is still greater economy for digital transmission and storage. Sight and sound are the sensory modes favored by the human for information exchange. These modes are presently most central to human/machine communications and multimedia systems. The intricacies of visual and auditory perception are therefore figuring more prominently in signal coding. For example, taking account of the eye's sensitivity to quantizing noise as a function of temporal and spatial frequencies leads to good-quality coding of color motion images at fractions of a bit per pixel. Similarly, the characteristics of auditory masking, in both the time and frequency domains, provide leverage to identify signal components which are irrelevant to perception and which need not consume coding capacity. This discussion draws a perspective on recent coding advances and points up opportunities for increased sophistication in representing perceptually important factors. It also indicates relationships between economies gained by perceptual coding alone and those where source coding can trade on signal-specific characteristics to achieve further reductions in bit rate. It concludes with brief consideration of other sensory modalities, such as the tactile dimension, that might contribute to naturalness and ease of use in interactive multimedia information systems.

Title: Localization of virtual sound sources synthesized from model HRTFs
Authors: F. Wightman, D. Kistler
DOI: 10.1109/ASPAA.1991.634100

Published data from our laboratory and others suggest that under laboratory conditions human listeners localize virtual sound sources with nearly the same accuracy as they do real sources. The virtual sources in these experiments are digitally synthesized and presented to listeners over headphones. Synthesis of a given virtual source is based on free-field-to-eardrum acoustical transfer functions ("head-related" transfer functions, or HRTFs) that are measured from both ears of each individual listener. It follows that synthesis of a virtual auditory space of 265 source locations for each listener requires storage and processing of 530 complex, floating-point HRTFs. If each HRTF is represented by 256 complex spectral values, the total database consists of 271,360 floating-point numbers. Thus, while the perceptual data may argue for the viability of 3-dimensional auditory displays based on the virtual source techniques, the massive data storage and management requirements may impose some practical limitations.

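The storage figures in the abstract can be checked directly:

```python
# Storage bookkeeping from the abstract: 265 virtual source locations,
# two ears, and 256 complex spectral values per measured HRTF.
locations, ears, bins = 265, 2, 256

hrtfs = locations * ears              # transfer functions to store
complex_values = hrtfs * bins         # complex spectral samples
floats = 2 * complex_values           # real + imaginary parts
print(hrtfs, complex_values, floats)  # 530 135680 271360
```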
Title: Auditory Images As Input For Speech Recognition Systems
Authors: R. Patterson, J. Holdsworth, P. Thurston, T. Robinson
DOI: 10.1109/ASPAA.1991.634090

Over the past decade, hearing scientists have developed a number of time-domain models of the processing performed by the cochlea, in an effort to develop a reasonably accurate multi-channel representation of the pattern of neural activity flowing from the cochlea up the auditory nerve to the cochlear nucleus [1]. It is often assumed that peripheral auditory processing ends at the output of the cochlea and that the pattern of activity in the auditory nerve is in some sense what we hear. In reality, this neural activity pattern (NAP) is not a good representation of our auditory sensations, because it includes phase differences that we do not hear and it does not include auditory temporal integration (TI). As a result, several of the models have been extended to include periodicity-sensitive TI [2], [3], [4], which converts the fast-flowing neural activity pattern into a form that is much more like the auditory images we experience in response to sounds. When these models are applied to speech sounds, the auditory images of vowels reveal an elaborate formant structure that is absent in the more traditional representation of speech, the spectrogram. An example is presented on the left in the figure; it is the auditory image of the stationary part of the vowel /ae/ as in 'bab' [4]. The abscissa of the auditory image is 'temporal integration interval', and each line of the image shows the activity in one frequency channel of the auditory model. In general terms, activity on a vertical line in the auditory image shows that there is a correlation in the sound at that temporal interval. The concentrations of activity are the formants of the vowel.

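To give a rough flavor of the 'temporal integration interval' axis (a toy illustration only, not the authors' cochlear model or their periodicity-sensitive TI mechanism): the short-term autocorrelation of a periodic sound peaks at the interval where the sound repeats, which is the kind of correlation structure the auditory image displays per frequency channel.

```python
import numpy as np

fs = 16_000
t = np.arange(int(0.1 * fs)) / fs
# A crude periodic 'vowel-like' waveform: a 125 Hz harmonic stack
# (period 8 ms = 128 samples at this rate).
f0 = 125.0
x = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 20))

# Short-term autocorrelation of one frame: its first major peak marks
# the temporal integration interval at which the sound is self-similar.
frame = x[-1024:]
ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
search_from = int(0.004 * fs)  # skip the zero-lag peak
lag = np.argmax(ac[search_from:]) + search_from
print(f"dominant interval: {lag / fs * 1000:.2f} ms")
```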
Title: Speech Enhancement For Hearing Aids Using A Microphone Array
Authors: A. Ganeshkumar, J. Hammond, C. G. Rice
DOI: 10.1109/ASPAA.1991.634122

Our approach is based on enhancing the Short Time Spectral Amplitude (STSA) of degraded speech using the spectral subtraction algorithm. The use of spectral subtraction to enhance speech has been studied quite extensively in the past [1, 2]. These studies have generally shown an increase in speech quality, but the gain in intelligibility has been insignificant. The lack of improvement in intelligibility can be attributed to two main factors. The first is that, since all previous work on the application of the spectral subtraction algorithm has been confined to single-input systems, the noise short-time spectrum can only be estimated during non-speech activity periods. This approach not only requires accurate speech/non-speech activity detection (a difficult task, particularly at low signal-to-noise ratios) but also requires the noise to be sufficiently stationary for the estimate to be used during the subsequent speech period. The second factor is the annoying 'musical' type of residual noise introduced by spectral subtraction processing. This residual noise may distract the listener from the speech.

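For reference, the single-input baseline the abstract criticizes can be sketched in a few lines: magnitude spectral subtraction with the noise spectrum estimated from a noise-only stretch, the noisy phase reused, and a spectral floor that limits (but does not remove) the 'musical' residual. All signals and parameters below are invented for illustration, with a tone standing in for speech.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, frame = 8000, 256

def noise_profile(noise, frame):
    # Average magnitude spectrum of a noise-only recording.
    blocks = noise[: len(noise) // frame * frame].reshape(-1, frame)
    return np.abs(np.fft.rfft(blocks, axis=1)).mean(axis=0)

def spectral_subtract(noisy, noise_mag, frame, floor=0.02):
    # Subtract the noise magnitude estimate from each frame's short-time
    # spectrum, keep the noisy phase, and apply a spectral floor.
    out = np.zeros(len(noisy))
    for s in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[s:s + frame])
        mag = np.maximum(np.abs(spec) - noise_mag, floor * noise_mag)
        out[s:s + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out

n = 32 * frame
t = np.arange(n) / fs
speech = np.sin(2 * np.pi * 437.5 * t)   # tone standing in for speech
noise = 0.3 * rng.standard_normal(n)
noisy = speech + noise

# Noise spectrum estimated from a separate noise-only stretch, as a
# single-input system must do during non-speech activity.
noise_mag = noise_profile(0.3 * rng.standard_normal(n), frame)
enhanced = spectral_subtract(noisy, noise_mag, frame)

snr = lambda x: 10 * np.log10(np.mean(speech**2) / np.mean((x - speech)**2))
print(f"SNR: {snr(noisy):.1f} dB -> {snr(enhanced):.1f} dB")
```

The SNR rises, but listening to the residual would reveal the tonal 'musical noise' artifacts; the paper's microphone-array approach targets exactly the two weaknesses this baseline exhibits.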
Title: The "ARMAdillo" Coefficient Encoding Scheme for Digital Audio Filters
Authors: D. Rossum
DOI: 10.1109/ASPAA.1991.634131

In the design of VLSI circuits to implement digital filters for electronic music purposes, we have found it useful to encode the filter coefficients. Such encoding offers three advantages. First, the encoding can be made to correspond more properly to the "natural" perceptual units of audio. While these are most accurately the "bark" for frequency and the "sone" for loudness, a good working approximation is decibels and musical octaves, respectively. Secondly, our encoding scheme allows for partial decoupling of the pole radius and angle, providing superior interpolation characteristics when the coefficients are dynamically swept. Thirdly, and perhaps most importantly, appropriate encoding of the coefficients can save substantial amounts of on-chip memory. While audio filter coefficients typically require twenty or more bits, we have found adequate coverage at as few as eight bits, allowing for a much more cost-effective custom hardware implementation when many coefficients are required. We have named the resulting patented encoding scheme "ARMAdillo". Our implementation of digital audio filters is based on the canonical second-order section whose transfer function should be familiar to all:

    H(z) = a0 (1 + a1 z^-1 + a2 z^-2) / (1 + b1 z^-1 + b2 z^-2)    [1]

While dealing with poles and feedback (bn) coefficients, the comments herein apply as well to zeroes and feedforward coefficients (an/a0) when the gain (a0) is separated as shown above. Noting that the height of a resonant peak in the magnitude response produced by a pole is approximately inversely proportional to the distance from the pole to the unit circle, we can relate the height p of this resonant peak in dB to the pole radius R through the factor 1/(1 - R).

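The stated inverse proportionality between peak height and pole distance 1 - R is easy to confirm numerically. The sketch below (an illustration, using only the feedback half of the second-order section) places a pole pair at radius R and angle ±θ and measures the peak gain:

```python
import numpy as np

def peak_gain(R, theta, n=1 << 16):
    # Peak magnitude of 1 / (1 + b1 z^-1 + b2 z^-2) with a complex pole
    # pair at radius R, angles +/- theta: b1 = -2 R cos(theta), b2 = R^2.
    w = np.linspace(0.0, np.pi, n)
    z = np.exp(1j * w)
    H = 1.0 / (1.0 - 2.0 * R * np.cos(theta) * z**-1 + (R * R) * z**-2)
    return np.abs(H).max()

theta = np.pi / 4
for R in (0.9, 0.99, 0.999):
    g = peak_gain(R, theta)
    print(f"R={R}: peak = {g:8.1f}, peak * (1-R) = {g * (1 - R):.3f}")
```

Each tenfold shrink of 1 - R raises the peak about tenfold (i.e. about 20 dB), so `peak * (1-R)` stays nearly constant, which is why encoding R logarithmically in its distance from the unit circle maps naturally onto decibels of resonance gain.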
Title: Adaptive Noise Cancellation for Hearing-Aid Application
Authors: H. Levitt, T. Schwander, M. Weiss
DOI: 10.1109/ASPAA.1991.634119

Noise reduction systems using two or more microphones are generally more effective than single-microphone systems. Under ideal conditions, an adaptive two-microphone system with one microphone placed at the noise source can achieve perfect cancellation. For hearing-aid applications it is not usually practical to place a microphone at or near the noise source. It is possible, however, to mount both microphones on the head, with a directional microphone facing the noise source and an omnidirectional microphone picking up speech plus noise. In practice, there is continual movement of the head relative to the speech and noise sources, which may adversely affect the adaptive cancellation algorithm. Another practical problem is that of room reverberation. A head-mounted two-microphone adaptive noise cancellation system was evaluated experimentally in an anechoic chamber and in rooms with reverberation times of up to 0.6 seconds. Significant improvements in speech intelligibility were obtained with both normal-hearing and hearing-impaired listeners.

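A minimal two-input canceller of the kind the abstract evaluates can be sketched with a classic LMS adaptive filter. Everything below (signals, noise path, filter length, step size) is invented for illustration, and the reference microphone is idealized as picking up noise only; the head movement and reverberation the abstract discusses are precisely what breaks that idealization in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
speech = np.sin(2 * np.pi * 0.03 * np.arange(n))   # stand-in for speech
noise = rng.standard_normal(n)                     # directional mic: noise only
h = np.array([0.6, -0.3, 0.15])                    # acoustic path, noise -> omni mic
primary = speech + np.convolve(noise, h)[:n]       # omni mic: speech + filtered noise

# LMS adaptive canceller: model the noise path from the reference input
# and subtract the estimate; the error signal is the cleaned output.
taps, mu = 8, 0.005
w = np.zeros(taps)
out = np.zeros(n)
for i in range(taps - 1, n):
    x = noise[i - taps + 1:i + 1][::-1]   # most recent reference samples
    e = primary[i] - w @ x
    w += 2 * mu * e * x
    out[i] = e

tail = slice(n // 2, None)
before = np.mean((primary[tail] - speech[tail]) ** 2)
after = np.mean((out[tail] - speech[tail]) ** 2)
print(f"residual noise power: {before:.3f} -> {after:.4f}")
```

Because the speech is uncorrelated with the reference input, the filter converges toward the noise path h and the speech passes through the error output largely untouched.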