Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625616
W. Putnam, J. Smith
Fractional sample delay (FD) filters are useful and necessary in many applications, such as the accurate steering of acoustic arrays, delay lines for physical models of musical instruments, and time delay estimation. This paper addresses the design of finite impulse response (FIR) FD filters. The problem is posed as a convex optimization problem in which the maximum modulus of the complex error is minimized. Several design examples are presented, along with an empirical formula for the filter order required to meet a given worst case group delay error specification.
Design of fractional delay filters using convex optimization. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625616
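The error criterion named in the fractional delay abstract above, the maximum modulus of the complex error between the filter response and the ideal delay, can be made concrete with a small sketch. The taps below come from Lagrange interpolation, a classic closed-form FD design used here only as a stand-in; it is not the convex (minimax) design the paper proposes. The function names, the grid size, and the band edge are assumptions chosen for illustration.

```python
import cmath
import math


def lagrange_fd(order, delay):
    """FIR taps of a Lagrange-interpolation fractional delay filter.

    h[k] = prod over m != k of (delay - m) / (k - m), for k = 0..order.
    """
    taps = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        taps.append(h)
    return taps


def max_complex_error(taps, delay, band_edge, grid=512):
    """Largest |H(e^{jw}) - e^{-jw*delay}| on a uniform grid over [0, band_edge]."""
    worst = 0.0
    for i in range(grid + 1):
        w = band_edge * i / grid
        response = sum(h * cmath.exp(-1j * w * k) for k, h in enumerate(taps))
        ideal = cmath.exp(-1j * w * delay)
        worst = max(worst, abs(response - ideal))
    return worst


# Compare a first-order and a third-order design over the lower half band.
half_band = 0.5 * math.pi
err_order1 = max_complex_error(lagrange_fd(1, 0.4), 0.4, half_band)
err_order3 = max_complex_error(lagrange_fd(3, 1.4), 1.4, half_band)
```

A minimax design would instead choose the taps to minimize the quantity returned by `max_complex_error` directly, which is the convex problem the abstract describes.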
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625617
G. Borin, Giovanni De Poli, D. Rocchesso
Nonlinear acoustic systems are often described by means of nonlinear maps which act as instantaneous constraints on the solutions of a system of linear differential equations. This description leads to discrete-time models exhibiting non-computable loops. This paper presents a solution to this computability problem by means of geometrical transformation of the nonlinearities and algebraic transformation of the time-dependent equations. The proposed method leads to stable and accurate simulations even at relatively low sampling rates.
Elimination of delay-free loops in discrete-time models of nonlinear acoustic systems. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625617
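To see why a delay-free loop is "non-computable", consider a toy implicit relation y = f(p + K*y), where p collects terms known from past samples and y appears on both sides within the same time step. The sketch below is an illustration under assumed choices (f = tanh, a hypothetical feedback gain K <= 0, Newton iteration); it is not the geometrical transformation the authors propose. It only shows that the implicit relation can be turned into an explicit map g(p), which could then be tabulated ahead of time.

```python
import math


def solve_implicit(p, k, iterations=60):
    """Solve y = tanh(p + k*y) for y with Newton's method (assumes k <= 0)."""
    y = 0.0
    for _ in range(iterations):
        t = math.tanh(p + k * y)
        residual = y - t
        # Derivative of the residual with respect to y; it is >= 1 when k <= 0,
        # so the root is unique and the iteration is well behaved.
        slope = 1.0 - k * (1.0 - t * t)
        y -= residual / slope
    return y


def build_table(k, p_min=-4.0, p_max=4.0, size=257):
    """Tabulate the explicit map p -> y that replaces the implicit loop."""
    step = (p_max - p_min) / (size - 1)
    return [(p_min + i * step, solve_implicit(p_min + i * step, k))
            for i in range(size)]


table = build_table(k=-0.5)
```

At run time a simulation would look up (or interpolate) `table` instead of iterating, so each output sample depends only on already-known quantities.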
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625608
I. Shmulevich, E. Coyle
We develop a method for establishing tonal contexts of musical patterns in a musical composition. This is subsequently incorporated into a system for recognition of musical patterns. Krumhansl's (1990) key-finding algorithm is used as a basis. The sequence of maximum correlations that it outputs is smoothed with a cubic spline and is used to determine weights for perceptual and absolute pitch errors. Statistically significant maximum correlations are used to create the assigned key sequence, which is then median filtered to improve the structure of the output of the key finding algorithm.
Establishing the tonal context for musical pattern recognition. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625608
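The two building blocks mentioned in the tonal-context abstract above, correlation-based key finding and median filtering of the resulting key sequence, can be sketched as follows. The profile values are the commonly cited Krumhansl-Kessler probe-tone ratings; the key numbering (0 to 11 major, 12 to 23 minor), the function names, and the edge handling of the median filter are assumptions. The cubic-spline smoothing and the significance test described in the abstract are not shown.

```python
import math

# Krumhansl-Kessler probe-tone profiles, indexed by semitones above the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def find_key(durations):
    """Return (maximum correlation, key index) for a 12-bin pitch-class histogram."""
    best = (-2.0, -1)
    for tonic in range(12):
        for offset, profile in ((0, MAJOR), (12, MINOR)):
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(durations, rotated)
            if r > best[0]:
                best = (r, tonic + offset)
    return best


def key_name(index):
    return NAMES[index % 12] + (" major" if index < 12 else " minor")


def median_filter(keys, width=3):
    """Sliding median over a key-index sequence, replicating the end values."""
    half = width // 2
    padded = [keys[0]] * half + list(keys) + [keys[-1]] * half
    return [sorted(padded[i:i + width])[half] for i in range(len(keys))]


# Equal time on each note of the C major scale.
correlation, key = find_key([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1])
```

Running `find_key` on successive windows of a piece yields the sequence of maximum correlations and keys that the abstract then smooths and filters.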
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625615
M. Karjalainen, A. Harma, U. Laine, J. Huopaniemi
An inherent property of many DSP algorithms is that they tend to exhibit uniform frequency resolution from zero to the Nyquist frequency. This is a direct consequence of using unit delays as building blocks; a frequency independent delay implies uniform frequency resolution. In audio applications, however, this is often an undesirable feature since the response properties are typically specified and measured on a logarithmic scale, following the behavior of the human auditory system. We present an overview of warped filters and DSP techniques which can be designed to better match the audio and auditory criteria. Audio applications, including modeling of auditory and musical phenomena, equalization techniques, auralization, and audio coding, are presented.
Warped filters and their audio applications. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625615
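The warped-filter abstract above attributes uniform frequency resolution to the unit delay. A standard way to obtain non-uniform resolution is to replace each unit delay with a first-order allpass section D(z) = (z^-1 - lam) / (1 - lam z^-1), whose frequency-dependent phase delay bends the frequency axis. The sketch below shows that mapping and one allpass section; the parameter name `lam` and the value 0.7 are illustrative assumptions, not values taken from the paper.

```python
import math


def warped_frequency(omega, lam):
    """Negative phase of D(e^{j*omega}): the warped frequency, in radians."""
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))


def allpass_step(x, state, lam):
    """One output sample of D(z); state is (previous input, previous output)."""
    prev_x, prev_y = state
    y = -lam * x + prev_x + lam * prev_y
    return y, (x, y)


def allpass_impulse_response(lam, length):
    """Impulse response of a single allpass section."""
    state = (0.0, 0.0)
    out = []
    for n in range(length):
        y, state = allpass_step(1.0 if n == 0 else 0.0, state, lam)
        out.append(y)
    return out


# With lam > 0, low frequencies are stretched and high frequencies compressed.
lam = 0.7
stretched = warped_frequency(0.1, lam)
```

Because the section is allpass, swapping it in for every unit delay changes where the filter's resolution is spent without changing signal energy through any single section.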
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625634
J. Benesty, D. Morgan, M. Sondhi
In many applications, such as teleconferencing, multimedia workstations, televideo games, etc., stereo sound is already, or will soon be, implemented to give spatial realism that mono systems cannot offer. In such hands-free systems, stereophonic acoustic echo cancelers are absolutely necessary for full-duplex communication. We propose a new acoustic echo canceler (AEC) based on a fundamental experimental observation that the stereo effect is due mostly to sound energy below about 1000 Hz. The principle of the hybrid mono/stereo AEC is to use stereophonic sound with a stereo AEC at low frequencies (e.g., below 1000 Hz) and monophonic sound with a conventional mono AEC at higher frequencies (e.g., above 1000 Hz). This solution is a good compromise between the complexity of a full-band stereo AEC and spatial realism. For the stereo case we borrow from a previous innovation and add a small nonlinearity into each channel in order to accurately identify the two receiving room impulse responses.
A hybrid mono/stereo acoustic echo canceler. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625634
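The signal routing behind the hybrid idea above (stereo below roughly 1000 Hz, mono above) can be sketched with a simple complementary band split. This shows only the routing, under strong assumptions: a one-pole lowpass stands in for whatever crossover a real system would use, the 16 kHz sample rate is arbitrary, and neither the adaptive echo cancelers nor the small per-channel nonlinearity are modeled.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """First-order recursive lowpass (a deliberately gentle crossover)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in signal:
        state = (1.0 - a) * x + a * state
        out.append(state)
    return out


def hybrid_mono_stereo(left, right, cutoff_hz=1000.0, sample_rate=16000.0):
    """Keep each channel's own low band; share one mono high band."""
    low_l = one_pole_lowpass(left, cutoff_hz, sample_rate)
    low_r = one_pole_lowpass(right, cutoff_hz, sample_rate)
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    low_m = one_pole_lowpass(mono, cutoff_hz, sample_rate)
    high_m = [m - lm for m, lm in zip(mono, low_m)]  # complementary high band
    out_l = [ll + hm for ll, hm in zip(low_l, high_m)]
    out_r = [lr + hm for lr, hm in zip(low_r, high_m)]
    return out_l, out_r


# A short two-tone test signal, identical in both channels.
tone = [math.sin(0.05 * n) + 0.5 * math.sin(1.3 * n) for n in range(400)]
same_l, same_r = hybrid_mono_stereo(tone, tone)
```

In the scheme the abstract describes, the low band would feed a stereo echo canceler and the shared high band a conventional mono one, which is where the complexity saving comes from.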
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625638
P.G. Georgiou, C. Kyriakakis, P. Tsakalides
This paper addresses the problem of robust localization of a sound source in a wide range of operating environments. We use fractional lower order statistics in the frequency domain of two-sensor measurements to accurately locate the source in impulsive noise. We demonstrate a significant improvement in detection via simulation experiments of a sound source in α-stable noise. Applications of this technique include the efficient steering of a microphone array in teleconference applications.
Robust time delay estimation for sound source localization in noisy environments. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625638
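The core idea of fractional lower order statistics is to replace products of raw samples with products of signed fractional powers, |x|^p * sign(x) with p < 1, so that large impulsive outliers contribute far less to the estimate. The sketch below applies this in the time domain with a circular cross-correlation; this is a simplification chosen for brevity, since the paper works in the frequency domain. The exponent p = 0.5, the function names, and the synthetic test signal are all assumptions.

```python
import math
import random


def signed_power(x, p):
    """|x|**p with the sign of x; compresses large-magnitude samples when p < 1."""
    return math.copysign(abs(x) ** p, x)


def flos_delay(x1, x2, max_lag, p=0.5):
    """Lag (in samples) maximizing the circular signed-power cross-correlation."""
    n = len(x1)
    a = [signed_power(v, p) for v in x1]
    b = [signed_power(v, p) for v in x2]
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = sum(a[i] * b[(i + lag) % n] for i in range(n))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Synthetic two-sensor data: the second sensor sees a circularly delayed copy.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(256)]
true_delay = 5
delayed = [source[(i - true_delay) % len(source)] for i in range(len(source))]
estimate = flos_delay(source, delayed, max_lag=10)
```

With p = 1 the function reduces to an ordinary cross-correlation, which makes it easy to compare the two estimators on the same data when impulsive noise is added.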
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625627
J. K. Bates
Auditory scene analysis and its older cousin, the Haas (precedence) effect, both involve the same acoustic and auditory phenomena. In each case it is necessary to explain the ear's ability both to hear and pay attention to sources within a background of reverberations. Thus, a successful model of the Haas effect should be capable of being extended to computational auditory scene analysis (CASA) applications. We present a model based on a vector space of elementary meaning that is somewhat similar to Divenyi's (1995) "three cardinal dimensions". Test results replicating the Haas effect demonstrate the ability to select and track, without foreknowledge, the azimuth direction of arrival of acoustic sources in a background of reverberations and environmental noise.
Modeling the Haas effect: a first step for solving the CASA problem. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625627
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625596
Phillip Brown, R. Duda
A simple model is presented for synthesizing binaural sound from a monaural source. The model produces vertical as well as horizontal and externalization effects. The simplicity of the model permits efficient implementation, allowing for real-time multisource operation. Additionally, the parameters in the model can be adjusted to fit a particular individual's characteristics.
An efficient HRTF model for 3-D sound. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625596
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625580
E. Lindemann
The typical multiband audio compressor (TMC), such as that used in many modern hearing aids, consists of a bandpass filter bank coupled to a compression circuit which applies gain to each frequency band as a function of power in that band. Generally the filter bank is designed so that the sum of magnitude responses of the filters is unity with the band edges as steep as the implementation will allow, to minimize overlap between bands. There are a number of problems with this approach. Difficult decisions must be made regarding placement of band edges. While the composite response for broadband signals may be flat, the narrow-band (e.g., swept-sine) response exhibits bumps near the band edges. In other words, the system is not shift-invariant with respect to frequency. We show that these problems can be eliminated by increasing the number of bands, and by extending the overlap region between bands. The problem is examined in terms of frequency domain sampling of the power spectrum. If the sampling rate is sufficiently high then artifacts disappear, and the system can be viewed as continuous in frequency with no band edges.
The continuous frequency dynamic range compressor. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625580
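The compressor structure described above, per-band gain computed from per-band power, can be sketched with overlapping band weights that sum to one at every frequency bin. This is an illustrative sketch only: the raised-cosine band shape, the threshold and ratio values, and the function names are assumptions, and the time-domain dynamics (attack and release) of a real compressor are not modeled.

```python
import math


def band_weights(num_bins, num_bands):
    """Overlapping raised-cosine weights; each bin's weights sum to one."""
    spacing = (num_bins - 1) / (num_bands - 1)
    weights = []
    for b in range(num_bands):
        center = b * spacing
        row = []
        for k in range(num_bins):
            d = abs(k - center) / spacing
            row.append(0.5 * (1.0 + math.cos(math.pi * d)) if d < 1.0 else 0.0)
        weights.append(row)
    return weights


def compression_gain_db(level_db, threshold_db=-40.0, ratio=3.0):
    """Static compression curve: no gain change below threshold."""
    over = level_db - threshold_db
    return 0.0 if over <= 0.0 else over * (1.0 / ratio - 1.0)


def gain_curve_db(power_spectrum, num_bands):
    """Gain per frequency bin, blended smoothly from the per-band gains."""
    weights = band_weights(len(power_spectrum), num_bands)
    curve = [0.0] * len(power_spectrum)
    for row in weights:
        power = sum(w * p for w, p in zip(row, power_spectrum))
        level_db = 10.0 * math.log10(max(power, 1e-12))
        gain = compression_gain_db(level_db)
        for k, w in enumerate(row):
            curve[k] += w * gain
    return curve


# A single strong narrow-band component in an otherwise quiet spectrum.
spectrum = [1e-9] * 65
spectrum[20] = 1.0
curve = gain_curve_db(spectrum, num_bands=9)
```

Raising `num_bands` while keeping the overlap samples the power spectrum more densely, which is the direction the abstract argues removes band-edge artifacts.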
Pub Date: 1997-10-19 | DOI: 10.1109/ASPAA.1997.625621
D. Mapes-Riordan, W. Yost
The Zwicker (1977, 1990) loudness model is a standard for predicting the loudness of a sound. This model, along with Moore and Glasberg's (see Acustica, vol.82, p.335-45, 1996) revision of it, is fairly accurate at predicting the loudness of steady-state sounds, but falls short for many types of temporally varying sounds. One temporal effect not accounted for in the Zwicker model is loudness recalibration. Loudness recalibration is a fatigue-like effect that makes a quiet tone at one frequency even quieter when it is preceded by a louder tone at the same frequency. The evidence suggests that loudness recalibration occurs in the central nervous system. Two means of modeling loudness recalibration are proposed. The first is an algorithmic description of the recalibration effect that could be added to the later stages of the Zwicker model. The other method uses a neural network and is based on a spike-train timing theory of hearing rather than a rate-place theory as assumed by the Zwicker model. This spike-train timing approach is unique in that spike-train averaging is postponed until a final loudness estimate is made. A more complete and accurate model of loudness recalibration will have to wait until more experimental data is collected.
Towards a model of loudness recalibration. Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. https://doi.org/10.1109/ASPAA.1997.625621