Robust perceptual assessment of end-to-end audio quality
Rix, R. Reynolds, M. Hollier
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810844
Perceptual quality assessment models were initially developed to predict the subjective quality of codecs. Experience with telephony applications has shown that today's complex networks make assessment difficult. Analogue interfaces and variable delay are among the characteristics of current voice transmission systems, and these often cause the first generation of perceptual models to produce inaccurate scores. It has therefore been necessary to extend perceptual models for use in whole-network and field applications. In this paper we describe the perceptual analysis/measurement system (PAMS), a model specifically designed for end-to-end assessment of complex telephone networks. We focus on the method used to estimate subjective quality, applying constraints to make the predictions robust and to enhance the model's generality.
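The abstract does not spell out the preprocessing steps, but any end-to-end model facing variable delay has to time-align the reference and degraded signals before comparing them. The sketch below is only a generic illustration of that idea (a single bulk delay estimated by cross-correlation, with names chosen by me); it is not the alignment procedure used in PAMS.

```python
# Illustrative sketch, not the PAMS algorithm: estimate and remove a single
# bulk delay between reference and degraded speech before any perceptual
# comparison. Assumes both inputs are 1-D NumPy arrays at the same rate.
import numpy as np

def align_by_cross_correlation(reference, degraded):
    """Return the degraded signal shifted to best match the reference, plus the lag."""
    xcorr = np.correlate(degraded, reference, mode="full")
    lag = int(np.argmax(xcorr)) - (len(reference) - 1)
    if lag > 0:                      # degraded lags the reference: drop its first samples
        aligned = degraded[lag:]
    else:                            # degraded leads the reference: zero-pad its start
        aligned = np.concatenate([np.zeros(-lag), degraded])
    n = len(reference)               # trim or zero-pad to the reference length
    if len(aligned) < n:
        aligned = np.concatenate([aligned, np.zeros(n - len(aligned))])
    return aligned[:n], lag
```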
{"title":"Robust perceptual assessment of end-to-end audio quality","authors":"Rix, R. Reynolds, M. Hollier","doi":"10.1109/ASPAA.1999.810844","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810844","url":null,"abstract":"Perceptual quality assessment models were initially developed to predict subjective quality of codecs. Experience with telephony applications has found that today's complex networks make assessment difficult. Analogue interfaces and variable delay are amongst the technologies used in current voice transmission systems-and often make the first generation of perceptual models produce inaccurate scores. It has been necessary to extend perceptual models for use in whole network and field applications. In this paper we describe the perceptual analysis/measurement system (PAMS), a model specifically designed for end-to-end assessment of complex telephone networks. We focus on the method used to estimate subjective quality, applying constraints to make robust predictions and enhance the model's generality.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130392359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Noise reduction in audio signals based on the perceptual coding approach
A. Czyżewski, R. Królikowski
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810871
A new concept for reducing the noise affecting audio signals transmitted over telecommunication channels is proposed. The concept exploits features of the human auditory system. A strong subjective effect of noise suppression in noisy audio can be obtained either by raising the masking thresholds above the estimated level of the noisy components or by reducing this level so that the components remain just below the masking thresholds. The foundations of the engineered method are described together with the corresponding algorithms. A discussion of the experimental results and some conclusions are also included. The main focus is on the perceptual foundations of the noise reduction method.
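As a rough illustration of the second strategy (pushing noisy components just below the masking threshold), here is a single-frame spectral sketch. The masking threshold used here is a crude placeholder (a smoothed, offset copy of the signal spectrum) standing in for the proper psychoacoustic model the paper relies on; the function name and parameters are mine.

```python
# Minimal single-frame sketch with a placeholder masking model; overlap-add
# framing and a real psychoacoustic threshold are deliberately omitted.
import numpy as np

def suppress_below_masking(frame, noise_psd, offset_db=6.0):
    """frame: 1-D array; noise_psd: per-bin noise power, same length as the rfft output."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    power = np.abs(spectrum) ** 2

    # Placeholder masking threshold: smoothed signal power minus a fixed offset.
    kernel = np.hanning(9)
    kernel /= kernel.sum()
    masking = np.convolve(power, kernel, mode="same") * 10 ** (-offset_db / 10)

    # Per-bin gain that scales the estimated noise down to the threshold
    # (never amplifying), so the residual noise sits just below the mask.
    gain = np.sqrt(np.minimum(1.0, masking / np.maximum(noise_psd, 1e-12)))
    return np.fft.irfft(spectrum * gain, n=len(frame))
```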
{"title":"Noise reduction in audio signals based on the perceptual coding approach","authors":"A. Czyżewski, R. Królikowski","doi":"10.1109/ASPAA.1999.810871","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810871","url":null,"abstract":"A new concept for the reduction of noise affecting audio signals transmitted in telecommunication channels is proposed. This concept is exploiting some features of the human auditory system. A strong subjective effect of noise suppression in noisy audio can be obtained by uplifting the masking thresholds above the estimated level of the noisy components or by reducing this level in such a way that the components be maintained just below the masking thresholds. The foundations of the engineered method together with the appropriate algorithms are described. A discussion on the results of experiments carried out and some conclusions are also included. The main focus is put on the perceptual foundations of the noise reduction method.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123665083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An alternative implementation of the superdirective beamformer
Joerg Bitzer, K. Kammeyer, K. U. Simmer
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810836
We introduce a new implementation of superdirective beamformers. The new structure reduces computational complexity by means of a GSC-like (generalized sidelobe canceller) scheme. Unlike the conventional GSC, the filters in the sidelobe-cancelling path are fixed and can be computed in advance using the Wiener solution. The new structure yields exactly the same noise reduction performance as the superdirective beamformer.
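For reference, the classical superdirective beamformer that the GSC-like structure is equivalent to can be written as an MVDR solution against a diffuse noise field. The sketch below computes those narrowband weights; the far-field steering assumption and the diagonal loading value are my own choices, and the paper's fixed-Wiener-filter GSC decomposition itself is not reproduced here.

```python
# Classical superdirective (MVDR-against-diffuse-noise) weights at one frequency.
# Diffuse-noise coherence between sensors spaced r apart is sinc(2 f r / c),
# using NumPy's normalised sinc(x) = sin(pi x) / (pi x).
import numpy as np

def superdirective_weights(freq, positions, look_dir, c=343.0, loading=1e-2):
    """positions: (M, 3) sensor coordinates in metres; look_dir: unit 3-vector."""
    M = len(positions)
    delays = positions @ look_dir / c                        # far-field delays
    d = np.exp(-2j * np.pi * freq * delays)                  # steering vector

    r = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    gamma = np.sinc(2 * freq * r / c) + loading * np.eye(M)  # loaded coherence matrix

    gd = np.linalg.solve(gamma, d)
    return gd / (d.conj() @ gd)                              # distortionless in look_dir
```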
{"title":"An alternative implementation of the superdirective beamformer","authors":"Joerg Bitzer, K. Kammeyer, K. U. Simmer","doi":"10.1109/ASPAA.1999.810836","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810836","url":null,"abstract":"We introduce a new implementation of superdirective beamformers. The new structure has the advantage of reduced computational complexity. This advantage is due to a GSC-like (generalized sidelobe canceller) scheme. Unlike the conventional GSC, the filters in the sidelobe cancelling path are fixed and can be computed in advance by using the Wiener solution. The new structure yields exactly the same noise reduction performance as the superdirective beamformer does.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"829 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126235209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advances in parametric audio coding
H. Purnhagen
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810842
Parametric modelling provides an efficient representation of general audio signals and is utilised in very low bit rate audio coding. It is based on the decomposition of an audio signal into components that are described by appropriate source models and represented by model parameters. Perception models are utilised in the signal decomposition and in model parameter coding. This paper gives a brief tutorial overview of parametric audio coding and describes the parametric coder currently being developed in the MPEG-4 audio standardisation. Recent advances as well as novel approaches in this field are presented.
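To make the decomposition idea concrete, the toy sketch below represents one frame by a handful of sinusoidal components (frequency, amplitude, phase) and resynthesises it from those parameters alone. This covers only one simple source model; the MPEG-4 parametric coder combines several source models and codes the parameters with perceptual criteria, none of which is shown here.

```python
# Toy sinusoidal analysis/synthesis: a frame becomes a short parameter list.
import numpy as np

def analyze_frame(frame, fs, n_partials=10):
    window = np.hanning(len(frame))
    spectrum = np.fft.rfft(frame * window)
    mags = np.abs(spectrum)
    # Crude peak picking: keep the n_partials largest local maxima.
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] > mags[k + 1]]
    peaks = sorted(peaks, key=lambda k: mags[k], reverse=True)[:n_partials]
    freqs = np.array(peaks) * fs / len(frame)
    amps = 2 * mags[peaks] / window.sum()        # undo the analysis window gain
    phases = np.angle(spectrum[peaks])
    return freqs, amps, phases

def synthesize_frame(freqs, amps, phases, n, fs):
    t = np.arange(n) / fs
    return sum(a * np.cos(2 * np.pi * f * t + p)
               for f, a, p in zip(freqs, amps, phases))
```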
{"title":"Advances in parametric audio coding","authors":"H. Purnhagen","doi":"10.1109/ASPAA.1999.810842","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810842","url":null,"abstract":"Parametric modelling provides an efficient representation of general audio signals and is utilised in very low bit rate audio coding. It is based on the decomposition of an audio signal into components which are described by appropriate source models and represented by model parameters. Perception models are utilised in signal decomposition and model parameter coding. This paper gives a brief tutorial overview of parametric audio coding and describes the parametric coder currently developed in the MPEG-4 audio standardisation. Recent advances as well as novel approaches in this field are presented.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115617408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Grid-based beamformer design for room-environment microphone arrays
D. Ward, M. Brandstein
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810840
A new method is presented for speech acquisition in a room environment using a microphone array. The technique involves dividing the room into several regions (called grids) and classifying each grid according to the acoustic source located within it. Based on this classification, array weights are computed that pass signals from the chosen source grid while minimizing the response to so-called interference grids (grids that contain either interfering sources or strong reflections). Simulation results are presented to demonstrate the effectiveness of the proposed technique.
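A minimal narrowband sketch of the pass-one-grid, suppress-the-others idea is given below. It uses far-field steering vectors towards grid centres as a stand-in for the room propagation model, which is my simplification; the constraint structure (unit response to the source grid, minimised response to interference grids, plus regularisation) is the part meant to illustrate the abstract.

```python
# Hedged sketch: distortionless response towards the source-grid centre while
# minimising the response towards interference-grid centres, at one frequency.
import numpy as np

def grid_beamformer(freq, positions, source_dir, interference_dirs, c=343.0, reg=1e-3):
    """positions: (M, 3) microphone coordinates; directions: unit 3-vectors."""
    def steering(direction):
        return np.exp(-2j * np.pi * freq * (positions @ direction) / c)

    d = steering(source_dir)
    # "Noise" covariance assembled from the interference-grid steering vectors.
    R = sum(np.outer(steering(u), steering(u).conj()) for u in interference_dirs)
    R = R + reg * np.eye(len(positions))

    Rd = np.linalg.solve(R, d)
    return Rd / (d.conj() @ Rd)                  # unit gain towards the source grid
```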
{"title":"Grid-based beamformer design for room-environment microphone arrays","authors":"D. Ward, M. Brandstein","doi":"10.1109/ASPAA.1999.810840","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810840","url":null,"abstract":"A new method is presented for speech acquisition in a room-environment using a microphone array. The technique involves dividing the room into several regions (called grids), and classifying each grid according to the acoustic source located within it. Based on this classification, array weights are found to pass signals from the chosen source grid while minimizing the response to so-called interference grids (that contain either interfering sources or strong reflections). Simulation results are presented to demonstrate the effectiveness of the proposed technique.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114537847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Digital sound synthesis based on transfer function models
L. Trautmann, R. Rabenstein
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810855
Various methods for sound synthesis based on physical models have been presented. They start from a continuous model of the vibrating body, given by partial differential equations (PDEs), and employ a proper discretization in time and space. Examples are waveguide models and finite difference models. A different approach is presented here. It is based on a multidimensional transfer function model derived by suitable functional transformations in time and space. Physical effects modeled by the PDE, such as longitudinal and transversal oscillations, loss, and dispersion, are treated exactly by this method. Moreover, the transfer function models explicitly take initial and boundary conditions, as well as excitation functions, into account. The discretization based on analog-to-discrete transformations preserves not only the inherent physical stability but also the natural frequencies of the oscillating body. The resulting algorithms are suitable for real-time implementation on digital signal processors. This paper demonstrates the new method on the linear example of a transversally oscillating tightened string with frequency-dependent loss terms.
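The core of the functional-transformation view is that the spatial transform turns the string PDE into a bank of independent decaying resonators, one per mode, which are then discretised so the natural frequencies survive. The sketch below is only a caricature of that: the ideal-string modal frequencies and the per-mode decay constants are placeholders I chose, not the values the paper derives from its PDE with frequency-dependent loss.

```python
# Modal caricature of the transfer-function model for a fixed-fixed string:
# each mode is a decaying sinusoid whose natural frequency is kept exactly.
import numpy as np

def string_modes(length, wave_speed, n_modes, decay_per_mode):
    k = np.arange(1, n_modes + 1)
    freqs = k * wave_speed / (2 * length)        # ideal-string natural frequencies
    sigmas = decay_per_mode * k                  # placeholder frequency-dependent loss
    return freqs, sigmas

def synthesize(freqs, sigmas, pickup_gains, fs, duration):
    t = np.arange(int(fs * duration)) / fs
    out = np.zeros(len(t))
    for f, s, g in zip(freqs, sigmas, pickup_gains):
        # Impulse-invariant view of one mode: exp(-sigma t) * sin(2 pi f t).
        out += g * np.exp(-s * t) * np.sin(2 * np.pi * f * t)
    return out
```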
{"title":"Digital sound synthesis based on transfer function models","authors":"L. Trautmann, R. Rabenstein","doi":"10.1109/ASPAA.1999.810855","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810855","url":null,"abstract":"Various methods for sound synthesis based on physical models have been presented. They start from a continuous model for the vibrating body, given by partial differential equations (PDEs), and employ proper discretization in time and space. Examples are waveguide models or finite difference models. A different approach is presented here. It is based on a multidimensional transfer function model derived by suitable functional transformations in time and space. Physical effects modeled by the PDE like longitudinal and transversal oscillations, loss and dispersion are treated with this method in an exact fashion. Moreover, the transfer function models explicitly take initial and boundary conditions, as well as excitation functions into account. The discretization based on analog-to-discrete transformations preserves not only the inherent physical stability, but also the natural frequencies of the oscillating body. The resulting algorithms are suitable for real-time implementation on digital signal processors. This paper shows the new method on the linear example of a transversal oscillating tightened string with frequency dependent loss terms.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126175382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Common pole equalization of small rooms using a two-step real-time digital equalizer
F. Fontana, Luca Gibin, D. Rocchesso, O. Ballan
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810883
Small enclosures are characterized by peculiar acoustical properties, which sometimes need to be corrected. Since the impulse responses taken at different positions in a small room exhibit common characteristics attributable to the physical parameters of the enclosure (in particular, some peaks in the low-frequency spectra), an effective correction can be realized using an equalizer that processes the music or speech signal as it travels along the reproduction chain. In this paper, an efficient yet versatile equalizer for small rooms, simple enough to run in real time, is presented. It is based on the common acoustical poles model, focused on the low-frequency range, and cascaded with a conditioning stage.
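One way to realise the common-pole idea is to fit a single all-pole denominator shared by several measured impulse responses and then use that denominator as an FIR equalizer, which cancels the common low-frequency resonances. The sketch below follows that scheme with a plain least-squares fit; the model order, the fitting method, and the absence of the paper's conditioning stage and two-step structure are all my simplifications.

```python
# Hedged sketch of common-pole equalisation: fit one shared AR denominator
# A(z) to several room impulse responses, then equalise with the FIR A(z).
import numpy as np

def fit_common_poles(impulse_responses, order):
    """impulse_responses: iterable of 1-D NumPy arrays; returns [1, a1, ..., a_order]."""
    rows, targets = [], []
    for h in impulse_responses:
        for n in range(order, len(h)):
            rows.append(-h[n - order:n][::-1])   # -[h[n-1], ..., h[n-order]]
            targets.append(h[n])
    a_tail, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return np.concatenate([[1.0], a_tail])

def equalize(signal, a):
    # The equaliser is simply the FIR A(z); it cancels the shared 1/A(z) resonances.
    return np.convolve(signal, a, mode="full")[:len(signal)]
```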
{"title":"Common pole equalization of small rooms using a two-step real-time digital equalizer","authors":"F. Fontana, Luca Gibin, D. Rocchesso, O. Ballan","doi":"10.1109/ASPAA.1999.810883","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810883","url":null,"abstract":"Small enclosures are characterized by peculiar acoustical properties, which sometimes need to be corrected. Since the impulse responses taken in different positions of a small room exhibit common characteristics ascribing to the physical parameters of the enclosure-in particular some peaks in the low frequency spectra-an effective correction can be realized, making use of an equalizer which processes the music or speech signal during its travel along the reproduction chain. In this paper, an efficient yet versatile equalizer for small rooms, simple enough to run in real-time, is presented. It is based on the common acoustical poles model, focused on the low-frequency range and cascaded with a conditioning stage.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132269222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Auditory parallax effects in the HRTF for nearby sources
D. Brungart
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810877
When a sound source is close, the angle of the source relative to the center of the head can differ substantially from the angle of the source relative to the ear. Since the high-frequency features of the HRTF (head-related transfer function) are known to depend on the angle of the source relative to the ear, this "acoustic parallax" should produce a systematic remapping of high-frequency features in the far-field ipsilateral HRTF to more lateral locations in the near-field HRTF. HRTFs measured on an acoustic manikin indicate that this type of remapping does occur in the near field, and that the frequency response of the pinna is roughly independent of distance when the source is more than 5 cm from the ear. The perceptual relevance of the acoustic parallax effect is briefly discussed, along with its potential application to near-field virtual audio displays.
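The parallax itself is plain geometry, so a small worked example may help: for a source at a head-centre azimuth of 30 degrees, the azimuth measured from the ipsilateral ear changes strongly with distance and converges to the head-centre value in the far field. The 9 cm ear offset below is a nominal assumption, not a value taken from the paper.

```python
# Worked geometry example of the acoustic parallax between head-centre and
# ear-relative source angles (2-D, horizontal plane only).
import numpy as np

def azimuth_from_ear(r, azimuth_deg, ear_offset=0.09):
    """Azimuth seen from the ipsilateral ear, for a source at distance r (m)
    and azimuth azimuth_deg (degrees from straight ahead, towards that ear)."""
    az = np.radians(azimuth_deg)
    x, y = r * np.sin(az), r * np.cos(az)        # x: towards the ear, y: straight ahead
    return np.degrees(np.arctan2(x - ear_offset, y))

for r in (0.12, 0.25, 1.0, 10.0):
    print(f"{r:5.2f} m -> {azimuth_from_ear(r, 30.0):5.1f} deg from the ear")
```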
{"title":"Auditory parallax effects in the HRTF for nearby sources","authors":"D. Brungart","doi":"10.1109/ASPAA.1999.810877","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810877","url":null,"abstract":"When a sound source is close, the angle of the source relative to the center of the head can differ substantially from the angle of the source relative to the ear. Since the high-frequency features of the HRTF (head related transfer function) are known to depend on angle of the source relative to the ear, this \"acoustic parallax\" should produce a systematic remapping of high-frequency features in the far-field ipsilateral HRTF to more lateral locations in the near-field HRTF. HRTFs measured on an acoustic manikin indicate that this type of remapping does occur in the near field, and that the frequency response of the pinna is roughly independent of distance when the source is more than 5 cm from the ear. The perceptual relevance of the acoustic parallax effect is briefly discussed, along with its potential application to near-field virtual audio displays.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128864348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uniform spreading of amplitude panned virtual sources
V. Pulkki
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810881
The perceived spatial spread of amplitude-panned virtual sources depends on the number of loudspeakers used to produce them. When pair-wise or triplet-wise panning is applied, the number of active loudspeakers varies as a function of the panning direction. This may cause unwanted changes in the spatial spread and coloration of a virtual source when it is moved in the sound stage. In this paper a method is presented to make the directional spread of amplitude-panned virtual sources independent of their panning direction. This is accomplished by panning the sound signal simultaneously to multiple directions close to each other. This forms a single virtual source with a directional spread that is constant as a function of direction.
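A 2-D sketch of that spreading idea is given below: pair-wise gains are computed for the target direction and for two offset directions around it, summed per loudspeaker, and renormalised to constant power. The pair-selection rule, the 10-degree spread, and the assumption that the loudspeakers surround the listener are my choices for the sketch, not parameters from the paper.

```python
# 2-D pair-wise panning with spreading: pan to several nearby directions and
# renormalise the summed gains. Assumes the loudspeakers surround the listener.
import numpy as np

def pan_pairwise(azimuth_deg, speaker_az_deg):
    """Gain vector for one direction, using the pair with the largest minimum gain."""
    p = np.array([np.cos(np.radians(azimuth_deg)), np.sin(np.radians(azimuth_deg))])
    best, best_min = None, -np.inf
    for i in range(len(speaker_az_deg)):
        for j in range(i + 1, len(speaker_az_deg)):
            L = np.array([[np.cos(np.radians(speaker_az_deg[k])),
                           np.sin(np.radians(speaker_az_deg[k]))] for k in (i, j)])
            if abs(np.linalg.det(L)) < 1e-6:     # skip (near-)collinear pairs
                continue
            g = np.linalg.solve(L.T, p)          # p = g1*l1 + g2*l2
            if g.min() >= -1e-9 and g.min() > best_min:
                best_min = g.min()
                gains = np.zeros(len(speaker_az_deg))
                gains[[i, j]] = np.maximum(g, 0.0)
                best = gains
    return best

def pan_with_spread(azimuth_deg, speaker_az_deg, spread_deg=10.0):
    total = sum(pan_pairwise(azimuth_deg + off, speaker_az_deg)
                for off in (-spread_deg, 0.0, spread_deg))
    return total / np.linalg.norm(total)         # constant overall power
```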
{"title":"Uniform spreading of amplitude panned virtual sources","authors":"V. Pulkki","doi":"10.1109/ASPAA.1999.810881","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810881","url":null,"abstract":"The perceived spatial spread of amplitude panned virtual sources is dependent on the number of loudspeakers that are used to produce them. When pair-wise or triplet-wise panning is applied, the number of active loudspeakers varies as a function of the panning direction. This may cause unwanted changes in spatial spread and coloration of a virtual source if it is moved in the sound stage. In this paper a method is presented to make the directional spread of amplitude panned virtual sources independent of their panning direction. This is accomplished by panning the sound signal to multiple directions near each other simultaneously. This forms a single virtual source with constant directional spread as a function of direction.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123787980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phase relationships and amplitude envelopes in auditory perception
E. Lindemann, J. Kates
Pub Date: 1999-10-17 | DOI: 10.1109/ASPAA.1999.810890
The firing rate of an inner hair cell depends on the amplitude envelope in the associated critical band. Phase relationships between clusters of sinusoids in a critical band affect this envelope. This means that sounds with identical magnitude spectra can result in different firing patterns. This may explain why a pulse train, modeled as a sum of equal-amplitude cosines, sounds different from a sum of equal-amplitude sinusoids with random initial phases. We demonstrate the effect on firing rate using a time-domain digital cochlear model. We speculate about other psychoacoustic consequences of phase relationships and amplitude envelopes and their effect on firing rates.
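The pulse-train example is easy to reproduce numerically: the two signals below have identical magnitude spectra, yet the zero-phase sum is far peakier, so its envelope within an auditory band differs. The crest factor printed here is only a stand-in for the envelope seen by the cochlear model in the paper.

```python
# Same magnitude spectrum, different phases, very different waveform envelope.
import numpy as np

fs, f0, n_harmonics, dur = 16000, 100.0, 40, 1.0
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(0)

zero_phase = sum(np.cos(2 * np.pi * k * f0 * t) for k in range(1, n_harmonics + 1))
random_phase = sum(np.cos(2 * np.pi * k * f0 * t + rng.uniform(0, 2 * np.pi))
                   for k in range(1, n_harmonics + 1))

for name, x in (("equal-phase cosines (pulse-train-like)", zero_phase),
                ("random initial phases", random_phase)):
    crest = np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))
    print(f"{name}: crest factor = {crest:.1f}")
```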
{"title":"Phase relationships and amplitude envelopes in auditory perception","authors":"E. Lindemann, J. Kates","doi":"10.1109/ASPAA.1999.810890","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810890","url":null,"abstract":"The firing rate of an inner hair cell depends on the amplitude envelope in the associated critical band. Phase relationships between clusters of sinusoids in a critical band affect this envelope. This means that sounds with identical magnitude spectra can result in different firing patterns. This may explain why a pulse train, modeled as a sum of equal amplitude cosines, sounds different than a sum of equal amplitude sinusoids with random initial phase. We demonstrate the effect on firing rate by using a time-domain digital cochlear model. We speculate about other psychoacoustic consequences of phase relationships and amplitude envelopes and their effect on firing rates.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133195281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}