Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.380003
Adaptive predictive coding with transform domain quantization using block size adaptation and high-resolution spectral modeling
B. Bhaskar
The adaptive predictive coding with transform domain quantization (APC-TQ) technique was proposed by Bhaskar (1991) for the compression of audio signals. Since then, significant developments have taken place that reduce the coding rate while enhancing the audio quality. These developments include (i) the use of block size adaptation to exploit variations in the stationarity of the signal, (ii) high-resolution spectral modeling using LPC analysis orders up to 64, and (iii) an adaptive bit-allocation procedure that minimizes both the coding noise power and the perception of coding noise. The result is near-transparent-quality compression of 5 kHz bandwidth audio at a rate of 17 kbit/s. This technology will find applications in the distribution and transmission of AM-quality audio programming over low-rate channels such as the INMARSAT Standard A, B and aeronautical systems.
{"title":"Adaptive predictive coding with transform domain quantization using block size adaptation and high-resolution spectral modeling","authors":"B. Bhaskar","doi":"10.1109/ASPAA.1993.380003","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.380003","url":null,"abstract":"The adaptive predictive coding with transform domain quantization (APC-TQ) technique was proposed by Bhaskar (1991) for the compression of audio signals. Since then, significant developments have taken place leading to a reduction in the coding rate. While enhancing the audio quality. These developments include (i) the use of block size adaptation to exploit the variations in the stationarity of the signal, (ii) high resolution spectral modeling using LPC analysis orders up to 64, and (iii) an adaptive bit-allocation procedure to minimize coding noise power as well as minimize the perception of coding noise. The result is a near transparent quality compression of 5 kHz bandwidth audio at a rate of 17 kbit/s. This technology will find applications in the distribution and transmission of AM quality audio programming over low rate channels such as the INMARSAT Standard A, B and aeronautical systems.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116240127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.379989
Hearing aids for profoundly deaf people based on a new parametric concept
K. Hermansen, F. K. Fink, U. Hartmann, V.M. Hansen
People with severe hearing loss have only a minor part of the frequency range available for the reception of information in speech signals. These people do not benefit from conventional hearing aids, as the information in the high-frequency parts of speech remains unavailable to them. To overcome this problem, the authors have developed a new method for presenting information from the frequency range of interest within the frequency range available to the hearing impaired. By parametrically modeling the speech production system, transforming the speech production model to match the available frequency range, and finally resynthesizing the speech from the transformed model, one can present the speech information of interest in a frequency range of choice. This concept is also believed to reduce wideband background noise, which is a problem for the hearing impaired as well as for people with normal hearing.
{"title":"Hearing aids for profoundly deaf people based on a new parametric concept","authors":"K. Hermansen, F. K. Fink, U. Hartmann, V.M. Hansen","doi":"10.1109/ASPAA.1993.379989","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.379989","url":null,"abstract":"People with severe hearing loss only have a minor part of the frequency range available for reception of information in speech signals. These people do not benefit from normal hearing aids as the information in high frequency parts of the speech is not available. To overcome this problem the authors have developed a new method enabling to present information from the frequency range of interest in the frequency range available for the hearing disabled. By means of parametric modeling of the speech production system, transforming the speech production model to match the available frequency range, and then finally resynthesize the speech using this transformed model, one can present the speech information of interest in a frequency range at choice. This concept is believed to reduce wideband background noise which is a problem for hearing disabled as well as for people with normal hearing ability.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129884075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.379969
A simplified source/filter model for percussive sounds
J. Laroche, J. Meillier
This paper deals with source-filter models of percussive instruments. A 'multi-channel excitation/filter model' is presented in which a single excitation is used to generate several sounds, for example six piano tones belonging to the same octave. Techniques for estimating the model parameters are presented and applied to the sound of a real piano. Our experiments demonstrate that it is possible to calculate a single excitation signal which, when fed into different filters, generates very accurate synthetic tones. Finally, a low-cost synthesis method is proposed that can be used to generate natural-sounding percussive tones.
{"title":"A simplified source/filter model for percussive sounds","authors":"J. Laroche, J. Meillier","doi":"10.1109/ASPAA.1993.379969","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.379969","url":null,"abstract":"This paper deals with source-filter models of percussive instruments. A 'multi-channel excitation/filter model' is presented in which a single excitation is used to generate several sounds, for example six piano tones belonging to the same octave. Techniques for estimating the model parameters are presented and applied to the sound of a real piano. Our experiments demonstrate that it is possible to calculate a single excitation signal which when fed into different filters, generates very accurate synthetic tones. Finally, a low-cost synthesis method is proposed that can be used to generate natural sounding percussive tones.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116049093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.379972
Constraint based audio interpolators
D. Rossum
In audio digital signal processing, interpolators are used for a variety of functions, including sample rate conversion. Linear interpolation is commonly used, but it has serious signal-quality problems for signals with significant high-frequency content. Higher-order interpolators based on sinc functions or other conventional lowpass filter design techniques offer somewhat better performance, but they are not optimal in terms of perceived audio performance for a given degree of computational complexity. We present a new technique for the design of higher-order interpolators for digital audio which provides improved performance at a given degree of complexity. Also presented are new methods for evaluating audio interpolators.
{"title":"Constraint based audio interpolators","authors":"D. Rossum","doi":"10.1109/ASPAA.1993.379972","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.379972","url":null,"abstract":"In audio digital signal processing, interpolators are used for a variety of functions, including sample rate conversion. Linear interpolation is commonly used, but has serious signal quality problems for signals with significant high frequency content. Higher order interpolators based on sine functions or other conventional lowpass filter design techniques offer somewhat better performance, but are not optimal in terms of perceived audio performance for a given degree of computational complexity. We present a new technique for the design of higher order interpolators for digital audio which provides improved performance at a given degree of complexity. Also presented are new methods for evaluating audio interpolators.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134395075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.380004
Interpolation of forced structural responses from non-uniform sparse measurements
Larry Heck, K. Naghshineh, J. Stach
This paper presents a method for interpolating a sparse set of nonuniformly spaced velocity measurements on the surface of a vibrating structure. The method utilizes knowledge of the physical nature of the vibrating structure, specified in terms of a given bound on the energy of the excitation forces, estimated mobilities of the structure, and a known set of sparse velocity measurements. To minimize the maximum possible error of the estimated surface velocities, the method employs an estimation approach derived from the theory of optimal signal recovery. Results are presented which demonstrate the performance of the method in interpolating the surface velocities of a rectangular plate: with only four randomly selected point velocity measurements out of 209 possible locations, the method estimates the structural surface velocity with a normalized error of only -45 dB. The ability to achieve this performance with a small number of sensors makes the method important for many active noise control applications in which an accurate measure of structural surface velocity is required to predict the radiated acoustic field.
{"title":"Interpolation of forced structural responses from non-uniform sparse measurements","authors":"Larry Heck, K. Naghshineh, J. Stach","doi":"10.1109/ASPAA.1993.380004","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.380004","url":null,"abstract":"This paper presents a method for interpolating a sparse set of nonuniformly spaced velocity measurements on the surface of a vibrating structure. The method utilizes knowledge of the physical nature of the vibrating structure specified in terms of a given bound on the energy of the excitation forces, estimated mobilities of the structure and a known set of sparse velocity measurements. To minimize the maximum possible error of the estimated surface velocities. The method employs an estimation approach derived from the theory of optimal signal recovery. Results are presented which demonstrate the performance of the method on interpolating surface velocities of a rectangular plate. With only four randomly selected point velocity measurements out of 209 possible locations. The method estimates the structural surface velocity with a normalized error of only -45 dB. The ability to achieve this performance with a small number of sensors makes this method important for many active noise control applications where an accurate measure of structural surface velocity is required to predict the radiated acoustic field.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134645640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.380008
A comparison of gradient-based algorithms for echo compensation with decorrelating properties
M. Rupp
Cancelling echoes with the normalized least mean square (NLMS) algorithm has been state of the art for many years. In acoustical echo compensation, however, it is common to estimate more than 1000 parameters, which results in convergence that is too slow when the filter is driven by speech signals. To overcome this drawback, many modifications have been published in recent years, all with one goal: to decorrelate the driving process. Starting from a deterministic approach, we show that all these different ideas can be arranged in one scheme, allowing a uniform normalization. The different properties of the various algorithms then become apparent. A comparison of some algorithms with 2N-4N complexity is presented. Surprisingly, none of the algorithms works perfectly for a large compensator filter length with speech as the input process.
{"title":"A comparison of gradient-based algorithms for echo compensation with decorrelating properties","authors":"M. Rupp","doi":"10.1109/ASPAA.1993.380008","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.380008","url":null,"abstract":"Cancelling echoes by using the normalized least mean square (NLMS) algorithm has been state of the art for many years. In acoustical echo compensation, however, it is common to estimate more than 1000 parameters resulting in a too slow convergence when driven by speech signals. In order to overcome this drawback, a lot of modifications have been published in the last years, all having one goal: to decorrelate the driving process. Beginning with a deterministic approach we show that all these different ideas can be arranged in one scheme, allowing a uniform normalization. The different properties of the several algorithms are then obvious. A comparison of some algorithms with 2N-4N complexity is presented. Surprisingly, all algorithms do not work perfectly for a large compensator filter length and speech as input process.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124730177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.380001
Improving joint stereo audio coding by adaptive inter-channel prediction
H. Fuchs
A method for exploiting inter-channel redundancies of stereophonic or multichannel audio signals is presented. In contrast to known stereo redundancy reduction techniques used in joint stereo audio coding, where only the statistical dependencies between two concurrent samples of the left and right channel signals are considered, adaptive inter-channel prediction also takes into account a possible phase or time delay between the channels and exploits more than one value of the cross-correlation function. The analysis of subjective listening test results has shown that this technique is especially effective for a class of test sequences which has proven to be most critical for the ISO MPEG Layer II and Layer III codecs at bit rates of 2×64 kbit/s. For these signals the gain due to the stereo redundancy reduction technique used in Layer III joint stereo coding is less than 5-10 dB, while Layer II joint stereo coding uses no specific stereo redundancy reduction technique at all. In a first step, the adaptive inter-channel prediction has been applied to an ISO MPEG Layer II codec. The simulation results show that a prediction gain of up to 30-40 dB can be achieved for large parts of the above-mentioned signals.
{"title":"Improving joint stereo audio coding by adaptive inter-channel prediction","authors":"H. Fuchs","doi":"10.1109/ASPAA.1993.380001","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.380001","url":null,"abstract":"A method for exploiting inter-channel redundancies of stereophonic or multichannel audio signals is presented. In contrast to known stereo redundancy reduction techniques used in joint stereo audio coding. Where only the statistical dependencies between two concurrent samples of the left and right channel signals are considered, the adaptive inter-channel prediction also takes into account possible phase or time delay between the channels and exploits more than only one value of the cross-correlation function. The analysis of subjective listening test results has shown that this technique is especially effective for a class of test sequences which has proven to be most critical for the ISO MPEG Layer II and Layer III codecs at bit rates of 2/spl times/64 kbit/s. For these signals the gain due to the stereo redundancy reduction technique used in Layer III joint stereo coding is less than 5-10 dB, while in Layer II joint stereo coding no specific stereo redundancy reduction technique is used. In a first step, the adaptive inter-channel prediction has been applied to an ISO MPEG Layer II codec. The simulation results show that a prediction gain up to 30-40 dB can be achieved for large parts of the above mentioned signals.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133000734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.380000
Current and future standardization of high-quality digital audio coding in MPEG
R. van der Waal, K. Brandenburg, G. Stoll
Since 1988, ISO/IEC JTC1/SC29 WG11 (MPEG) has been working on the standardization of video and audio signals. The Audio subgroup of MPEG works on bit rate reduction systems for high-quality digital audio. Now that the first phase of this standardization effort has been finished, MPEG/Audio is extending its work to multichannel audio coding systems as well as to medium-quality coding at lower sampling frequencies and lower bit rates. Future standardization work aims at a next-generation coder suitable for high-quality audio transmission and storage at bit rates of 64 kbit/s per channel and well below.
{"title":"Current and future standardization of high-quality digital audio coding in MPEG","authors":"R. van der Waal, K. Brandenburg, G. Stoll","doi":"10.1109/ASPAA.1993.380000","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.380000","url":null,"abstract":"Since 1988 ISO/IEC JTCI/SC29 WG11 (MPEG) is working on the standardization of video and audio signals. The Audio subgroup of MPEG is working on bit rate reduction systems for high quality digital audio. Since the first phase of this standardization effort has been finished, MPEG/Audio is extending its work to multichannel audio coding systems as well as to medium quality coding at lower sampling frequencies and lower bit rates. Future standardization work aims at next-generation coder suitable for high quality audio transmission and storage at bit rates of 64 kb/s per channel and well below.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131420199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.379983
Developments in transaural stereo
Jerry Bauck, Duane H. Cooper
Transaural stereo achieves precision 3-D imaging by compensating for spectral distortions in the loudspeaker-to-ear signal paths. The heart of transaural stereo, signal processing for crosstalk cancellation, is herein generalized to accommodate any number of loudspeakers and listeners in any layout. Transaural equations are written and then solved using standard algebraic methods. Worked-out examples are shown and several applications are proposed.
{"title":"Developments in transaural stereo","authors":"Jerry Bauck, Duane H. Cooper","doi":"10.1109/ASPAA.1993.379983","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.379983","url":null,"abstract":"Transaural stereo achieves precision 3-D imaging by compensating for spectral distortions in the loudspeaker-to-car signal paths. The heart of transaural stereo, signal processing for crosstalk cancellation, is herein generalized to accommodate any number of loudspeakers and listeners in any layout. Transaural equations are written and then solved using standard algebraic methods. Worked-out examples are shown and several applications are proposed.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132388121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1993-10-17 | DOI: 10.1109/ASPAA.1993.379971
Generalized overlap-add sinusoidal modeling applied to quasi-harmonic tone synthesis
E. George, M.J.T. Smith
Analysis-by-synthesis/overlap-add (ABS/OLA) sinusoidal modeling has been successfully demonstrated as an accurate, flexible, and computationally tractable representation for the purposes of speech modification and harmonic tone synthesis; however, the model formulation used to synthesize these signals does not take full advantage of the structure of quasi-harmonic music signals. This paper describes a generalized overlap-add sinusoidal model formulation that accounts for the time-frequency behavior of quasi-harmonic tones and reduces to the previous formulation as a special case.
{"title":"Generalized overlap-add sinusoidal modeling applied to quasi-harmonic tone synthesis","authors":"E. George, M.J.T. Smith","doi":"10.1109/ASPAA.1993.379971","DOIUrl":"https://doi.org/10.1109/ASPAA.1993.379971","url":null,"abstract":"Analysis-by-synthesis/overlap-add (ABS/OLA) sinusoidal modeling has been successfully demonstrated as an accurate, flexible, and computationally tractable representation for the purposes of speech modification and harmonic tone synthesis; however, the model formulation used to synthesize these signals does not take full advantage of the structure of quasi-harmonic music signals. This paper describes a generalized overlap-add sinusoidal model formulation that accounts for the time-frequency behavior of quasi-harmonic tones and which reduces to the previous formulation as a special case.<<ETX>>","PeriodicalId":270576,"journal":{"name":"Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116715341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}