Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379981
V.C. Georgopoulos, D. Preis
This paper introduces a layered neural network model for hearing perception. It is based on five important perceptual properties of hearing. The neural network model processes a joint-domain (time-frequency) representation of the input signal to yield the desired perceptual properties. The focus is on the first three layers of the model: the transformation layer and two feature extraction layers.
Title: A time-frequency neural network layered model for hearing perception. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379999
M. Bosi, C. Todd, T. Holman
This paper analyzes directions in current standardization activities for multi-channel audio, briefly reviews the composite coding schemes AC-3 and ISO 11172-3 compatible systems, and discusses requirements, features, and timetables for the audio systems in the ISO/Moving Picture Experts Group (MPEG) phase 2 and United States high-definition television (HDTV) standardization processes.
Title: Aspects of current standardization activities for high-quality, low-rate multi-channel audio coding. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.380002
R. Maher
The need to transmit large amounts of data over limited-bandwidth channels has resulted in many methods for digital data compression. The common approach is to identify and remove redundancy from the input data stream using knowledge of the source characteristics. For signals intended for human observers (speech, music, pictures, etc.) it is also useful to consider the strengths and weaknesses of the human sensory systems in order to achieve a greater degree of data compression. Unfortunately, achieving perceptually transparent compression requires considerable computational resources. For situations requiring extremely low computational complexity without strictly transparent coding, such as multimedia applications on personal computer platforms, a new adaptive differential pulse code modulation (DPCM) data compression scheme is proposed. Although standard DPCM structures are widely used in single-talker speech coding systems, the models and statistical assumptions that are well known for speech signals are not applicable to arbitrary audio signals such as music. The new DPCM formulation includes a recursively indexed quantizer (RIQ) to eliminate the problem of overload distortion, a simple predictor structure to take advantage of the short-term correlation present in wideband audio signals, and an adaptation strategy to optimize the system to the local statistics of the input signal. The new RIQ-DPCM formulation is thus presented as a computationally efficient means of wideband audio compression.
Title: Computationally efficient compression of audio signals by means of RIQ-DPCM. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
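The idea of combining DPCM with a recursively indexed quantizer can be illustrated with a minimal Python sketch. It is not the scheme from the paper: the step size, the escape rule (indices +k and -k mean "add k steps and keep reading"), the fixed first-order predictor coefficient `a`, and all function names are assumptions made for illustration, and the adaptation strategy is omitted entirely.

```python
import math

def riq_encode(value, step, k):
    """Quantize value with a uniform step using only indices in [-k, k].

    Indices +k and -k act as escape symbols: each one contributes k steps
    and tells the decoder to keep reading, so large inputs never overload.
    """
    q = math.floor(value / step + 0.5)  # nearest quantizer index
    symbols = []
    while q >= k:
        symbols.append(k)
        q -= k
    while q <= -k:
        symbols.append(-k)
        q += k
    symbols.append(q)  # terminating symbol, always strictly inside (-k, k)
    return symbols

def riq_decode(symbols, step):
    """Sum the indices of one symbol group and scale by the step size."""
    return step * sum(symbols)

def dpcm_encode(samples, step=0.05, k=4, a=0.9):
    """First-order DPCM: quantize the error against a * previous reconstruction."""
    prediction = 0.0
    stream = []
    for x in samples:
        symbols = riq_encode(x - prediction, step, k)
        stream.append(symbols)  # grouped per sample for readability
        reconstructed = prediction + riq_decode(symbols, step)
        prediction = a * reconstructed  # track the decoder's state exactly
    return stream

def dpcm_decode(stream, step=0.05, a=0.9):
    """Mirror of dpcm_encode: rebuild samples from the symbol groups."""
    prediction = 0.0
    out = []
    for symbols in stream:
        reconstructed = prediction + riq_decode(symbols, step)
        out.append(reconstructed)
        prediction = a * reconstructed
    return out
```

Because the encoder predicts from its own reconstruction, the per-sample error in this sketch stays within half a step even across large jumps. In a flat symbol stream the decoder would find group boundaries by stopping at the first non-escape symbol; the nested lists here only make that grouping visible.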
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379976
M.K. Islam, G. Saplakoglu
The restoration of flute notes embedded in noise is formulated as a state-estimation problem of a dynamic system. A single Kalman filter, with a given state-transition matrix, is implemented in real time to recover the corresponding note as well as some of the neighbouring notes. To restore a continuous piece of music played on a flute, a bank of Kalman filters (with different state-transition matrices) can be used. Event detectors are employed to detect changes in the underlying system model. The overall restored sound is the output of the Kalman filter whose model is valid at a given time instant.
Title: Detection and restoration of sound of flute embedded in noise using real-time Kalman filter. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
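The state-estimation formulation can be sketched for a single note under strong simplifying assumptions: the note is modelled as one sinusoid of known frequency, the state-transition matrix is a plane rotation, and the noise variances `q` and `r` are hand-picked. None of these choices come from the paper, and the function name is hypothetical; a bank of such filters with different rotation angles would correspond loosely to the multi-note case.

```python
import numpy as np

def kalman_restore_tone(noisy, freq_hz, fs, q=1e-5, r=0.25):
    """Estimate a sinusoid of known frequency from noisy samples.

    State: [in-phase, quadrature] components of the tone.
    q and r are assumed process and measurement noise variances.
    """
    w = 2.0 * np.pi * freq_hz / fs
    A = np.array([[np.cos(w), -np.sin(w)],
                  [np.sin(w),  np.cos(w)]])  # rotate the state by one sample
    H = np.array([1.0, 0.0])                 # only the first component is observed
    Q = q * np.eye(2)
    x = np.zeros(2)
    P = np.eye(2)
    out = np.empty(len(noisy))
    for n, z in enumerate(noisy):
        # Predict step
        x = A @ x
        P = A @ P @ A.T + Q
        # Update step
        S = H @ P @ H + r          # innovation variance (scalar)
        K = P @ H / S              # Kalman gain (length-2 vector)
        x = x + K * (z - H @ x)
        P = P - np.outer(K, H @ P)
        out[n] = x[0]
    return out
```

A smaller `q` trusts the sinusoidal model more and averages over a longer window, at the cost of slower tracking when the note changes.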
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.380009
P. Chu
A Weaver SSB subband structure is used to implement an acoustic echo canceller. The structure has 29 bands of 250 Hz width, covering the audio range from 0 to 7 kHz. The Weaver structure lowers each bandpass region to baseband, allows for oversampling to eliminate aliasing components, and is computationally efficient. The subsampled components are purely real, in contrast to the complex components found in some other subband schemes. The adaptive filter update algorithm is a variant of block NLMS. The double-talk, divergence, echo suppression, and noise fill-in algorithms all fully exploit the bandpass structure to achieve performance that is difficult to attain in full-band or two-band acoustic echo cancellers. The acoustic echo canceller has been extensively field tested and has been shown to be robust.
Title: Weaver SSB subband acoustic echo canceller [videoconferencing application]. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
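How a Weaver structure lowers one bandpass region to a purely real baseband signal can be sketched as follows. This is a generic textbook-style Weaver demodulator, not the implementation from the paper: the windowed-sinc lowpass, its length, the sample rate used in any call, and the function names are assumptions, and the decimation stage is left out.

```python
import numpy as np

def lowpass_fir(cutoff_hz, fs, num_taps=801):
    """Hamming-windowed sinc lowpass with unity gain at DC."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff_hz / fs * np.sinc(2 * cutoff_hz / fs * n)
    h *= np.hamming(num_taps)
    return h / h.sum()

def weaver_band_to_baseband(x, f_lo, bandwidth, fs):
    """Map the band [f_lo, f_lo + bandwidth] of x onto [0, bandwidth] as a real signal."""
    n = np.arange(len(x))
    f_center = f_lo + bandwidth / 2
    # First mixer: move the band centre to 0 Hz (the signal becomes complex).
    shifted = x * np.exp(-2j * np.pi * f_center * n / fs)
    # Lowpass keeps only the selected band, now spanning [-bandwidth/2, bandwidth/2].
    filtered = np.convolve(shifted, lowpass_fir(bandwidth / 2, fs), mode="same")
    # Second mixer: shift up by bandwidth/2 and keep the real part.
    # The factor 2 restores the amplitude of a real input tone.
    return 2 * np.real(filtered * np.exp(2j * np.pi * (bandwidth / 2) * n / fs))
```

With `f_lo = 1000` and `bandwidth = 250`, an input tone at 1175 Hz appears at 175 Hz in the output. Since the result is real and band-limited to the band width, it could then be subsampled, with some oversampling margin to keep aliasing out of the band.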
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379994
M. Schonle, N. Fliege, U. Zolzer
A new approach to the approximation and real-time simulation of room impulse responses is presented. Based on a wavelet decomposition of measured impulse response data, an energy-time-frequency representation of the room as a system is obtained. The wavelet coefficients in the frequency subbands are calculated by a multirate analysis filter bank providing aliasing-free subband processing and linear-phase filters. In a second step, a modification of the Prony method is used to obtain the parameters of cascaded moving-average comb filter structures. Combining the approximated subband signals by a synthesis filter bank with perfect reconstruction properties gives an approximation of the broadband impulse response.
Title: Parametric approximation of room impulse responses based on wavelet decomposition. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379975
S. Godsill, P. Rayner
A new algorithm is presented for the identification and restoration of time-varying pitch defects in audio signals. The problem is commonly encountered as 'wow' in gramophone disc and magnetic tape recordings where motor speed variations or eccentricity in the recording process are significant. The algorithm operates in two stages: the first tracks tonal components in musical signals to generate a single pitch variation curve, and the second performs restoration as a time-varying resampling operation. Results are presented from both artificially degraded sources and real sources.
Title: The restoration of pitch variation defects in gramophone recordings. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
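The second stage, restoration as time-varying resampling, can be sketched once a pitch variation curve is available. The sketch below assumes the curve is already known and expressed as a per-sample speed ratio (1.0 means correct speed), and it uses plain linear interpolation; the curve estimation stage, the interpolation method, and the function names are not taken from the paper.

```python
import numpy as np

def warp_positions(pitch_curve):
    """Cumulative read position, in samples, implied by a speed-ratio curve."""
    return np.concatenate(([0.0], np.cumsum(pitch_curve[:-1])))

def apply_wow(x, pitch_curve):
    """Simulate the defect: read the clean signal at a time-varying speed."""
    phi = warp_positions(pitch_curve)
    return np.interp(phi, np.arange(len(x)), x)

def remove_wow(y, pitch_curve):
    """Undo the defect by resampling y on the inverse of the warp.

    Assumes pitch_curve > 0 everywhere so the warp is strictly increasing.
    """
    phi = warp_positions(pitch_curve)
    idx = np.arange(len(y), dtype=float)
    # For each clean sample index m, find the degraded-time position where phi equals m.
    inverse_positions = np.interp(idx, phi, idx)
    return np.interp(inverse_positions, idx, y)
```

Samples near the ends of the recording are clamped by `np.interp`, so only the interior of the restored signal is meaningful in this sketch.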
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.380006
S. Kuo, J. Kunduru
Broadband adaptive noise cancellation applications often involve adaptive filter lengths of hundreds of taps. This gives rise to high computational complexity, large misadjustment errors, and slow convergence. In this paper, subband processing techniques are proposed to achieve better noise cancellation in hands-free cellular phones. Simulation results are shown for the traditional adaptive noise canceler (ANC) and the subband adaptive noise canceler (SANC). An average noise reduction of 18 dB is achieved with SANC, compared with a reduction of 6 dB with traditional ANC.
Title: Subband adaptive noise canceler for hands-free cellular phone applications. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
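For orientation, the traditional full-band canceler that serves as the baseline can be sketched with a normalized LMS (NLMS) update. This is a generic NLMS loop, not the subband scheme proposed in the paper; the tap count, step size `mu`, and function name are assumptions. A subband variant would run one much shorter filter of this kind per band at a reduced sample rate.

```python
import numpy as np

def nlms_cancel(primary, reference, num_taps=16, mu=0.1, eps=1e-8):
    """Full-band adaptive noise canceler using NLMS.

    primary   : desired signal plus noise (e.g. the speech microphone)
    reference : signal correlated with the noise only
    Returns the error signal (the cleaned output) and the final weights.
    """
    w = np.zeros(num_taps)
    buf = np.zeros(num_taps)  # most recent reference samples, newest first
    out = np.zeros(len(primary))
    for n in range(len(primary)):
        buf = np.roll(buf, 1)
        buf[0] = reference[n]
        noise_estimate = w @ buf
        e = primary[n] - noise_estimate
        # Normalizing by the buffer energy makes the step size scale-invariant.
        w += mu * e * buf / (buf @ buf + eps)
        out[n] = e
    return out, w
```

The cost per sample grows with `num_taps`, which is why filters of hundreds of taps become expensive and slow to converge in the full-band setting.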
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379988
A. Tungthangthum, J. C. Rutledge
An objective measurement system has been developed to predict the results of subject-based tests for sensorineural hearing loss compensation techniques. Parameters related to the loudness level of the compensated speech signal are extracted from its frequency spectrum. These parameters are then used to train a neural-network-based phoneme classifier. Good prediction results have been achieved for two hearing-impaired subjects.
Title: Objective measures based on neural networks for hearing loss compensation techniques. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
Pub Date: 1993-10-17; DOI: 10.1109/ASPAA.1993.379990
J. Svean, A. Krokstad, S. Sorsdal
The paper describes an all-digital concha hearing aid. The main features of this hearing aid concept are a large vent, acoustic feedback cancellation, great flexibility through programming, a versatile equalizer, and an advanced compressor. The A/D and D/A converters have log/in characteristics and the signal processing is performed in floating-point arithmetic, ensuring a large dynamic range and a signal-to-quantization-noise ratio that is almost independent of signal level. The hearing aid contains a high-speed two-way digital interface that is able to transmit measurement signals in real time. This feature provides great advantages for the fitting procedure. The main part of the hearing aid is a VLSI chip in 0.8 μm CMOS technology, measuring 44 mm².
Title: An all digital concha hearing aid. Venue: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.