A Quantitative Protocol for Calibrating Short Speech Signals (Monosyllabic Words) Based on the 50-ms Segment of the Voiced Phoneme(s) with the Maximum Root-Mean-Square Amplitude.
Authors: Richard H Wilson, Nancy J Scherer
Journal: Journal of the American Academy of Audiology
DOI: https://doi.org/10.3766/jaaa.21126
Publication date: 2025-03-14
Abstract
Background: Since the development of word-recognition materials to test the transmission properties of auditory devices and human auditory systems, a carrier sentence or phrase (e.g., Say the word) has been used to preface the test word. For practical reasons, only the amplitude of the carrier phrase was somewhat controlled. The current American National Standards Institute standard for audiometers continues to specify the level of the test word should be the same communication level as the carrier phrase.
Purpose: The development of an amplitude calibration protocol for use with short-duration speech signals that are characterized by substantial amplitude modulations is described.
Research Design: Protocol 1 evaluated the average maximum root-mean-square (rms) amplitudes of 12.5-, 25-, 50-, and 100-ms voiced phoneme segments of each test word in 0.0227-ms increments to determine the segment duration to use. Protocol 2 used the 50-ms segment with the maximum rms amplitude among the 200 words in each list to normalize independently the amplitudes of the carrier phrases and test words to a target rms amplitude for each speaker.
Study Sample: Digital copies of the 200 monosyllabic words in three versions of Northwestern University Auditory Test No. 6 (NU-6) and one version of the W-22, each spoken by a different speaker, were evaluated using the numeric digital values transcribed from the audio files. Two iterations of the protocol were compiled.
Data Collection and Analysis: In-house routines were used to analyze the waveform data, the results of which were evaluated with central tendency statistical analyses.
Results: The finalized protocol is based on the rms amplitude of a 50-ms segment of the sustained, voiced phoneme of each test word. The protocol directly links the rms amplitudes of the calibration tone and of the 50-ms word segments as opposed to the currently used linking of the calibration tone rms amplitude to a peak meter deflection of the carrier phrase from which the amplitude of the test word is inferred.
Conclusions: The effectiveness of the calibration protocol was demonstrated successfully on the four sets of word-recognition materials. The rms amplitude adjustments made independently to the individual carrier phrase and test-word utterances produced overall rms amplitudes for each of the four speakers that were homogenized slightly for the carrier phrases but substantially for many of the test words.
Clinical Relevance Statement: The calibration protocol described provides an objective procedure that can be implemented and, most importantly, replicated with numeric accuracy to equate test-word (and carrier phrase) amplitudes among short speech signals like monosyllabic words and among speaker versions of those materials.
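The segment search and level adjustment described under Research Design can be illustrated with a minimal Python sketch. It is not the authors' in-house routine. The function names `max_rms_segment` and `scale_to_target`, the 44,100-Hz sampling rate (at which one sample lasts about 0.0227 ms), and the choice to search every sample passed in rather than only a hand-marked voiced phoneme are all assumptions made for illustration.

```python
import math


def max_rms_segment(samples, sample_rate=44100, window_ms=50.0):
    """Return (start_index, rms) of the window with the largest rms amplitude.

    The window advances one sample at a time. Assumption: `samples` already
    holds only the portion of the word that should be searched (for example
    the voiced phoneme); this sketch does no phoneme detection.
    """
    n = round(sample_rate * window_ms / 1000.0)  # window length in samples
    if n <= 0 or len(samples) < n:
        raise ValueError("signal is shorter than the analysis window")

    # Keep a running sum of squares so each one-sample step is O(1).
    energy = sum(x * x for x in samples[:n])
    best_start, best_energy = 0, energy
    for start in range(1, len(samples) - n + 1):
        energy += samples[start + n - 1] ** 2 - samples[start - 1] ** 2
        if energy > best_energy:
            best_start, best_energy = start, energy

    return best_start, math.sqrt(best_energy / n)


def scale_to_target(samples, target_rms, sample_rate=44100, window_ms=50.0):
    """Scale a signal so its maximum-rms window equals `target_rms`.

    Returns (scaled_samples, gain_db). `target_rms` is a hypothetical
    parameter standing in for the target amplitude chosen per speaker.
    """
    _, rms = max_rms_segment(samples, sample_rate, window_ms)
    if rms == 0:
        raise ValueError("cannot scale a silent signal")
    gain = target_rms / rms
    return [x * gain for x in samples], 20.0 * math.log10(gain)


if __name__ == "__main__":
    # Synthetic stand-in for a word: quiet onset, louder middle, quiet offset.
    word = [0.1] * 1000 + [0.5] * 3000 + [0.1] * 1000
    start, rms = max_rms_segment(word)
    scaled, gain_db = scale_to_target(word, target_rms=0.25)
    print(start, rms, gain_db)
```

Applying the same gain to every sample of an utterance preserves its internal amplitude modulations while anchoring its level to the 50-ms segment, which is the sense in which carrier phrases and test words could each be adjusted independently to a common target.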
About the journal:
The Journal of the American Academy of Audiology (JAAA) is the Academy's scholarly peer-reviewed publication, issued 10 times per year and available to Academy members as a benefit of membership. The JAAA publishes articles and clinical reports in all areas of audiology, including audiological assessment, amplification, aural habilitation and rehabilitation, auditory electrophysiology, vestibular assessment, and hearing science.