Reassessing the Benefits of Audiovisual Integration to Speech Perception and Intelligibility
Brandon O'Hanlon, Christopher J. Plack, Helen E. Nuttall
Journal of Speech Language and Hearing Research, 26-39. Epub December 2, 2024; published January 2, 2025. DOI: 10.1044/2024_JSLHR-24-00162
Supplemental material: https://doi.org/10.23641/asha.27641064
Citations: 0
Abstract
Purpose: In difficult listening conditions, the visual system assists with speech perception through lipreading. Stimulus onset asynchrony (SOA) is used to investigate the interaction between the two modalities in speech perception. Previous estimates of the audiovisual benefit and of the SOA integration period differ widely. A limitation of previous research is a lack of consideration of visemes (categories of phonemes defined by similar lip movements when produced by a speaker) to ensure that the selected phonemes are visually distinct. This study aimed to reassess the benefits of audiovisual lipreading to speech perception when stimuli are drawn from different viseme categories and presented in noise. The study also aimed to investigate the effects of SOA on these stimuli.
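To make the viseme concept concrete, here is a minimal Python sketch of a phoneme-to-viseme grouping. The category names and memberships are a common textbook simplification assumed for illustration, not the mapping used in this study, and the helper visually_distinct is hypothetical.

```python
# Illustrative phoneme-to-viseme grouping. Phonemes within the same
# viseme look alike on the lips, so they make poor candidates for a
# visually distinct stimulus set. These groupings are an assumed
# simplification, not the study's own mapping.
VISEMES = {
    "bilabial": ["p", "b", "m"],   # lips fully close (as in "ba")
    "labiodental": ["f", "v"],     # lower lip touches upper teeth
    "velar": ["k", "g"],           # closure hidden at the back (as in "ka")
}

def visually_distinct(phoneme_a: str, phoneme_b: str) -> bool:
    """True if the two phonemes fall in different viseme categories."""
    def category(p: str) -> str:
        return next(c for c, members in VISEMES.items() if p in members)
    return category(phoneme_a) != category(phoneme_b)

assert visually_distinct("b", "k")      # "ba" vs. "ka": different visemes
assert not visually_distinct("b", "p")  # "ba" vs. "pa": same viseme
```

Selecting syllables whose consonants fall in different viseme categories, as with "ba" and "ka" here, ensures that the visual signal actually carries discriminative information.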
Method: Sixty participants were tested online and presented with audio-only and audiovisual stimuli, the latter containing the speaker's lip movements. The speech was presented either with or without noise, at six different SOAs (0, 200, 216.6, 233.3, 250, and 266.6 ms). Participants discriminated between speech syllables via button presses.
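As an illustration of the design rather than the authors' actual implementation, the sketch below enumerates the modality x noise x SOA condition grid and shows how a positive SOA would delay the audio onset relative to the video. All names are hypothetical, and the observation that the ~16.7 ms spacing of the upper SOAs matches single video frames at 60 fps is an inference, not a stated design choice.

```python
# A minimal sketch (not the authors' code) of the condition grid
# described above: modality x noise x stimulus onset asynchrony (SOA).
# How SOA applies to audio-only trials is not specified in the abstract.
from itertools import product

MODALITIES = ["audio_only", "audiovisual"]
NOISE = [False, True]
# The ~16.7 ms steps between the upper SOAs are consistent with
# single video frames at 60 fps (an inference, not stated in the text).
SOAS_MS = [0.0, 200.0, 216.6, 233.3, 250.0, 266.6]

def audio_onset_ms(video_onset_ms: float, soa_ms: float) -> float:
    """Schedule the audio track soa_ms milliseconds after the video."""
    return video_onset_ms + soa_ms

trials = [
    {"modality": m, "noise": n, "soa_ms": s,
     "audio_onset_ms": audio_onset_ms(0.0, s)}
    for m, n, s in product(MODALITIES, NOISE, SOAS_MS)
]
print(len(trials))  # 24 unique conditions before trial repetition
```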
Results: The benefit of visual information was weaker than that reported in previous studies. Reaction times increased significantly as SOA was introduced, but there were no significant effects of SOA on accuracy. Furthermore, exploratory analyses suggest that the effect was not equal across viseme categories: "ba" was more difficult to recognize than "ka" in noise.
Conclusion: In summary, the findings suggest that the contributions of audiovisual integration to speech processing are weaker than previously reported when visemes are taken into account, and that the observed effects are not sufficient to identify a full integration period.
About the Journal:
Mission: JSLHR publishes peer-reviewed research and other scholarly articles on the normal and disordered processes in speech, language, hearing, and related areas such as cognition, oral-motor function, and swallowing. The journal is an international outlet for both basic research on communication processes and clinical research pertaining to screening, diagnosis, and management of communication disorders as well as the etiologies and characteristics of these disorders. JSLHR seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work.
Scope: The broad field of communication sciences and disorders, including speech production and perception; anatomy and physiology of speech and voice; genetics, biomechanics, and other basic sciences pertaining to human communication; mastication and swallowing; speech disorders; voice disorders; development of speech, language, or hearing in children; normal language processes; language disorders; disorders of hearing and balance; psychoacoustics; and anatomy and physiology of hearing.