Evaluating OpenAI's Whisper ASR: Performance analysis across diverse accents and speaker traits
Authors: Calbert Graham, Nathan Roll
Journal: JASA Express Letters
Published: 2024-02-01
DOI: https://doi.org/10.1121/10.0024876
Citation count: 0
Abstract
This study investigates Whisper's automatic speech recognition (ASR) system performance across diverse native and non-native English accents. Results reveal superior recognition in American compared to British and Australian English accents with similar performance in Canadian English. Overall, native English accents demonstrate higher accuracy than non-native accents. Exploring connections between speaker traits [sex, native language (L1) typology, and second language (L2) proficiency] and word error rate uncovers notable associations. Furthermore, Whisper exhibits enhanced performance in read speech over conversational speech with modifications based on speaker gender. The implications of these findings are discussed.
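The abstract reports results in terms of word error rate (WER) but does not say how it was computed. As a point of reference, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for illustration; they are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Normalization here (lowercasing, whitespace splitting) is an assumption;
    the study's own text normalization is not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.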