语言环境分析(LENA)自动语音处理算法标签在印度家庭样本中的成人和儿童部分的验证。

IF 2.2 2区 医学 Q1 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY Journal of Speech Language and Hearing Research Pub Date : 2025-01-02 Epub Date: 2024-12-05 DOI:10.1044/2024_JSLHR-24-00099
Shoba S Meera, Divya Swaminathan, Sri Ranjani Venkata Murali, Reny Raju, Malavi Srikar, Sahana Shyam Sundar, Senthil Amudhan, Alejandrina Cristia, Rahul Pawar, Achuth Rao, Prathyusha P Vasuki, Shree Volme, Ashok Mysore
{"title":"语言环境分析(LENA)自动语音处理算法标签在印度家庭样本中的成人和儿童部分的验证。","authors":"Shoba S Meera, Divya Swaminathan, Sri Ranjani Venkata Murali, Reny Raju, Malavi Srikar, Sahana Shyam Sundar, Senthil Amudhan, Alejandrina Cristia, Rahul Pawar, Achuth Rao, Prathyusha P Vasuki, Shree Volme, Ashok Mysore","doi":"10.1044/2024_JSLHR-24-00099","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>The Language ENvironment Analysis (LENA) technology uses automated speech processing (ASP) algorithms to estimate counts such as total adult words and child vocalizations, which helps understand children's early language environment. This ASP has been validated in North American English and other languages in predominantly monolingual contexts but not in a multilingual context like India. Thus, the current study aims to validate the classification accuracy of the LENA algorithm specifically focusing on speaker recognition of adult segments (AdS) and child segments (ChS) in a sample of bi/multilingual families from India.</p><p><strong>Method: </strong>Thirty neurotypical children between 6 and 24 months (<i>M</i> = 12.89, <i>SD</i> = 4.95) were recruited. Participants were growing up in bi/multilingual environment hearing a combination of Kannada, Tamil, Malayalam, Telugu, Hindi, and/or English. Daylong audio recordings were collected using LENA and processed using the ASP to automatically detect segments across speaker categories. Two human annotators manually annotated ~900 min (37,431 segments across speaker categories). Performance accuracy (recall and precision) was calculated for AdS and ChS.</p><p><strong>Results: </strong>The recall and precision for AdS were 0.62 (95% confidence interval [CI] [0.61, 0.63]) and 0.83 (95% CI [0.8, 0.83]), respectively. This indicated that 62% of the segments identified as AdS by the human annotator were also identified as AdS by the LENA ASP algorithm and 83% of the segments labeled by the LENA ASP as AdS were also labeled by the human annotator as AdS. Similarly, the recall and precision for ChS were 0.65 (95% CI [0.64, 0.66]) and 0.55 (95% CI [0.54, 0.56]), respectively.</p><p><strong>Conclusions: </strong>This study documents the performance of the ASP in correctly classifying speakers as adult or child in a sample of families from India, indicating recall and precision that is relatively low. This study lays the groundwork for future investigations aiming to refine the algorithm models, potentially facilitating more accurate performance in bi/multilingual societies like India.</p><p><strong>Supplemental material: </strong>https://doi.org/10.23641/asha.27910710.</p>","PeriodicalId":51254,"journal":{"name":"Journal of Speech Language and Hearing Research","volume":" ","pages":"40-53"},"PeriodicalIF":2.2000,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Validation of the Language ENvironment Analysis (LENA) Automated Speech Processing Algorithm Labels for Adult and Child Segments in a Sample of Families From India.\",\"authors\":\"Shoba S Meera, Divya Swaminathan, Sri Ranjani Venkata Murali, Reny Raju, Malavi Srikar, Sahana Shyam Sundar, Senthil Amudhan, Alejandrina Cristia, Rahul Pawar, Achuth Rao, Prathyusha P Vasuki, Shree Volme, Ashok Mysore\",\"doi\":\"10.1044/2024_JSLHR-24-00099\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>The Language ENvironment Analysis (LENA) technology uses automated speech processing (ASP) algorithms to estimate counts such as total adult words and child vocalizations, which helps understand children's early language environment. This ASP has been validated in North American English and other languages in predominantly monolingual contexts but not in a multilingual context like India. Thus, the current study aims to validate the classification accuracy of the LENA algorithm specifically focusing on speaker recognition of adult segments (AdS) and child segments (ChS) in a sample of bi/multilingual families from India.</p><p><strong>Method: </strong>Thirty neurotypical children between 6 and 24 months (<i>M</i> = 12.89, <i>SD</i> = 4.95) were recruited. Participants were growing up in bi/multilingual environment hearing a combination of Kannada, Tamil, Malayalam, Telugu, Hindi, and/or English. Daylong audio recordings were collected using LENA and processed using the ASP to automatically detect segments across speaker categories. Two human annotators manually annotated ~900 min (37,431 segments across speaker categories). Performance accuracy (recall and precision) was calculated for AdS and ChS.</p><p><strong>Results: </strong>The recall and precision for AdS were 0.62 (95% confidence interval [CI] [0.61, 0.63]) and 0.83 (95% CI [0.8, 0.83]), respectively. This indicated that 62% of the segments identified as AdS by the human annotator were also identified as AdS by the LENA ASP algorithm and 83% of the segments labeled by the LENA ASP as AdS were also labeled by the human annotator as AdS. Similarly, the recall and precision for ChS were 0.65 (95% CI [0.64, 0.66]) and 0.55 (95% CI [0.54, 0.56]), respectively.</p><p><strong>Conclusions: </strong>This study documents the performance of the ASP in correctly classifying speakers as adult or child in a sample of families from India, indicating recall and precision that is relatively low. This study lays the groundwork for future investigations aiming to refine the algorithm models, potentially facilitating more accurate performance in bi/multilingual societies like India.</p><p><strong>Supplemental material: </strong>https://doi.org/10.23641/asha.27910710.</p>\",\"PeriodicalId\":51254,\"journal\":{\"name\":\"Journal of Speech Language and Hearing Research\",\"volume\":\" \",\"pages\":\"40-53\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2025-01-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Speech Language and Hearing Research\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1044/2024_JSLHR-24-00099\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/12/5 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Speech Language and Hearing Research","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1044/2024_JSLHR-24-00099","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/12/5 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

目的:语言环境分析(LENA)技术使用自动语音处理(ASP)算法来估计成人总单词和儿童发声等计数,有助于了解儿童早期的语言环境。这个ASP已经在北美英语和其他主要的单语言环境中得到了验证,但在印度这样的多语言环境中还没有得到验证。因此,本研究旨在验证LENA算法的分类准确性,特别关注印度双/多语家庭样本中成人语段(AdS)和儿童语段(ChS)的说话人识别。方法:选取6 ~ 24月龄神经正常儿童30例(M = 12.89, SD = 4.95)。参与者在双语/多语环境中长大,听到卡纳达语、泰米尔语、马拉雅拉姆语、泰卢固语、印地语和/或英语的组合。全天的录音使用LENA收集,并使用ASP进行处理,以自动检测发言者类别的片段。两名人工注释员手动注释约900分钟(37,431段跨越演讲者类别)。计算了AdS和ChS的性能准确度(召回率和精度)。结果:AdS的查全率和查准率分别为0.62(95%可信区间[CI][0.61, 0.63])和0.83 (95% CI[0.8, 0.83])。这表明,62%被人类注释者识别为AdS的片段也被LENA ASP算法识别为AdS, 83%被LENA ASP标记为AdS的片段也被人类注释者标记为AdS。同样,ChS的召回率和精度分别为0.65 (95% CI[0.64, 0.66])和0.55 (95% CI[0.54, 0.56])。结论:本研究记录了ASP在印度家庭样本中正确分类说话者为成人或儿童的表现,表明召回率和准确率相对较低。这项研究为未来的研究奠定了基础,旨在完善算法模型,潜在地促进像印度这样的双语/多语社会更准确的表现。补充资料:https://doi.org/10.23641/asha.27910710。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Validation of the Language ENvironment Analysis (LENA) Automated Speech Processing Algorithm Labels for Adult and Child Segments in a Sample of Families From India.

Purpose: The Language ENvironment Analysis (LENA) technology uses automated speech processing (ASP) algorithms to estimate counts such as total adult words and child vocalizations, which helps understand children's early language environment. This ASP has been validated in North American English and other languages in predominantly monolingual contexts but not in a multilingual context like India. Thus, the current study aims to validate the classification accuracy of the LENA algorithm specifically focusing on speaker recognition of adult segments (AdS) and child segments (ChS) in a sample of bi/multilingual families from India.

Method: Thirty neurotypical children between 6 and 24 months (M = 12.89, SD = 4.95) were recruited. Participants were growing up in bi/multilingual environment hearing a combination of Kannada, Tamil, Malayalam, Telugu, Hindi, and/or English. Daylong audio recordings were collected using LENA and processed using the ASP to automatically detect segments across speaker categories. Two human annotators manually annotated ~900 min (37,431 segments across speaker categories). Performance accuracy (recall and precision) was calculated for AdS and ChS.

Results: The recall and precision for AdS were 0.62 (95% confidence interval [CI] [0.61, 0.63]) and 0.83 (95% CI [0.8, 0.83]), respectively. This indicated that 62% of the segments identified as AdS by the human annotator were also identified as AdS by the LENA ASP algorithm and 83% of the segments labeled by the LENA ASP as AdS were also labeled by the human annotator as AdS. Similarly, the recall and precision for ChS were 0.65 (95% CI [0.64, 0.66]) and 0.55 (95% CI [0.54, 0.56]), respectively.

Conclusions: This study documents the performance of the ASP in correctly classifying speakers as adult or child in a sample of families from India, indicating recall and precision that is relatively low. This study lays the groundwork for future investigations aiming to refine the algorithm models, potentially facilitating more accurate performance in bi/multilingual societies like India.

Supplemental material: https://doi.org/10.23641/asha.27910710.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Speech Language and Hearing Research
Journal of Speech Language and Hearing Research AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY-REHABILITATION
CiteScore
4.10
自引率
19.20%
发文量
538
审稿时长
4-8 weeks
期刊介绍: Mission: JSLHR publishes peer-reviewed research and other scholarly articles on the normal and disordered processes in speech, language, hearing, and related areas such as cognition, oral-motor function, and swallowing. The journal is an international outlet for both basic research on communication processes and clinical research pertaining to screening, diagnosis, and management of communication disorders as well as the etiologies and characteristics of these disorders. JSLHR seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work. Scope: The broad field of communication sciences and disorders, including speech production and perception; anatomy and physiology of speech and voice; genetics, biomechanics, and other basic sciences pertaining to human communication; mastication and swallowing; speech disorders; voice disorders; development of speech, language, or hearing in children; normal language processes; language disorders; disorders of hearing and balance; psychoacoustics; and anatomy and physiology of hearing.
期刊最新文献
Linguistic Skills and Text Reading Comprehension in Prelingually Deaf Readers: A Systematic Review. Perception Versus Comprehension of Bound Morphemes in Children Who Are Deaf and Hard of Hearing: The Pivotal Role of Form-Meaning Mapping. Small-Dose Behavioral Treatment Effects: Learning Following 2 Hours of Computer-Based Conversational Script Training in Individuals With Poststroke Aphasia. Speech Perception in Noise and Cognitive Skills in Children With Varying Degrees of Music Training. Talker Differences in Perceived Emotion in Clear and Conversational Speech.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1