Validation of the Language ENvironment Analysis (LENA) Automated Speech Processing Algorithm Labels for Adult and Child Segments in a Sample of Families From India.
Authors: Shoba S Meera, Divya Swaminathan, Sri Ranjani Venkata Murali, Reny Raju, Malavi Srikar, Sahana Shyam Sundar, Senthil Amudhan, Alejandrina Cristia, Rahul Pawar, Achuth Rao, Prathyusha P Vasuki, Shree Volme, Ashok Mysore
Journal: Journal of Speech Language and Hearing Research, pages 40-53 (impact factor 2.2; JCR Q1, Audiology & Speech-Language Pathology)
Published: 2025-01-02 (Epub 2024-12-05)
DOI: https://doi.org/10.1044/2024_JSLHR-24-00099
Supplemental material: https://doi.org/10.23641/asha.27910710
Purpose: The Language ENvironment Analysis (LENA) technology uses automated speech processing (ASP) algorithms to estimate counts such as total adult words and child vocalizations, which help researchers understand children's early language environment. This ASP has been validated in North American English and other languages in predominantly monolingual contexts but not in a multilingual context like India. Thus, the current study aims to validate the classification accuracy of the LENA algorithm, focusing specifically on speaker recognition of adult segments (AdS) and child segments (ChS) in a sample of bi/multilingual families from India.
Method: Thirty neurotypical children between 6 and 24 months of age (M = 12.89, SD = 4.95) were recruited. Participants were growing up in a bi/multilingual environment, hearing a combination of Kannada, Tamil, Malayalam, Telugu, Hindi, and/or English. Daylong audio recordings were collected using LENA and processed using the ASP to automatically detect segments across speaker categories. Two human annotators manually annotated ~900 min (37,431 segments across speaker categories). Performance accuracy (recall and precision) was calculated for AdS and ChS.
Results: The recall and precision for AdS were 0.62 (95% confidence interval [CI] [0.61, 0.63]) and 0.83 (95% CI [0.8, 0.83]), respectively. This indicated that 62% of the segments identified as AdS by the human annotator were also identified as AdS by the LENA ASP algorithm and 83% of the segments labeled by the LENA ASP as AdS were also labeled by the human annotator as AdS. Similarly, the recall and precision for ChS were 0.65 (95% CI [0.64, 0.66]) and 0.55 (95% CI [0.54, 0.56]), respectively.
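The way recall and precision are read in the Results can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis pipeline: the function name `precision_recall`, the catch-all label `"OTH"`, and the eight toy segments are all hypothetical and are not study data. Recall takes the human annotation as the reference (the share of human-labeled segments that the algorithm also found), while precision takes the algorithm's output as the reference (the share of algorithm-labeled segments that the human confirmed).

```python
def precision_recall(human, machine, target):
    """Return (precision, recall) for one speaker category.

    human, machine: equal-length sequences holding the human and the
    automated label for the same segments, in the same order.
    """
    if len(human) != len(machine):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(human, machine))
    # Segments that both the human and the algorithm labeled as `target`.
    true_pos = sum(1 for h, m in pairs if h == target and m == target)
    # All segments the algorithm labeled as `target` (precision denominator).
    machine_pos = sum(1 for _, m in pairs if m == target)
    # All segments the human labeled as `target` (recall denominator).
    human_pos = sum(1 for h, _ in pairs if h == target)
    precision = true_pos / machine_pos if machine_pos else float("nan")
    recall = true_pos / human_pos if human_pos else float("nan")
    return precision, recall


# Toy labels for eight segments (hypothetical, not data from the study).
# "OTH" is an assumed placeholder for any category other than AdS or ChS.
human = ["AdS", "AdS", "AdS", "ChS", "ChS", "OTH", "AdS", "ChS"]
machine = ["AdS", "OTH", "OTH", "ChS", "AdS", "ChS", "AdS", "ChS"]

for category in ("AdS", "ChS"):
    p, r = precision_recall(human, machine, category)
    print(f"{category}: precision={p:.2f} recall={r:.2f}")
```

In the toy data the human marks four segments as AdS and the algorithm recovers two of them, so AdS recall is 0.50; the algorithm marks three segments as AdS and two are confirmed, so AdS precision is about 0.67.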
Conclusions: This study documents the performance of the ASP in correctly classifying speakers as adult or child in a sample of families from India, indicating recall and precision that are relatively low. This study lays the groundwork for future investigations aiming to refine the algorithm models, potentially facilitating more accurate performance in bi/multilingual societies like India.
About the journal:
Mission: JSLHR publishes peer-reviewed research and other scholarly articles on the normal and disordered processes in speech, language, hearing, and related areas such as cognition, oral-motor function, and swallowing. The journal is an international outlet for both basic research on communication processes and clinical research pertaining to screening, diagnosis, and management of communication disorders as well as the etiologies and characteristics of these disorders. JSLHR seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work.
Scope: The broad field of communication sciences and disorders, including speech production and perception; anatomy and physiology of speech and voice; genetics, biomechanics, and other basic sciences pertaining to human communication; mastication and swallowing; speech disorders; voice disorders; development of speech, language, or hearing in children; normal language processes; language disorders; disorders of hearing and balance; psychoacoustics; and anatomy and physiology of hearing.