Alejandro Chacón-Vargas, Daniel Pérez-Conejo, Marvin Coto-Jiménez
Tecnologia en Marcha, published 2022-11-16. DOI: https://doi.org/10.18845/tm.v35i8.6443
Assessing the effectiveness of diarization algorithms in Costa Rican children-adult speech according to age group and gender
Speaker diarization is the task of automatically identifying the speakers in an audio recording and detecting when each one speaks. Several algorithms have improved performance on this task in recent years. However, diarization remains challenging in interaction scenarios, such as those between a child and an adult, where interruptions, fillers, laughter, and other elements may affect the detection and clustering of segments.
In this work, we perform an exploratory study of two diarization algorithms applied to child-adult interactions recorded in a studio, and we assess the effectiveness of the algorithms across age groups and genders. All participants are native speakers of Costa Rican Spanish. The children are between 3 and 14 years old, and the interactions combine guided repetition of words or short phrases with natural speech.
The results demonstrate how age affects diarization performance, in both cluster purity and speaker purity, in a direct but non-linear fashion.
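To make the two metrics concrete, the following is a minimal sketch of one common frame-level way to compute cluster purity and speaker purity. The abstract does not state the exact definitions used in the study, so the formulas, the function names (`purity`, `cluster_purity`, `speaker_purity`), and the toy labels below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict


def purity(groups, labels):
    """Frame-weighted purity of `groups` with respect to `labels`.

    For every group, count the frames carrying its most frequent label,
    then divide the sum of those counts by the total number of frames.
    """
    if len(groups) != len(labels):
        raise ValueError("both sequences must have one entry per frame")
    if not groups:
        raise ValueError("at least one frame is required")

    counts = defaultdict(Counter)
    for group, label in zip(groups, labels):
        counts[group][label] += 1

    dominant_frames = sum(max(c.values()) for c in counts.values())
    return dominant_frames / len(groups)


def cluster_purity(reference, hypothesis):
    """How exclusively each hypothesized cluster contains one true speaker."""
    return purity(hypothesis, reference)


def speaker_purity(reference, hypothesis):
    """How exclusively each true speaker is assigned to one cluster."""
    return purity(reference, hypothesis)


# Hypothetical toy data: one label per fixed-length frame.
reference = ["adult"] * 4 + ["child"] * 4
hypothesis = ["A", "A", "A", "B", "C", "C", "B", "B"]

print(cluster_purity(reference, hypothesis))  # 0.875
print(speaker_purity(reference, hypothesis))  # 0.625
```

In this toy case the child's speech is split across two clusters, which lowers speaker purity more than cluster purity. Under the assumed definitions, the two scores capture complementary errors: over-splitting a speaker hurts speaker purity, while merging two speakers into one cluster hurts cluster purity.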