{"title":"GMM-Based Maghreb Dialect IdentificationSystem","authors":"Lachachi Nour-Eddine, Adla Abdelkader","doi":"10.3745/JIPS.02.0015","DOIUrl":null,"url":null,"abstract":"While Modern Standard Arabic is the formal spoken and written language of the Arab world; dialects are the major communication mode for everyday life. Therefore, identifying a speaker`s dialect is critical in the Arabic-speaking world for speech processing tasks, such as automatic speech recognition or identification. In this paper, we examine two approaches that reduce the Universal Background Model (UBM) in the automatic dialect identification system across the five following Arabic Maghreb dialects: Moroccan, Tunisian, and 3 dialects of the western (Oranian), central (Algiersian), and eastern (Constantinian) regions of Algeria. We applied our approaches to the Maghreb dialect detection domain that contains a collection of 10-second utterances and we compared the performance precision gained against the dialect samples from a baseline GMM-UBM system and the ones from our own improved GMM-UBM system that uses a Reduced UBM algorithm. Our experiments show that our approaches significantly improve identification performance over purely acoustic features with an identification rate of 80.49%.","PeriodicalId":46825,"journal":{"name":"Journal of Information Processing Systems","volume":"1 1","pages":"22-38"},"PeriodicalIF":0.8000,"publicationDate":"2015-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Information Processing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3745/JIPS.02.0015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 9
Abstract
While Modern Standard Arabic is the formal spoken and written language of the Arab world; dialects are the major communication mode for everyday life. Therefore, identifying a speaker`s dialect is critical in the Arabic-speaking world for speech processing tasks, such as automatic speech recognition or identification. In this paper, we examine two approaches that reduce the Universal Background Model (UBM) in the automatic dialect identification system across the five following Arabic Maghreb dialects: Moroccan, Tunisian, and 3 dialects of the western (Oranian), central (Algiersian), and eastern (Constantinian) regions of Algeria. We applied our approaches to the Maghreb dialect detection domain that contains a collection of 10-second utterances and we compared the performance precision gained against the dialect samples from a baseline GMM-UBM system and the ones from our own improved GMM-UBM system that uses a Reduced UBM algorithm. Our experiments show that our approaches significantly improve identification performance over purely acoustic features with an identification rate of 80.49%.