Quoc Bao Nguyen, Van Hai Do, Ba Quyen Dam, Minh Hung Le
{"title":"Development of a Vietnamese speech recognition system for Viettel call center","authors":"Quoc Bao Nguyen, Van Hai Do, Ba Quyen Dam, Minh Hung Le","doi":"10.1109/ICSDA.2017.8384456","DOIUrl":null,"url":null,"abstract":"In this paper, we first present our effort to collect a 85.8 hour corpus for Vietnamese telephone conversational speech from our Viettel call center. After that, various techniques such as time delay deep neural network (TDNN) with sequence training, data augmentation are applied to build the speech recognition system. Our final system achieves a low word error rate at 17.44% for this challenging corpus. To the best of our knowledge, it is the first attempt to build Vietnamese corpus and speech recognition system for the customer service domain.","PeriodicalId":255147,"journal":{"name":"2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSDA.2017.8384456","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
In this paper, we first present our effort to collect a 85.8 hour corpus for Vietnamese telephone conversational speech from our Viettel call center. After that, various techniques such as time delay deep neural network (TDNN) with sequence training, data augmentation are applied to build the speech recognition system. Our final system achieves a low word error rate at 17.44% for this challenging corpus. To the best of our knowledge, it is the first attempt to build Vietnamese corpus and speech recognition system for the customer service domain.