Vanessa Lim, Hui Shan Ang, Estelle Lee, Boon Pang Lim
Towards an Interactive Voice Agent for Singapore Hokkien
Proceedings of the Fourth International Conference on Human Agent Interaction, published 2016-10-04
DOI: 10.1145/2974804.2980495 (https://doi.org/10.1145/2974804.2980495)
Citations: 5
Abstract
Singapore Hokkien (SH) is the most commonly spoken non-Mandarin Chinese dialect in Singapore. It is an important language for many members of Singapore's pioneer generation, but much less so for the younger generation, who prefer English. In recent years, the greying of this demographic has placed an increasing demand for assistive devices to support them. We report ongoing efforts to build limited-vocabulary speech recognition, with the eventual goal of a conversational voice agent in SH that can support applications in home-automation or in-hospital scenarios. This process is challenging because sizeable SH speech corpora do not yet exist, and SH differs enough from Mandarin and Minnan that existing corpora for those languages cannot be used directly. We document our efforts at building language resources -- audio corpora, pronunciation lexicons -- and present some preliminary findings on multilingual training.