Emmanuel Malaay, Michael Simora, R. J. Cabatic, Nathaniel Oco, R. Roxas
2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA). Published 2017-11-01. DOI: 10.1109/ICSDA.2017.8384452 (https://doi.org/10.1109/ICSDA.2017.8384452)
Development of a multilingual isolated digits speech corpus
We present a multilingual speech corpus for isolated digits. As a case study, we focused on languages in the Philippines: English, Filipino, Ilocano, Cebuano, and Spanish. Our isolated digits speech corpus has a duration of almost nine hours, collected from 262 speakers. The data were annotated at the word level and will be used to train acoustic models with automatic speech recognition (ASR) toolkits. Because the corpus is intended for an ASR system, the database must be sufficient to develop one.