Y. Liao, Y. Chang, Sing-Yue Wang, Jhih-wei Chen, Sheng-Ming Wang, Jenq-Haur Wang
2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA), published 2017-11-01. DOI: https://doi.org/10.1109/ICSDA.2017.8384450
A progress report of the Taiwan Mandarin radio speech corpus project
The Taiwan Mandarin Radio Speech Corpus contains 300 hours (and growing) of high-quality recordings selected from the archive of Taiwan's National Education Radio (NER). The corpus covers a variety of speaking styles from hundreds of speakers, together with the corresponding transcriptions (automatically generated, then manually corrected) and annotations, making it suitable for speech and language research. In this paper, we report progress on the corpus development and present experimental results on audio event detection/segmentation and semi-supervised acoustic model training using this corpus.