An undergraduate Mandarin speech database for speaker recognition research

2009 Oriental COCOSDA International Conference on Speech Database and Assessments. Pub Date: 2009-10-02. DOI: 10.1109/ICSDA.2009.5278370

Hong Wang, Jingui Pan


Citations: 3

Abstract

This paper describes the development of UMSD (undergraduate Mandarin speech database), a new speech database for speaker recognition research. UMSD contains 12 sessions of utterances from each of 24 selected undergraduate students, and the recordings were made at varying intervals between sessions. The phonetically balanced corpus includes isolated digits (0 to 9), digit strings (5 phone numbers and 2 postal codes), words and phrases ranging in length from 1 to 10 characters (10 items for each length), the Chinese Phonetic Alphabet Table (21 Initials and 35 Finals), 2 ancient poems, and a 200-word paragraph extracted from a well-known essay. In addition, to extract and process speech segments of interest from UMSD effectively, a speech database management system based on MATLAB and MS-ACCESS is proposed. Preliminary evaluation results show that good performance is attained with UMSD. The database not only meets the needs of our own recent work on text-dependent and text-independent speaker recognition, but also supports further research on long-term intra-speaker variability, thanks to its multi-session recordings with different session intervals.
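The corpus composition given in the abstract can be tallied with a short sketch. The counts come from the abstract itself; the assumption that every listed item is a single prompt recorded once per session is ours and is not stated in the source, so the totals are only an estimate of the database size. All names below are hypothetical.

```python
# Prompt inventory per session, using the counts listed in the abstract.
# Assumption (not from the source): each item is one prompt, recorded
# exactly once in every session by every speaker.
PROMPTS_PER_SESSION = {
    "isolated_digits": 10,         # digits 0 to 9
    "digit_strings": 5 + 2,        # 5 phone numbers + 2 postal codes
    "words_and_phrases": 10 * 10,  # lengths 1..10 characters, 10 items each
    "phonetic_alphabet": 21 + 35,  # 21 Initials + 35 Finals
    "ancient_poems": 2,
    "essay_paragraph": 1,
}

SPEAKERS = 24               # selected undergraduate students
SESSIONS_PER_SPEAKER = 12   # sessions recorded per student


def corpus_size(prompts=PROMPTS_PER_SESSION,
                speakers=SPEAKERS,
                sessions=SESSIONS_PER_SPEAKER):
    """Return estimated prompt and session totals under the stated assumption."""
    per_session = sum(prompts.values())
    total_sessions = speakers * sessions
    return {
        "prompts_per_session": per_session,
        "sessions_total": total_sessions,
        "prompt_recordings_total": per_session * total_sessions,
    }


if __name__ == "__main__":
    for name, value in corpus_size().items():
        print(f"{name}: {value}")
```

Under this assumption the sketch yields 176 prompts per session across 288 sessions. The actual number of audio files in UMSD may differ if items were grouped or repeated during recording.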


