{"title":"Automatic detection of unnatural word-level segments in unit-selection speech synthesis","authors":"William Yang Wang, Kallirroi Georgila","doi":"10.1109/ASRU.2011.6163946","DOIUrl":null,"url":null,"abstract":"We investigate the problem of automatically detecting unnatural word-level segments in unit selection speech synthesis. We use a large set of features, namely, target and join costs, language models, prosodic cues, energy and spectrum, and Delta Term Frequency Inverse Document Frequency (TF-IDF), and we report comparative results between different feature types and their combinations. We also compare three modeling methods based on Support Vector Machines (SVMs), Random Forests, and Conditional Random Fields (CRFs). We then discuss our results and present a comprehensive error analysis.","PeriodicalId":338241,"journal":{"name":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2011.6163946","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
We investigate the problem of automatically detecting unnatural word-level segments in unit selection speech synthesis. We use a large set of features, namely, target and join costs, language models, prosodic cues, energy and spectrum, and Delta Term Frequency Inverse Document Frequency (TF-IDF), and we report comparative results between different feature types and their combinations. We also compare three modeling methods based on Support Vector Machines (SVMs), Random Forests, and Conditional Random Fields (CRFs). We then discuss our results and present a comprehensive error analysis.