A scheme discriminating between synthetic speech and normal speech

Jilun Chen, Weiqiang Zhang, Jia Liu
Published in: 2016 International Conference on Audio, Language and Image Processing (ICALIP), 2016-07-01
DOI: 10.1109/ICALIP.2016.7846613 (https://doi.org/10.1109/ICALIP.2016.7846613)

Abstract

This paper develops a system that automatically distinguishes natural speech from synthetic speech, with a focus on feature selection. We consider the commonly used Mel-Frequency Cepstral Coefficient (MFCC) feature, as well as other features such as Relative Phase Shift (RPS) and pitch tuned for Automatic Speech Recognition (ASR). We find that some of these features are complementary for discriminating synthetic from natural speech. A Gaussian Mixture Model Support Vector Machine (GMM-SVM) system is used as the classifier; its feature input is modified and compared with the feature input used in speaker recognition. Experiments on a data set pairing LibriSpeech with speech from online Text-to-Speech (TTS) synthesis platforms verify the effectiveness of combining these features.
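
For readers unfamiliar with the baseline feature named in the abstract, the computation of MFCCs for a single frame can be sketched roughly as below. This is a generic textbook-style outline, not the authors' front end: the mel formula, the Hamming window, the filter and coefficient counts, and the omission of pre-emphasis, liftering and delta features are all assumptions made for illustration, and every function name is hypothetical.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """One common mel mapping (assumed here; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top_mel * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    # Fractional bin positions avoid zero-width triangles at low frequencies.
    edges = [hz * n_fft / sample_rate for hz in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_ceps=12):
    """Log mel-filterbank energies followed by a DCT-II (c0 is dropped)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for c in range(1, num_ceps + 1)]


# Usage: a synthetic 300 Hz tone, 256 samples at 16 kHz (illustrative values).
rate = 16000
tone = [math.sin(2.0 * math.pi * 300.0 * t / rate) for t in range(256)]
coeffs = mfcc_frame(tone, rate)
```

In a GMM-SVM setup of the kind the abstract mentions, per-frame vectors like `coeffs` would typically be pooled per utterance before classification; how the authors combine MFCC with RPS and pitch is not specified in the abstract, so no fusion step is shown here.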