SPTK4: An Open-Source Software Toolkit for Speech Signal Processing

Takenori Yoshimura, Takato Fujimoto, Keiichiro Oura, K. Tokuda
{"title":"SPTK4: An Open-Source Software Toolkit for Speech Signal Processing","authors":"Takenori Yoshimura, Takato Fujimoto, Keiichiro Oura, K. Tokuda","doi":"10.21437/ssw.2023-33","DOIUrl":null,"url":null,"abstract":"The Speech Signal Processing ToolKit (SPTK) is an open-source suite of speech signal processing tools, which has been developed and maintained by the SPTK working group and has widely contributed to the speech signal processing community since 1998. Although SPTK has reached over a hundred thousand downloads, the concepts as well as the features have not yet been widely disseminated. This paper gives an overview of SPTK and demonstrations to provide a better understanding of the toolkit. We have recently developed its differentiable Py-Torch version, diffsptk , to adapt to advancements in the deep learning field. The details of diffsptk are also presented in this paper. We hope that the toolkit will help developers and researchers working in the field of speech signal processing.","PeriodicalId":346639,"journal":{"name":"12th ISCA Speech Synthesis Workshop (SSW2023)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"12th ISCA Speech Synthesis Workshop (SSW2023)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ssw.2023-33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The Speech Signal Processing ToolKit (SPTK) is an open-source suite of speech signal processing tools, which has been developed and maintained by the SPTK working group and has widely contributed to the speech signal processing community since 1998. Although SPTK has reached over a hundred thousand downloads, the concepts as well as the features have not yet been widely disseminated. This paper gives an overview of SPTK and demonstrations to provide a better understanding of the toolkit. We have recently developed its differentiable Py-Torch version, diffsptk , to adapt to advancements in the deep learning field. The details of diffsptk are also presented in this paper. We hope that the toolkit will help developers and researchers working in the field of speech signal processing.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SPTK4:一个用于语音信号处理的开源软件工具包
语音信号处理工具包(SPTK)是一个开源的语音信号处理工具套件,自1998年以来一直由SPTK工作组开发和维护,并为语音信号处理社区做出了广泛贡献。尽管SPTK的下载量已经超过10万次,但其概念和特性尚未得到广泛传播。本文给出了SPTK的概述和演示,以便更好地理解该工具包。我们最近开发了其可微分的Py-Torch版本diffsptk,以适应深度学习领域的进步。本文还详细介绍了diffsptk。我们希望该工具包能够帮助语音信号处理领域的开发人员和研究人员。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Re-examining the quality dimensions of synthetic speech Synthesising turn-taking cues using natural conversational data Diffusion Transformer for Adaptive Text-to-Speech Adaptive Duration Modification of Speech using Masked Convolutional Networks and Open-Loop Time Warping Audiobook synthesis with long-form neural text-to-speech
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1