SPTK4: An Open-Source Software Toolkit for Speech Signal Processing

12th ISCA Speech Synthesis Workshop (SSW2023). Pub Date: 2023-08-26. DOI: 10.21437/ssw.2023-33

Takenori Yoshimura, Takato Fujimoto, Keiichiro Oura, K. Tokuda


Abstract

The Speech Signal Processing ToolKit (SPTK) is an open-source suite of speech signal processing tools that has been developed and maintained by the SPTK working group and has contributed widely to the speech signal processing community since 1998. Although SPTK has been downloaded more than a hundred thousand times, its concepts and features have not yet been widely disseminated. This paper gives an overview of SPTK, together with demonstrations, to provide a better understanding of the toolkit. To keep pace with advances in deep learning, we have recently developed a differentiable PyTorch version, diffsptk, whose details are also presented in this paper. We hope that the toolkit will help developers and researchers working in the field of speech signal processing.


