Experiments using data augmentation for speaker adaptation

1995 International Conference on Acoustics, Speech, and Signal Processing. Pub Date: 1995-05-09. DOI: 10.1109/ICASSP.1995.479788

J. Bellegarda, P. D. Souza, D. Nahamoo, M. Padmanabhan, M. Picheny, L. Bahl


Citations: 11

Abstract

Speaker adaptation typically involves customizing some existing (reference) models in order to account for the characteristics of a new speaker. This work considers the slightly different paradigm of customizing some reference data for the purpose of populating the new speaker's space, and then using the resulting (augmented) data to derive the customized models. The data augmentation technique is based on the metamorphic algorithm first proposed in Bellegarda et al. [1992], assuming that a relatively modest amount of data (100 sentences) is available from each new speaker. This constraint requires that reference speakers be selected with some care. The performance of this method is illustrated on a portion of the Wall Street Journal task.
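The data-side paradigm described in the abstract (map reference data into the new speaker's space, then estimate models from the pooled data) can be illustrated with a rough sketch. This is not the metamorphic algorithm of Bellegarda et al. [1992], whose details are not given here. As a stand-in, it uses a simple per-dimension mean and variance matching transform, and the function names, toy feature vectors, and the Gaussian "model" are all assumptions made for illustration only.

```python
from statistics import fmean, pstdev


def column_stats(frames):
    """Per-dimension mean and population standard deviation of feature vectors."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


def map_to_new_speaker(reference_frames, new_frames, eps=1e-8):
    """Shift and scale reference frames so their per-dimension statistics
    match those of the new speaker (a simple stand-in transform, NOT the
    metamorphic algorithm)."""
    ref_mean, ref_std = column_stats(reference_frames)
    new_mean, new_std = column_stats(new_frames)
    mapped = []
    for frame in reference_frames:
        mapped.append([
            (x - rm) / max(rs, eps) * ns + nm
            for x, rm, rs, ns, nm in zip(frame, ref_mean, ref_std, new_std, new_mean)
        ])
    return mapped


def augment(reference_frames, new_frames):
    """Pool the new speaker's own frames with the mapped reference frames."""
    return list(new_frames) + map_to_new_speaker(reference_frames, new_frames)


# Hypothetical toy data: 2-dimensional feature vectors.
reference = [[0.0, 10.0], [2.0, 14.0], [4.0, 18.0]]   # plentiful reference data
new_speaker = [[1.0, 0.0], [3.0, 2.0]]                # scarce new-speaker data

augmented = augment(reference, new_speaker)

# Derive a (trivial) customized model from the augmented data:
# here, just a per-dimension Gaussian mean and standard deviation.
model_mean, model_std = column_stats(augmented)
print(len(augmented), model_mean, model_std)
```

In this sketch the new speaker's small sample only fixes the target statistics, while the reference data supplies the bulk of the training frames. That dependence on the reference data is one way to read the abstract's remark that reference speakers must be selected with some care.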
