{"title":"Syllable-onset acoustic properties associated with syllable-coda voicing","authors":"N. Nguyen, S. Hawkins","doi":"10.21437/ICSLP.1998-736","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-736","url":null,"abstract":"This study investigates durational and spectral variation in syllable-onset /l/s dependent on voicing in the coda. 1560 pairs of (C)lVC monosyllables differing in the voicing of the final stop were read by 4 British English speakers. Onset /l/ was longer before voiced than voiceless codas, and darker (for 3 speakers) as measured by F2 frequency and spectral centre of gravity. Differences due to other variables (lexical status, isolation/carrier context, syllable onset, vowel quality and regional accent) are outlined. It is proposed that coda voicing is a feature associated with the whole syllable, phonetically implemented as a variety of properties spread throughout the syllabic domain. Implications for word recognition are outlined.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131233228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keyword extraction of radio news using domain identification based on categories of an encyclopedia","authors":"Yoshimi Suzuki, Fumiyo Fukumoto, Y. Sekiguchi","doi":"10.21437/ICSLP.1998-508","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-508","url":null,"abstract":"In this paper, we propose a keyword extraction method for dictation of radio news which consists of several domains. In our method, newspaper articles which are automatically classified into suitable domains are used in order to calculate feature vectors. The feature vectors show term-domain interdependence and are used for selecting a suitable domain of each part of radio news. Keywords are extracted by using the selected domain. The results of keyword extraction experiments showed that our methods are effective for keyword extraction of radio news.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130922119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interfacing of CASA and partial recognition based on a multistream technique","authors":"F. Berthommier, H. Glotin, Emmanuel Tessier, H. Bourlard","doi":"10.21437/ICSLP.1998-317","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-317","url":null,"abstract":"Keywords: speech Reference EPFL-CONF-82470 URL: http://publications.idiap.ch/downloads/reports/1998/glotin-icslp98.pdf Record created on 2006-03-10, modified on 2017-05-10","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131245521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling vowel duration for Japanese text-to-speech synthesis","authors":"J. Venditti, J. V. Santen","doi":"10.21437/ICSLP.1998-46","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-46","url":null,"abstract":"Accurate estimation of segmental durations is crucial for natural-sounding text-to-speech (TTS) synthesis. This paper presents a model of vowel duration used in the Bell Labs Japanese TTS system, trained on Tokyo Japanese read speech. We describe the constraints on vowel devoicing, and effects of factors such as phone identity, surrounding phone identities, accentuation, syllabic structure, and phrasal position on the duration of both long and short vowels. A Sum-of-Products approach is used to model key interactions observed in the data, and to predict values of factor combinations not found in the speech database. We report root mean squared deviations between observed and predicted durations ranging from 8 to 15 ms, and an overall correlation of 0.89.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133605068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
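The Sum-of-Products family of duration models predicts a segment's duration as a sum of terms, each term a product of per-factor scale parameters. A minimal sketch of that functional form, with entirely hypothetical factor names and parameter values (not the paper's fitted model):

```python
# Hedged sketch of a sum-of-products duration model (van Santen style):
# duration = sum over terms of the product of that term's factor scales.
# All factors, levels, and parameter values below are illustrative only.

def sop_duration(factors, params, terms):
    """factors: {factor_name: level}
    params:  {(term_id, factor_name, level): scale}
    terms:   [(term_id, [factor_name, ...]), ...]
    Returns predicted duration in ms."""
    total = 0.0
    for term_id, fnames in terms:
        prod = 1.0
        for f in fnames:
            prod *= params[(term_id, f, factors[f])]
        total += prod
    return total

# Toy parameterization: a base phone duration scaled by accentuation,
# plus an additive phrase-final lengthening term.
params = {
    (0, "phone", "a"): 80.0,          # ms, hypothetical base
    (0, "accent", "accented"): 1.2,   # multiplicative scale
    (0, "accent", "unaccented"): 1.0,
    (1, "position", "final"): 15.0,   # ms, additive boost
    (1, "position", "medial"): 0.0,
}
terms = [(0, ["phone", "accent"]), (1, ["position"])]
d = sop_duration({"phone": "a", "accent": "accented", "position": "final"},
                 params, terms)
# d = 80 * 1.2 + 15 = 111.0 ms
```

Because each term factors into independent per-level scales, combinations never seen together in the training data can still be predicted, which is the property the abstract highlights.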
{"title":"Robust features for speech recognition systems","authors":"A. Bayya, B. Yegnanarayana","doi":"10.21437/ICSLP.1998-316","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-316","url":null,"abstract":"In this paper we propose a set of features based on the group delay spectrum for speech recognition systems. These features appear to be more robust to channel variations and environmental changes compared to features based on Mel-spectral coefficients. The main idea is to derive cepstrum-like features from the group delay spectrum instead of deriving them from the power spectrum. The group delay spectrum is computed from a modified autocorrelation-like function. The effectiveness of the new feature set is demonstrated by the results of both speaker-independent (SI) and speaker-dependent (SD) recognition tasks. Preliminary results indicate that using the new features, we can obtain results comparable to Mel cepstra and PLP cepstra in most cases and a slight improvement in noisy cases. More optimization of the parameters is needed to fully exploit the nature of the new features.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133693995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
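The group delay spectrum is the negative derivative of the Fourier phase, and it can be computed without phase unwrapping via the standard identity using the DFT of n*x[n]. A minimal sketch of cepstrum-like features derived from it — this uses the textbook formulation, not the paper's modified autocorrelation variant, and the frame length, FFT size, and coefficient count are arbitrary:

```python
import numpy as np

def group_delay_features(x, n_fft=512, n_ceps=13, eps=1e-8):
    """Hedged sketch: cepstrum-like features from the group delay spectrum.
    Uses tau(w) = (X_R*Y_R + X_I*Y_I) / |X(w)|^2, where Y is the DFT of
    n*x[n]; the paper's modified autocorrelation-like computation and
    parameter tuning are not reproduced here."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)        # DFT of the time-weighted signal
    tau = (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)
    # Low-order inverse-transform coefficients, analogous to cepstra
    return np.fft.irfft(tau, n_fft)[:n_ceps]

# Usage on a toy frame (a pure tone); real use would apply this per
# windowed speech frame, like Mel cepstra.
feats = group_delay_features(np.sin(0.1 * np.arange(256)))
```

The `eps` floor in the denominator guards against spikes at spectral zeros, which is the main numerical hazard of group delay representations.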
{"title":"Generating emotional speech with a concatenative synthesizer","authors":"E. Rank, Hannes Pirker","doi":"10.21437/ICSLP.1998-134","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-134","url":null,"abstract":"We describe the attempt to synthesize emotional speech with a concatenative speech synthesizer using a parameter space covering not only f0, duration and amplitude, but also voice quality parameters, spectral energy distribution, harmonics-to-noise ratio, and articulatory precision. The application of this extended parameter set offers the possibility to combine the high segmental quality of concatenative synthesis with the wider range of control settings needed for the synthesis of naturally affected speech.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133405570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Describing intonation with a parametric model","authors":"G. Möhler","doi":"10.21437/ICSLP.1998-59","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-59","url":null,"abstract":"In this study a data-based approach to intonation modeling is presented. The model incorporates knowledge from intonation theories like the expected types of F0 movements and syllable anchoring. The knowledge is integrated into the model using an appropriate approximation function for F0 parametrization. The F0 parameters that result from the parametrization are predicted from a set of features using neural nets. The quality of the generated contours is assessed by means of numerical measures and perception tests. They show that the basic hypotheses about intonation description and modeling are in principle correct and that they have the potential to be successfully applied to speech synthesis. We argue for a clear interface with a linguistic description (using pitch-accent and boundary labels as input) and discourse structure (using pitch-range normalized F0 parameters), even though current text-to-speech systems usually still do not have the capability to predict most of the appropriate information.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133107530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The selection of pronunciation variants: comparing the performance of man and machine","authors":"J. Kessens, M. Wester, C. Cucchiarini, H. Strik","doi":"10.21437/ICSLP.1998-604","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-604","url":null,"abstract":"In this paper the performance of an automatic transcription tool is evaluated. The transcription tool is a Continuous Speech Recognizer (CSR) running in forced recognition mode. For evaluation, the performance of the CSR was compared to that of nine expert listeners. Both man and machine carried out exactly the same task: deciding whether a segment was present or not in 467 cases. It turned out that the performance of the CSR is comparable to that of the experts.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131409908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting language scores from the speech perception scores of hearing-impaired children","authors":"L. Martin, J. Bench","doi":"10.21437/ICSLP.1998-786","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-786","url":null,"abstract":"The ability to understand speech in 27 hearing-impaired children was assessed using the BKB/A Picture Related Sentence test for children. The mean sentence score for the group was 72% (range 18-100%). Language scores (CELF-R) and Verbal Scale IQ (WISC-R) scores were significantly below the norm (72.8 and 89.2 respectively). Performance Scale IQ scores were slightly above the norm (106.3). Sentence scores were correlated significantly with language scores (r = 0.49). Further investigation showed that the predictability of language scores could be improved when sensation level was taken into account. Sensation level was negatively correlated with language scores (r = -0.51), demonstrating that children with better language abilities perceived speech at relatively lower intensity levels. The observed sensation levels for the group were compared with the expected levels for normally hearing children. This difference measure yielded a correlation coefficient of -0.73 with language scores.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133782634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to handle \"foreign\" sounds in Swedish text-to-speech conversion: approaching the 'xenophone' problem","authors":"R. Eklund, Anders Lindström","doi":"10.21437/ICSLP.1998-54","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-54","url":null,"abstract":"This paper discusses the problem of handling “foreign” speech sounds in Swedish speech technology systems, in particular speech synthesis. A production study is made, where it is shown that Swedish ...","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115376217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}