
Laboratory Phonology: Latest Articles

Shared Representations Underlie Metaphonological Judgments and Speech Motor Control
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-10-25 | DOI: 10.5334/LABPHON.52
Sam Tilsen, A. Cohn
Researchers often use metalinguistic judgments to investigate phonological representations. The representations are assumed to govern speech motor control and thereby shape articulatory and acoustic characteristics of speech. Yet little is known about the relationship between metalinguistic judgments, phonological representations, and motor control. This paper reports on an experiment that directly investigates the relation between metalinguistic judgments and articulatory control, hypothesizing that the two share a common representation. This hypothesis predicts that differences in judgments should be correlated with differences in the acoustic characteristics of responses. An experiment was conducted in which syllable count judgments and productions of words with tense vowel/diphthong nuclei and liquid codas were obtained from native speakers of English. A subset of these words has previously been shown to exhibit variation in syllable count judgments. Acoustic analyses of productions showed that rime durations and formant trajectories differed between words associated with monosyllabic vs. disyllabic syllable count judgments. These results support the hypothesis that a common representation is utilized by the processes responsible for metaphonological judgments of syllable count and speech motor control.
Citations: 10
The VOT Category Boundary in Word-Initial Stops: Counter-Evidence Against Rate Normalization in English Spontaneous Speech
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-10-07 | DOI: 10.5334/LABPHON.49
Satsuki Nakai, J. Scobbie
Some languages, such as many varieties of English, use short-lag and long-lag VOT to distinguish word- and syllable-initial voiced vs. voiceless stop phonemes. According to a popular view, the optimal VOT category boundary between the two types of stops moves towards larger values as articulation rate becomes slower (and speech segments longer), and listeners accordingly shift the perceptual VOT category boundary. According to an alternative view, listeners do not shift the VOT category boundary with a change in articulation rate, because the same category boundary remains optimal across different rates of articulation in normal speech, although a shift in the optimal boundary location can be induced in the laboratory by instructing speakers to use artificially extreme articulation rates. In this study we compared the effectiveness of rate-independent VOT category boundaries applied to word-initial stop phonemes in spontaneous English speech, against the effectiveness of Miller et al.’s (1986) rate-dependent VOT category boundary applied to laboratory speech. The effectiveness of the two types of category boundaries was comparable when spontaneous speech data were controlled for factors other than articulation rate. Our results suggest that perceptual VOT category boundaries need not shift with a change in articulation rate under normal circumstances.
Citations: 16
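The Nakai and Scobbie abstract above speaks of the "effectiveness" of a rate-independent VOT category boundary. As a minimal sketch of what such a measure might look like, the Python below scores a single fixed boundary by the share of stop tokens it classifies correctly. The function name, the token values, the 30 ms boundary, and the use of plain classification accuracy are all illustrative assumptions; the abstract does not state how the authors quantified effectiveness.

```python
from typing import List, Tuple


def boundary_accuracy(tokens: List[Tuple[float, str]], boundary_ms: float) -> float:
    """Share of tokens classified correctly by one fixed VOT boundary.

    A token counts as "voiceless" when its VOT is at or above the boundary,
    and as "voiced" otherwise. The boundary does not depend on speech rate.
    """
    if not tokens:
        raise ValueError("no tokens to classify")
    correct = 0
    for vot_ms, label in tokens:
        predicted = "voiceless" if vot_ms >= boundary_ms else "voiced"
        if predicted == label:
            correct += 1
    return correct / len(tokens)


# Hypothetical tokens (VOT in ms, intended category); not data from the paper.
tokens = [
    (8.0, "voiced"), (15.0, "voiced"), (32.0, "voiced"),
    (28.0, "voiceless"), (55.0, "voiceless"), (70.0, "voiceless"),
]
accuracy = boundary_accuracy(tokens, boundary_ms=30.0)
```

With these made-up tokens, four of the six fall on the expected side of the 30 ms boundary. A rate-dependent alternative would replace the constant `boundary_ms` with a value computed per token from some measure of articulation rate.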
Iambic-Trochaic Law Effects among Native Speakers of Spanish and English
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-10-07 | DOI: 10.5334/LABPHON.42
Megan J. Crowhurst
The Iambic-Trochaic Law (Bolton, 1894; Hayes, 1995; Woodrow, 1909) asserts that listeners associate greater intensity with group beginnings (a loud-first preference) and greater duration with group endings (a long-last preference). Hayes (1987; 1995) posits a natural connection between the prominences referred to in the ITL and the locations of stressed syllables in feet. However, not all lengthening in final positions originates with stressed syllables, and greater duration may also be associated with stress in nonfinal (trochaic) positions. The research described here challenged the notion that presumptive long-last effects necessarily reflect stress-related duration patterns, and investigated the general hypothesis that the robustness of long-last effects should vary depending on the strength of the association between final positions and increased duration, whatever its source. Two ITL studies were conducted in which native speakers of Spanish and of English grouped streams of rhythmically alternating syllables in which vowel intensity and/or duration levels were varied. These languages were chosen because while they are prosodically similar, increased duration on constituent-final syllables is both more common and more salient in English than Spanish. Outcomes revealed robust loud-first effects in both language groups. Long-last effects were significantly weaker in the Spanish group when vowel duration was varied singly. However, long-last effects were present and comparable in both language groups when intensity and duration were covaried. Intensity was a more robust predictor of responses than duration. A primary conclusion was that whether or not humans’ rhythmic grouping preferences have an innate component, duration-based grouping preferences, at least, and the magnitude of intensity-based effects are shaped by listeners’ backgrounds.
Citations: 12
Analytical Decisions in Intonation Research and the Role of Representations: Lessons from Romani
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-06-30 | DOI: 10.5334/LABPHON.14
A. Arvaniti
This paper presents an analysis of the intonational system of Greek Thrace Romani. The analysis serves to highlight the difficulties that spontaneous fieldwork data pose for traditional methods of intonational research largely developed for use with controlled speech elicited in the laboratory or under laboratory-like conditions from educated speakers of standardized languages. It leads to proposing a set of principles and procedures which can help deal with the variability inherent in spontaneous data; these principles and procedures apply particularly to data from less homogeneous speech communities but are relevant for the intonation analysis of any linguistic system. This approach relies on the understanding that autosegmental-metrical representations of intonation are phonological representations, not means of faithfully depicting pitch contours per se. It follows that representations should capture what is contrastive in the intonational system under analysis. In turn, this entails that new categories are posited, taking the meaning of tonal events into account and after due consideration of all legitimate sources of phonetic variation. It is argued that following this procedure allows for more robust analyses and is particularly advantageous when data are highly variable. This view is discussed in light of the analysis of Greek Thrace Romani, and in combination with recent proposals for greater uniformity and phonetic transparency in intonational representations, traits which are said to lead to greater insights in typological and cross-varietal research. It is shown that these goals are not better served by a level of broad phonetic transcription which encodes an arbitrary selection of phonetic variants.
Citations: 26
Surface and Structure: Transcribing Intonation within and across Languages
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-06-30 | DOI: 10.5334/LABPHON.10
Sónia Frota
Intonation is the phonologically structured variation in phonetic features, primarily pitch, to express phrase-level meanings. As in other speech sound domains, analyzing intonation involves mapping continuously variable physical parameters to categories. The categories of intonation are organized in a set of relations and rule-governed distributions that define the intonation system of a language. From physical realizations, as shown by pitch tracks, surface or phonetic tonal patterns can be identified in terms of tonal targets. Whether surface patterns correspond or not to categories within a given intonation system requires looking at their distributions and contrastiveness. In this paper, I assume the view that a transcription is an analysis of the intonation system, which ultimately aims to identify the contrastive intonation categories of a given language and establish how they signal meaning. Under this view, it is crucial to discuss the ways surface pitch patterns and structural pitch patterns (or phonological categories) are related. Given that intonational analysis is driven by system-internal considerations and that cues to a given category can vary across languages, it is also important to address the issue of how a language-specific transcription can be reconciled with the need and ability to do cross-language comparison of intonation. Bearing on these two issues, I discuss surface and structure in intonational analysis, drawing on mismatches between (dis)similarities in the phonetics and phonology of pitch contours, across languages and language varieties.
Citations: 14
New Methods for Prosodic Transcription: Capturing Variability as a Source of Information
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-06-30 | DOI: 10.5334/LABPHON.29
J. Cole, S. Shattuck-Hufnagel
Understanding the role of prosody in encoding linguistic meaning and in shaping phonetic form requires the analysis of prosodically annotated speech drawn from a wide variety of speech materials. Yet obtaining accurate and reliable prosodic annotations for even small datasets is challenging due to the time and expertise required. We discuss several factors that make prosodic annotation difficult and impact its reliability, all of which relate to variability: in the patterning of prosodic elements (features and structures) as they relate to the linguistic and discourse context, in the acoustic cues for those prosodic elements, and in the parameter values of the cues. We propose two novel methods for prosodic transcription that capture variability as a source of information relevant to the linguistic analysis of prosody. The first is Rapid Prosody Transcription (RPT), which can be performed by non-experts using a simple set of unary labels to mark prominence and boundaries based on immediate auditory impression. Inter-transcriber variability is used to calculate continuous-valued prosody ‘scores’ that are assigned to each word and represent the perceptual salience of its prosodic features or structure. RPT can be used to model the relative influence of top-down factors and acoustic cues in prosody perception, and to model prosodic variation across many dimensions, including language variety, speech style, or speaker’s affect. The second proposed method is the identification of individual cues to the contrastive prosodic elements of an utterance. Cue specification provides a link between the contrastive symbolic categories of prosodic structures and the continuous-valued parameters in the acoustic signal, and offers a framework for investigating how factors related to the grammatical and situational context influence the phonetic form of spoken words and phrases.
While cue specification as a transcription tool has not yet been explored as RPT has, it has the potential to provide a level of detail that will be useful in modelling systematic context-governed variation in the implementation of prosodic categories, with applications in automatic speech synthesis and recognition, as well as modelling human speech production and perception. We discuss how RPT and cue specification, particularly when combined, can improve the efficiency and reliability of prosodic transcription and how they can be integrated with expert phonological transcription.
Citations: 45
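The Cole and Shattuck-Hufnagel abstract above describes Rapid Prosody Transcription as turning inter-transcriber variability into a continuous-valued score for each word. A minimal sketch of one plausible reading follows: the score of a word is taken to be the proportion of transcribers who marked it. That formula, the function name, and the sample labels are assumptions made for illustration; the abstract itself does not give the calculation.

```python
from typing import Dict, List


def prosody_scores(marks: Dict[str, List[int]]) -> List[float]:
    """Per-word proportion of transcribers who marked the word (0.0 to 1.0).

    `marks` maps a transcriber ID to a list of unary labels, one per word:
    1 if the transcriber marked the word (e.g. as prominent), 0 otherwise.
    """
    rows = list(marks.values())
    if not rows:
        return []
    n_words = len(rows[0])
    if any(len(row) != n_words for row in rows):
        raise ValueError("all transcribers must label the same word sequence")
    return [sum(row[i] for row in rows) / len(rows) for i in range(n_words)]


# Hypothetical labels from four transcribers for a five-word utterance.
marks = {
    "t1": [0, 1, 0, 0, 1],
    "t2": [0, 1, 0, 1, 1],
    "t3": [0, 1, 0, 0, 0],
    "t4": [0, 0, 0, 0, 1],
}
scores = prosody_scores(marks)
```

Under this assumption a word that every transcriber marks scores 1.0 and a word that nobody marks scores 0.0, so disagreement among non-expert listeners becomes graded information rather than noise. The same function could be applied separately to prominence labels and to boundary labels.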
The Importance of a Distributional Approach to Categoriality in Autosegmental-Metrical Accounts of Intonation
IF 1.5 | Zone 2 (Literature) | Q1 Health Professions | Pub Date: 2016-06-30 | DOI: 10.5334/LABPHON.28
F. Cangemi, M. Grice
When annotating a speech signal using an autosegmental-metrical model of intonation, transcribers associate portions of the F0 contour with labels from a finite inventory of tonal categories. In the models we are concerned with here, these categories have the status of phonological units (phonological form), bridging the intrinsic variability of the speech signal (substance) with the intrinsic fuzziness of post-lexical function (meaning). This, together with the relatively small size of the label inventory, precludes a one-to-one relationship between form and substance, and/or between form and function. A Neapolitan Italian corpus of read speech is used to investigate the distributional properties of two pitch accents that have been studied extensively with respect to substance (the alignment of F0 peaks) and meaning (sentence modality). Although there is a general consensus that peaks in this variety are aligned earlier in declaratives than in interrogatives, evidence is provided of contexts in which the converse is true, i.e., in which interrogative peaks are even earlier than their declarative counterparts. In this respect, interrogatives have a richer internal structure than declaratives. We argue that differences in how variably a prosodic category is encoded can be dealt with in an intonation transcription system, as long as this system relates phonological form (the choice of pitch accent in this case) both to phonetic substance and to meaning in a transparent way.
Citations: 31
Introducing Advancing Prosodic Transcription
IF 1.5 Zone 2 Literature Q1 Health Professions Pub Date: 2016-06-30 DOI: 10.5334/LABPHON.32
Mariapaola D’Imperio, F. Cangemi, M. Grice
Citations: 3
Towards an International Prosodic Alphabet (IPrA)
IF 1.5 Zone 2 Literature Q1 Health Professions Pub Date: 2016-06-30 DOI: 10.5334/LABPHON.11
J. Hualde, P. Prieto
In this article we present a set of arguments in favor of having access to two levels of prosodic representation, broad phonetic and phonological, and the motivations for developing a set of cross-linguistically transparent and consistent labels (e.g., an International Prosodic Alphabet, IPrA) based on the Autosegmental-Metrical (AM) framework and the ToBI notation. Regarding segmental phonology, as well as lexical suprasegmentals (lexical tone and stress), both the use of two levels of representation and the existence of an international phonetic alphabet have proved to be very useful. The same benefits of adopting these conventions are likely to accrue in the study of intonation.
Citations: 51
Analysis of Intonation: the Case of MAE_ToBI
IF 1.5 Zone 2 Literature Q1 Health Professions Pub Date: 2016-06-30 DOI: 10.5334/labphon.30
C. Gussenhoven
Annotation systems for intonation contours are ideally based on a well-motivated phonological analysis of the language in question, such that instances of indecision are restricted to uncertainties over what intonational structure the speaker has used, rather than over the choice of label in situations where no suitably distinctive label is available or more than one suitable label is available. This contribution inventorizes a number of cases of overanalysis and underanalysis in MAE_ToBI and argues that they are in large part due to the decision by Pierrehumbert (1980) to analyze a rising-falling accent as a rising pitch accent (L+H*) followed by a L-tone from a different source (an ‘on-ramp’ analysis). It is shown how the opposite choice, a falling pitch accent preceded by a L-tone from a different source (an ‘off-ramp’ analysis), avoids most of these problems. Results from a perception experiment testing MAE_ToBI’s prediction of intonational boundaries show that steep falls do not always signal a boundary. The inclusion of a tritonal prenuclear pitch accent, which explains the absence of an intonational boundary after a steep fall followed by a gradual rise, can readily be accommodated in the ‘off-ramp’ analysis, but not in MAE_ToBI.
Citations: 19