Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-158
Heini Kallio, Rosa Suviranta, M. Kuronen, Anna von Zansen
While utterance fluency measures are often studied in relation to perceived L2 fluency and proficiency, the effect of creaky voice has remained largely unexamined. However, creaky voice is frequent in a number of languages, including Finnish, where it serves as a cue to phrase boundaries and turn-taking. In this study we investigate the roles of creaky voice and utterance fluency measures in predicting fluency and proficiency ratings of spontaneous L2 Finnish (F2) speech. To this end, 16 expert raters assessed narrative spontaneous speech samples from 160 learners of Finnish. The effect of creaky voice and utterance fluency measures on proficiency and fluency ratings was studied using linear regression models. The results indicate that creaky voice can contribute to both oral proficiency and fluency alongside utterance fluency measures. Furthermore, the average duration of composite breaks, a measure combining breakdown and repair phenomena, proved to be the most significant predictor of fluency. Based on these findings we recommend further investigation of the effect of creaky voice on the assessment of L2 speech, as well as reconsideration of the utterance fluency measures used in predicting L2 fluency or proficiency.
Title: Creaky voice and utterance fluency measures in predicting fluency and oral proficiency of spontaneous L2 Finnish (Speech Prosody 2022)
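The modelling step described in this abstract (linear regression of ratings on creak and utterance fluency measures) can be sketched with ordinary least squares. Everything below is invented for illustration: the predictor names, coefficients, and data are not the study's variables or values, only the general shape of such a model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 160  # the study assessed speech samples from 160 learners

# Hypothetical predictors (names and scales are assumptions)
articulation_rate = rng.normal(4.0, 0.8, n)    # syllables per second
composite_break_dur = rng.normal(0.9, 0.3, n)  # seconds, breakdown + repair
creak_proportion = rng.uniform(0.0, 0.4, n)    # fraction of voiced speech

# Simulated ratings: faster speech and shorter breaks -> higher fluency
fluency = (2.0 + 0.5 * articulation_rate - 1.2 * composite_break_dur
           + 0.8 * creak_proportion + rng.normal(0, 0.3, n))

# Ordinary least squares with an intercept column
X = np.column_stack([np.ones(n), articulation_rate,
                     composite_break_dur, creak_proportion])
coef, *_ = np.linalg.lstsq(X, fluency, rcond=None)
print(dict(zip(["intercept", "art_rate", "break_dur", "creak"],
               coef.round(2))))
```

With enough data, the fitted coefficients recover the simulated effects: a negative weight on break duration and positive weights on rate and creak.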
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-70
Zhiqiang Zhu, P. Mok
Whether speech rate can transfer between languages with distinctive speech rates is an understudied issue. Impressionistically, Japanese is faster than Mandarin Chinese. We investigated the speech rates of native Japanese and Mandarin speakers, advanced L2 learners, and simultaneous bilinguals. Nine native Beijing Mandarin speakers, five native Japanese speakers, thirteen advanced L1 Japanese learners of Mandarin, and eleven Japanese-Mandarin simultaneous bilinguals participated in a passage reading task and a spontaneous speech task. The comparison of the two languages by native Mandarin speakers and native Japanese speakers confirmed that the speech rate of native Japanese is faster than that of Mandarin. Comparison of the Japanese and Mandarin speech rates of advanced Japanese learners and simultaneous bilinguals showed that both groups consistently produced Japanese faster than their Mandarin. Both advanced Japanese learners and simultaneous bilinguals produced Japanese at rates similar to those of native Japanese speakers. However, the Mandarin speech rate of advanced Japanese learners was significantly slower than that of native Mandarin speakers, while the difference between simultaneous bilinguals and native Mandarin speakers was non-significant. The findings challenge previous proposals that speech rate transfer could happen at the language level. Moreover, simultaneous bilinguals showed an advantage over advanced L2 learners in speech rate mastery.
Title: Can speech rate transfer between languages? Evidence from Japanese and Mandarin Chinese (Speech Prosody 2022)
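Speech rate in studies like this is commonly operationalized as articulation rate: syllables per second of phonation time, with silent pauses excluded. Whether the paper used exactly this measure is an assumption; the sketch below only illustrates the standard computation, with invented numbers.

```python
def articulation_rate(n_syllables, total_dur_s, pause_dur_s):
    """Syllables per second of phonation time (silent pauses excluded)."""
    phonation_time = total_dur_s - pause_dur_s
    if phonation_time <= 0:
        raise ValueError("pause time must be shorter than total duration")
    return n_syllables / phonation_time

# Invented example: a 20 s stretch with 5 s of pauses and 105 syllables
# gives 7.0 syllables/s, in the impressionistically fast Japanese range.
print(articulation_rate(105, 20.0, 5.0))  # 7.0
```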
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-48
Zhenyang Xi, Yan Gu, G. Vigliocco
Past research has shown that children under three years of age often use gesture to supplement speech rather than as a system integrated with speech, and that the relationship between speech and gesture may relate to vocabulary development. However, this relationship is unknown in 3-4-year-old children, a period in which we can capture key developmental changes from using gestures alone to using them along with speech. Using a new corpus of semi-naturalistic interaction between caregivers and their 3-4-year-old children (the ECOLANG corpus), this study investigates (1) the effect of age on children’s speaking and gesture rates, (2) the relationship between speaking and gesture rates, and (3) their correlation with word learning. Specifically, we studied the speaking and gesture rates of 32 English-speaking children while they talked with their caregivers about sets of pre-selected toys. The children completed a vocabulary test at the time of the experiment and one year later. Results show no effect of age on speaking and gesture rates in this age range, but children with a faster speaking rate also had a higher gesture rate. Additionally, neither speaking rate nor gesture rate correlated with word learning. Thus, our findings show that by this age, children use gestures that are integrated with speech, and the speech-gesture relationship is no longer a predictor of vocabulary learning. We speculate that this transition is mainly a result of enhanced conceptual representation ability.
Title: Speaking Rate in 3-4-Year-Old Children: Its Correlation with Gesture Rate and Word Learning (Speech Prosody 2022)
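The reported association between speaking rate and gesture rate is a simple correlation over per-child rates. The sketch below is hypothetical: the sample values are simulated, not the study's annotated data, and serve only to show the computation.

```python
import numpy as np

rng = np.random.default_rng(7)
n_children = 32  # matching the study's sample size

# Invented per-child rates; units are illustrative only
speaking_rate = rng.normal(3.0, 0.5, n_children)
# Simulate the reported positive association: faster talkers gesture more
gesture_rate = 0.8 * speaking_rate + rng.normal(0, 0.3, n_children)

# Pearson correlation between the two per-child rates
r = np.corrcoef(speaking_rate, gesture_rate)[0, 1]
print(round(r, 2))  # positive, mirroring the paper's finding
```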
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-15
Marita K Everhardt, A. Sarampalis, M. Coler, D. Başkent, W. Lowie
This study assesses how a cochlear implant (CI) simulation influences the interpretation of prosodically marked linguistic focus in a non-native language. In an online experiment, two groups of normal-hearing native Dutch learners of English, differing in age (12–14-year-old adolescents vs. adults aged 18+) and in English proficiency (A2 vs. B2/C1), were asked to listen to CI-simulated and non-CI-simulated English sentences differing in prosodically marked focus and to indicate which of four possible context questions the speaker was answering. Results show that, as expected, focus interpretation is significantly less accurate in the CI-simulated condition than in the non-CI-simulated condition, and that more proficient non-native listeners outperform less proficient ones. However, there was no interaction between the influence of the spectro-temporal degradation of the CI-simulated speech signal and that of the listeners’ English proficiency level, suggesting that less proficient non-native listeners are not more strongly affected by the spectro-temporal degradation of an electric speech signal than more proficient non-native listeners.
Title: Interpretation of prosodically marked focus in cochlear implant-simulated speech by non-native listeners (Speech Prosody 2022)
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-111
Yaqian Huang
Period-doubled phonation is a type of creaky voice that contains two alternating periods. By presenting data from Mandarin Chinese read speech recordings, this study probes the articulatory properties of period-doubled phonation and its tonal distribution based on time-domain measures using electroglottography (EGG). Period doubling (PD) was found across all the tones (T3: 43% > T2 > T4 > T1: 11%), which was more prevalent than vocal fry, found mainly in T3 (48%) and T2 (43%), and only sporadically in T4 (7%) and T1 (2%). We calculated the two alternating glottal periods in PD, and they exhibited a ratio close to 3:2 or 2:1. The two pulses also alternated between higher and lower amplitudes with a mean ratio approximating 2 or 1.6. Women tended to produce more PD than men. Moreover, the contact quotient of PD, measured via EGG using the hybrid method, was around 0.5, similar to modal voice (0.54) and smaller than that of vocal fry (0.74), implying a more balanced opening and contact phase during phonation. Alternation of contact quotient and symmetry quotient was also seen in a few samples, suggesting that PD is likely articulated through two alternating pulses with distinct voice qualities and pitches.
Title: Articulatory properties of period-doubled voice in Mandarin (Speech Prosody 2022)
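The reported period and amplitude ratios come from comparing the alternating long and short glottal cycles. The numbers below are made up to match the abstract's approximate 3:2 period ratio and amplitude ratio near 2; this is an arithmetic illustration, not the study's EGG data.

```python
# Alternating glottal periods (ms) and pulse amplitudes in a
# period-doubled stretch: long/short cycles interleave.
periods_ms = [9.0, 6.0, 9.2, 6.1, 8.8, 5.9]
amplitudes = [1.0, 0.5, 0.98, 0.52, 1.02, 0.49]

long_periods = periods_ms[0::2]    # every other cycle, starting long
short_periods = periods_ms[1::2]
period_ratio = sum(long_periods) / sum(short_periods)   # ~1.5, i.e. 3:2

amp_ratio = sum(amplitudes[0::2]) / sum(amplitudes[1::2])  # ~2

print(round(period_ratio, 2), round(amp_ratio, 2))
```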
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-96
L. Plug, R. Lennon, Rachel Smith
We report on an experiment designed to test the hypothesis that listeners orient to canonical forms when judging the tempo of reduced speech. Orientation to canonical forms should yield higher tempo estimates than orientation to surface phone strings when canonical phones are deleted. We tested the hypothesis for English, capitalizing on the fact that the non-realization of schwa in an unstressed syllable (e.g. ‘support’) may result in a surface phone string associated with a different word than the intended one (‘sport’). We presented listeners with sentences containing ambiguous surface realizations, along with orthographic representations which convinced some that they were listening to disyllabic words (‘support’ etc.) and others that they were listening to monosyllabic ones (‘sport’ etc.). Asking listeners to judge the tempo of the sentences allowed us to assess whether the difference in imposed lexical interpretation had an impact on perceived tempo. Our results reveal the predicted effect of the imposed interpretation: sentences with a ‘disyllabic’ interpretation of the ambiguous word form were judged faster than (the same) sentences with a ‘monosyllabic’ interpretation.
Title: Schwa deletion and perceived tempo in English (Speech Prosody 2022)
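The hypothesis reduces to simple arithmetic: the same acoustic stretch yields a higher syllable-rate estimate under the canonical (disyllabic, ‘support’) reading than under the surface (monosyllabic, ‘sport’) reading. The durations and counts below are invented for illustration.

```python
# One sentence token, heard under two imposed interpretations
sentence_dur_s = 1.8
syllables_surface = 5    # 'sport' counted as one syllable
syllables_canonical = 6  # deleted schwa restored: 'support'

rate_surface = syllables_surface / sentence_dur_s
rate_canonical = syllables_canonical / sentence_dur_s

# Orientation to the canonical form predicts the higher tempo estimate
assert rate_canonical > rate_surface
print(round(rate_surface, 2), round(rate_canonical, 2))  # 2.78 3.33
```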
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-27
Laurence White, H. Grimes
Prosodic features anecdotally associated with the speech of people with clinical depression include slower rate, lower pitch range and reduced loudness, but there is a significant degree of contradiction in the literature regarding depressed prosody. This complex picture reflects the heterogeneity of depression aetiology, symptomatology and prognosis. It is also likely to be influenced by elicitation methods, in particular, whether natural dialogue contexts are employed and whether the interlocutor’s prosody is also considered. We analysed 40 patient-therapist dialogues from the first and last of 29 weekly sessions of a behavioural therapy for refractory depression, sampling early and late in both sessions. Across all dialogues, we found that therapists spoke faster than patients, as expected, but in female-female therapist-patient dialogues (the majority of our sample), patients’ articulation rate increased substantially over the first session. Moreover, and contrary to expectations, there was a positive correlation between articulation rate and assessed depression severity (PHQ-9 scale) in the final therapy session, also evident in therapists’ speech for female-female dialogues. We suggest that this may reflect features of anxiety in speakers with ongoing depression and possibly also personality characteristics. We also consider evidence for prosodic convergence between patients and therapists.
Title: Articulation rate in psychotherapeutic dialogues for depression: patients and therapists (Speech Prosody 2022)
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-16
Kexin Du, S. Avrutin, Aoju Chen
Past research on the role of prosody in reference has been primarily concerned with how adults and children use prosodic cues to signal a change in accessibility from givenness to newness of the same noun phrase. This study explores the role of prosody in the referential dependencies between an antecedent noun phrase and the reflexive anaphor ‘zi-ji’ (‘oneself’) in Mandarin-speaking adults and children. In sentences like “Boris dreamed that Miffy painted ‘zi-ji’”, ‘zi-ji’ can establish two types of anaphor-antecedent dependencies: (1) a local dependency, where ‘zi-ji’ refers to Miffy, and (2) a non-local dependency, where ‘zi-ji’ refers to Boris. Such sentences were elicited in both interpretations from Mandarin-speaking adults and 6- to 10-year-olds in a picture-matching game. Duration analysis of ‘zi-ji’ shows that adults produced ‘zi-ji’ with a longer duration in the non-local dependency condition than in the local dependency condition. This result can be explained by the economy hierarchy model, whereby the local antecedent, made more accessible by the locality constraint, is preferred, thus necessitating more prosodic prominence to mark the less accessible non-local antecedent. This pattern was not found in children’s production, suggesting prolonged acquisition of the use of prosody to build anaphor-antecedent dependencies for ‘zi-ji’.
Title: Building bridges: The role of prosody in Mandarin-speaking adults' and children's anaphora resolution (Speech Prosody 2022)
Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-114
Bogdan Ludusan, P. Wagner
Laughter is one of the most widely encountered paralinguistic phenomena in human interaction, and it has been studied from different perspectives over the years, including its acoustic-prosodic realization. However, previous studies have mostly focused on fundamental frequency and duration measures, with other prosodic features less studied. We examine here the acoustic marking of laughter in terms of intensity and voice quality characteristics, using a corpus of spontaneous conversations. We operationalized the two cues by means of root-mean-square energy and cepstral peak prominence, respectively. Examining laughs, speech-laughs, and speech instances at two different levels (the entire event and the syllable nucleus), we observed the least regular phonation for laughs and the most regular for speech, while intensity was highest for speech-laughs, followed by laughter, and lowest for speech. Using mixed-effects models, we determined that all three vocalization classes differ significantly from one another in terms of both acoustic cues. Moreover, an interesting effect of syllable position was seen for laughter, with phonation becoming more regular for later syllables.
Title: ha-HA-hha? Intensity and voice quality characteristics of laughter (Speech Prosody 2022)
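The two cues named in this abstract can be sketched as follows. Root-mean-square energy is standard; the cepstral peak prominence (CPP) function below is a simplified approximation (established implementations, e.g. Praat's CPPS, differ in windowing, smoothing, and trend-fitting details), and the test signals are synthetic stand-ins, not corpus data.

```python
import numpy as np

def rms_energy(x):
    """Root-mean-square energy of a signal frame."""
    return np.sqrt(np.mean(np.square(x)))

def cpp(x, sr, f0_range=(60.0, 330.0)):
    """Simplified cepstral peak prominence: height of the cepstral peak
    in the expected pitch-period range above a linear trend line."""
    n = len(x)
    spec = np.abs(np.fft.rfft(x * np.hanning(n)))
    log_spec = np.log(spec + 1e-12)
    ceps = np.fft.irfft(log_spec, n=n)      # real cepstrum
    q_lo = int(sr / f0_range[1])            # shortest pitch period (samples)
    q_hi = int(sr / f0_range[0])            # longest pitch period (samples)
    qs = np.arange(q_lo, q_hi)
    window = ceps[q_lo:q_hi]
    slope, intercept = np.polyfit(qs, window, 1)  # linear trend
    peak = int(np.argmax(window))
    return window[peak] - (slope * qs[peak] + intercept)

sr = 16000
# Glottal-pulse-like train at ~120 Hz (regular phonation) vs white noise
voiced = np.zeros(sr)
voiced[::133] = 1.0                          # pitch period of 133 samples
noise = np.random.default_rng(1).normal(size=sr) * 0.1

# Regular phonation shows a prominent cepstral peak; noise does not,
# which is why higher CPP indicates more regular phonation.
print(rms_energy(voiced) > 0, cpp(voiced, sr) > cpp(noise, sr))
```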