Crossroads Corpus creation: Design and case study

Abbie Hantgan-Sonko
Yearbook of the Poznan Linguistic Meeting, published 2017-12-20. DOI: 10.1515/yplm-2017-0009 (https://doi.org/10.1515/yplm-2017-0009)
Citations: 1

Abstract

This paper illustrates a methodological approach to the design of an annotated corpus using a case study of phonetic convergences and divergences by multilingual speakers in southwestern Senegal’s Casamance region. The newly compiled corpus contains approximately 183,000 annotations of multilingual, spoken data, gathered by eight researchers over a ten-year span using methods ranging from structured lexical elicitation in controlled contexts to naturally occurring, multilingual conversations. The area from which the data were collected consists of three villages, each with its own primary language, although many more languages contribute to the linguistic landscape. Detailed metadata inform analyses of variation, the context in which a speech act took place and between whom, the speakers’ linguistic repertoires, trajectories, and social networks, as well as the larger language context. A potential path for convergence or divergence that emerged during data collection and in building and searching the corpus is the crossroads in the phonetic production of word-initial velar plosives. Word-initial [k] emerges in one language where only [ɡ] is present in another; the third uses both. The corpus design makes it feasible not only to identify areas of accommodation but also to grasp the context, enabling a sociolinguistically informed analysis of the speakers’ linguistic behavior.