Crowdsourcing ratings for single lexical items

Elena Volodina, David Alfter, Therese Lindström Tiedemann
Journal: Slovenščina 2.0: empirical, applied and interdisciplinary research
Published: 2022-12-29
DOI: 10.4312/slo2.0.2022.2.5-61 (https://doi.org/10.4312/slo2.0.2022.2.5-61)

Abstract

In this study, we investigate theoretical and practical issues connected to differentiating between core and peripheral vocabulary at different levels of linguistic proficiency using statistical approaches combined with crowdsourcing. We also investigate whether crowdsourcing second language learners’ rankings can be used for assigning levels to unseen vocabulary. The study is performed on Swedish single-word items. The four hypotheses we examine are: (1) there is core vocabulary for each proficiency level, but this is only true until CEFR level B2 (upper-intermediate); (2) core vocabulary shows more systematicity in its behavior and usage, whereas peripheral items have more idiosyncratic behavior; (3) given that we have truly core items (aka anchor items) for each level, we can place any new unseen item in relation to the identified core items by using a series of comparative judgment tasks, this way assigning a “target” level for a previously unseen item; and (4) non-experts will perform on par with experts in a comparative judgment setting. The hypotheses have been largely confirmed: In relation to (1) and (2), our results show that there seems to be some systematicity in core vocabulary for early to mid-levels (A1-B1) while we find less systematicity for higher levels (B2-C1). In relation to (3), we suggest crowdsourcing word rankings using comparative judgment with known anchor words as a method to assign a “target” level to unseen words. With regard to (4), we confirm the previous findings that non-experts, in our case language learners, can be effectively used for the linguistic annotation tasks in a comparative judgment setting.
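The idea in hypothesis (3), placing an unseen word relative to anchor words through pairwise judgments, can be illustrated with a minimal sketch. It assumes each judgment is recorded as a (harder, easier) pair and uses a simple win rate as the ranking score; the abstract does not say which ranking model the study uses, so this is a stand-in, not the authors' method. The function names and the placeholder items (`anchor_A1`, `new_word`, and so on) are hypothetical and are not data from the study.

```python
from collections import defaultdict


def win_rates(judgments):
    """Fraction of comparisons in which each item was judged the harder one."""
    harder_count = defaultdict(int)
    seen_count = defaultdict(int)
    for harder, easier in judgments:
        harder_count[harder] += 1
        seen_count[harder] += 1
        seen_count[easier] += 1
    return {item: harder_count[item] / seen_count[item] for item in seen_count}


def assign_level(new_item, anchors, judgments):
    """Give new_item the level of the anchor whose win rate is closest to its own."""
    rates = win_rates(judgments)
    nearest = min(anchors, key=lambda a: abs(rates[a] - rates[new_item]))
    return anchors[nearest]


# Hypothetical toy data: placeholder anchors with assumed levels.
anchors = {"anchor_A1": "A1", "anchor_B1": "B1", "anchor_C1": "C1"}

# Each tuple is one crowdsourced judgment: (item judged harder, item judged easier).
judgments = [
    ("anchor_B1", "anchor_A1"),
    ("anchor_C1", "anchor_A1"),
    ("anchor_C1", "anchor_B1"),
    ("new_word", "anchor_A1"),
    ("anchor_C1", "new_word"),
    ("new_word", "anchor_B1"),
    ("anchor_B1", "new_word"),
]

print(assign_level("new_word", anchors, judgments))
```

In this toy set the unseen item is judged harder than the A1 anchor, easier than the C1 anchor, and splits its comparisons with the B1 anchor, so it lands at B1. A win rate ignores who an item was compared against, so a real setup would likely need a balanced comparison design or a paired-comparison model.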