Vocabulary in digital science resources for middle school learners

Rebeca Arndt
Applied Corpus Linguistics. Published 2022-12-01. DOI: 10.1016/j.acorp.2022.100023
https://www.sciencedirect.com/science/article/pii/S2666799122000089
Citations: 0

Abstract


This corpus-based study examined the vocabulary in a 2.7-million-token corpus of digital science resources for middle school (grades 6–8) students in the United States. The findings show that to reach the 95% and 98% lexical coverage thresholds of the Digital Science Corpus (DSC), which are conventionally deemed to facilitate minimal and optimal reading comprehension, respectively (Laufer, 2020), middle school (MS) students must recognize the first 6,000 and 14,000 most frequent word families in the BNC/COCA (Nation, 2012), respectively, plus proper nouns and marginal words. The lexical analysis of the three sub-corpora in the DSC suggests that the Life Science sub-corpus has a considerably larger vocabulary load than the Physical Science and Earth and Space Science sub-corpora. Additionally, while 98.60% of the most frequent 1,000 BNC/COCA word families occurred at least six times in the DSC, the 2,000–7,000 BNC/COCA word families provided significantly fewer opportunities for repeated encounters. Because more than half of the words in the 5,000–7,000 BNC/COCA bands occurred five times or fewer in the overall corpus, most words in these bands do not occur frequently enough in the digital science resources for MS students to learn them incidentally through reading. Several pedagogically relevant suggestions for middle school science teachers are discussed.
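The coverage and repetition figures reported in the abstract rest on two simple calculations, which can be illustrated with a minimal Python sketch. This is not the study's actual pipeline: the function names, the toy sentence, and the band assignments in `band_of` are all hypothetical, the mapping goes from word forms straight to bands rather than through BNC/COCA word families, proper nouns and marginal words are not counted toward coverage here, and only words that actually occur in the text enter the repetition share.

```python
from collections import Counter


def coverage_by_band(tokens, band_of):
    """Cumulative lexical coverage (%) after each frequency band, in band order."""
    # Band 0 is used here for any token that is not on the word list.
    counts = Counter(band_of.get(t.lower(), 0) for t in tokens)
    total = len(tokens)
    cumulative, running = {}, 0
    for band in sorted(b for b in counts if b > 0):
        running += counts[band]
        cumulative[band] = 100.0 * running / total
    return cumulative


def bands_needed(cumulative, threshold):
    """Lowest band at which cumulative coverage reaches the threshold, else None."""
    for band, pct in sorted(cumulative.items()):
        if pct >= threshold:
            return band
    return None


def low_repetition_share(tokens, band_of, band, max_occurrences=5):
    """Share (%) of a band's occurring words seen at most max_occurrences times."""
    word_counts = Counter(
        t.lower() for t in tokens if band_of.get(t.lower()) == band
    )
    if not word_counts:
        return 0.0
    rare = sum(1 for c in word_counts.values() if c <= max_occurrences)
    return 100.0 * rare / len(word_counts)


# Toy data: hypothetical band assignments, not taken from the BNC/COCA lists.
tokens = "the cell is the unit of life the cell divides".split()
band_of = {
    "the": 1, "is": 1, "of": 1, "life": 1,
    "cell": 2, "unit": 2,
    "divides": 3,
}

coverage = coverage_by_band(tokens, band_of)
print(coverage)                      # cumulative coverage per band
print(bands_needed(coverage, 95.0))  # band needed for 95% coverage
print(low_repetition_share(tokens, band_of, band=2, max_occurrences=1))
```

In this toy text, bands 1 and 2 together cover 90% of the tokens, so the 95% threshold is reached only once band 3 is included. The same logic, applied to 1,000-family bands over a full corpus, yields statements of the form "the first 6,000 word families are needed for 95% coverage".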
