Mapping of American English vocabulary by grade levels
Michael Flor, Steven Holtzman, Paul Deane, Isaac Bejar
ITL - International Journal of Applied Linguistics, published 2024-02-26
https://doi.org/10.1075/itl.22025.flo
Abstract
We describe a large-scale effort to map English-language vocabulary by U.S. school grade levels. Our motivation is
to rapidly expand graded vocabulary resources for work with native English speakers in the U.S., while taking
school-related influences into consideration rather than relying solely on corpus-frequency approaches. We report on the
initial data collection effort, which mapped about 22K word forms. We compare this mapping to some other recent
vocabulary mapping efforts, such as age-of-acquisition. We then describe efforts to automatically expand this resource using
linguistically motivated variables and corpus-based methods. Our current resource maps more than 126K English word forms to
U.S. school grade levels. We also compare a subset of our L1 mapped data to English L2 vocabulary levels, as expressed on the
CEFR scale, and find considerable overlap in the order of vocabulary learning in L1 and L2 English.