A corpus-based developmental investigation of linguistic complexity in children's writing

Yaling Hsiao, Nicola J. Dawson, Nilanjana Banerji, Kate Nation
Journal: Applied Corpus Linguistics
Publication type: Journal Article
Publication date: 2024-01-07
DOI: 10.1016/j.acorp.2024.100084
URL: https://www.sciencedirect.com/science/article/pii/S2666799124000017
PDF: https://www.sciencedirect.com/science/article/pii/S2666799124000017/pdfft?md5=26f900f0c1ffa0cd9e4f6495f4ba3386&pid=1-s2.0-S2666799124000017-main.pdf

Abstract

Writing proficiency is associated with linguistic complexity. We used measures of linguistic complexity to investigate the development of children's narrative writing using a large corpus of short stories (N > 100,000) written by children aged 5–13 in the UK. Linguistic complexity was assessed using both lexical (N = 30) and syntactic (N = 14) measures. Most measures were associated with age, with writing by older children showing greater lexical density, sophistication, and diversity than writing by younger children. Older children also used longer sentences, and longer T-units and clauses, and the density of smaller syntactic units inside larger units was also higher. Principal Component Analysis identified a number of dimensions associated with complexity, with the first two dimensions capturing nearly 50% of variance. Lexical diversity was mainly represented on the first dimension and syntactic complexity on the second. Across the age range, there was wider variation in syntactic complexity than in lexical diversity, suggesting that syntactic development is subject to more individual differences than the ability to use a diverse set of lexical items. Our findings quantify the nature and content of children's writing through mid-childhood, and we discuss the utility of analysing children's writing using a computational, data-driven approach.
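
To make the kinds of measures named in the abstract more concrete, the following is a minimal sketch of one lexical diversity measure (type-token ratio) and one syntactic length measure (mean sentence length). It is not the authors' pipeline, which used 30 lexical and 14 syntactic measures that the abstract does not list. The regex tokenizer, the function names, and the two sample texts are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a crude regex stand-in for a real tokenizer (assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Lexical diversity as unique tokens / total tokens (known to be sensitive to text length)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_sentence_length(text):
    """Average number of word tokens per sentence, splitting naively on . ! ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

# Hypothetical sample texts, invented for illustration only (not corpus data).
younger = "The dog ran. The dog ran fast. The dog ran home."
older = "Although the storm raged outside, the weary dog crept quietly home."

for label, story in [("younger", younger), ("older", older)]:
    print(label,
          round(type_token_ratio(story), 2),
          round(mean_sentence_length(story), 2))
```

In a study of this kind, each story would yield one row of such measures, and a dimensionality-reduction step such as Principal Component Analysis would then be run over the resulting table. That step is omitted here to keep the sketch dependency-free.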
