A NLP-based stylometric approach for tracking the evolution of L1 written language competence
Authors: Alessio Miaschi, D. Brunato, F. Dell’Orletta
Journal: Journal of Writing Research (impact factor 1.7; JCR Q2, Education & Educational Research)
Publication date: 2021-05-01
Publication type: Journal Article
DOI: 10.17239/JOWR-2021.13.01.03 (https://doi.org/10.17239/JOWR-2021.13.01.03)
Citations: 10
Abstract
In this study we present a Natural Language Processing (NLP)-based stylometric approach for tracking the evolution of written language competence in Italian L1 learners. The approach relies on a wide set of linguistically motivated features capturing stylistic aspects of a text. These features were extracted from students’ essays contained in CItA (Corpus Italiano di Apprendenti L1), the first longitudinal corpus of texts written by Italian L1 learners enrolled in the first and second year of lower secondary school. We frame the modeling of written language development as a supervised classification task: predicting the chronological order of essays written by the same student at different temporal spans. The promising results obtained in several classification scenarios allow us to conclude that it is possible to automatically model the highly relevant changes affecting written language evolution over time, and to identify which features are more predictive of this process. In the last part of the article, we focus on the possible influence of background variables on language learning and present preliminary results of a pilot study aimed at understanding how the observed developmental patterns are affected by information related to the student's school environment.
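The task formulation in the abstract, deciding which of two essays by the same student was written first, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the three surface features, the difference-based pair representation, and the perceptron are hypothetical stand-ins, not the feature set or classifier used in the study.

```python
import re


def features(text):
    """Toy stylometric profile (illustrative only):
    [average sentence length in words, average word length, type-token ratio]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    return [
        len(words) / len(sentences),
        sum(len(w) for w in words) / len(words),
        len(set(words)) / len(words),
    ]


def pair_vector(first, second):
    """Represent an ordered essay pair as the feature difference second - first."""
    fa, fb = features(first), features(second)
    return [b - a for a, b in zip(fa, fb)]


def make_pairs(essays_in_order):
    """Build training pairs from one student's essays, given oldest first.
    Label 1 means the pair is in chronological order, 0 means it is reversed."""
    samples, labels = [], []
    for i in range(len(essays_in_order)):
        for j in range(i + 1, len(essays_in_order)):
            earlier, later = essays_in_order[i], essays_in_order[j]
            samples.append(pair_vector(earlier, later))
            labels.append(1)
            samples.append(pair_vector(later, earlier))
            labels.append(0)
    return samples, labels


def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Minimal perceptron, a stand-in for any supervised binary classifier."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            weights = [w + lr * error * v for w, v in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(model, x):
    """Return 1 if the pair is predicted to be in chronological order, else 0."""
    weights, bias = model
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0
```

Each student's essays produce both the correctly ordered and the reversed pair, so the classes are balanced by construction. Inspecting the learned weights gives a rough indication of which features separate earlier from later essays, in the spirit of the feature analysis the abstract mentions.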
About the journal:
The Journal of Writing Research is an international peer-reviewed journal that publishes high-quality theoretical, empirical, and review papers covering the broad spectrum of writing research. The journal primarily publishes papers that describe scientific studies of the processes by which writing is produced or the means by which writing can be effectively taught. It is inherently cross-disciplinary, publishing original research in the different domains of writing research. The Journal of Writing Research is an open access journal (no reader fee, no author fee).