Pub Date: 2025-01-01 | Epub Date: 2024-09-13 | DOI: 10.1007/s40593-024-00426-w
Julian F Lohmann, Fynn Junge, Jens Möller, Johanna Fleckenstein, Ruth Trüb, Stefan Keller, Thorben Jansen, Andrea Horbach
Recent investigations in automated essay scoring research suggest that hybrid models, which combine feature engineering with the powerful tools of deep neural networks (DNNs), reach state-of-the-art performance. However, most of these findings come from holistic scoring tasks. In the present study, we use a total of four prompts from two different corpora consisting of both L1 and L2 learner essays annotated with trait scores (e.g., content, organization, and language quality). In our main experiments, we compare three variants of trait-specific models using different inputs: (1) models based on 220 linguistic features, (2) models using essay-level contextual embeddings from the distilled version of the pre-trained transformer BERT (DistilBERT), and (3) a hybrid model using both types of features. Results suggest that when trait-specific models are trained on a single resource, the feature-based models slightly outperform the embedding-based models. These differences are most prominent for the organization traits. The hybrid models outperform the single-resource models, indicating that linguistic features and embeddings indeed capture partially different aspects relevant to the assessment of essay traits. To gain more insight into the interplay between both feature types, we run addition and ablation tests for individual feature groups. Trait-specific addition tests across prompts indicate that the embedding-based models can most consistently be enhanced in content assessment when combined with morphological complexity features. The most consistent performance gains in the organization traits are achieved when embeddings are combined with length features, and the most consistent gains in the assessment of the language traits when they are combined with lexical complexity, error, and occurrence features. Cross-prompt scoring again reveals slight advantages for the feature-based models.
Title: "Neural Networks or Linguistic Features? - Comparing Different Machine-Learning Approaches for Automated Assessment of Text Quality Traits Among L1- and L2-Learners' Argumentative Essays"
Journal: International Journal of Artificial Intelligence in Education, 35(3), pp. 1178-1217
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12450813/pdf/
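The hybrid setup described in the abstract above (hand-crafted linguistic features concatenated with transformer essay embeddings, feeding a single trait-specific regressor) can be illustrated with a minimal sketch. The data here are synthetic, and the ridge regressor, the feature dimensions, and all variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 essays scored on one trait (e.g., organization).
n_essays = 200
linguistic = rng.normal(size=(n_essays, 220))  # 220 hand-crafted linguistic features
embeddings = rng.normal(size=(n_essays, 768))  # essay-level DistilBERT vectors
scores = rng.normal(size=n_essays)             # human trait scores

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# (1) feature-based, (2) embedding-based, (3) hybrid via concatenation
hybrid = np.concatenate([linguistic, embeddings], axis=1)
for name, X in [("features", linguistic), ("embeddings", embeddings), ("hybrid", hybrid)]:
    w = ridge_fit(X, scores)
    preds = X @ w
    print(name, X.shape, preds.shape)
```

The "addition tests" reported in the study would correspond, in this sketch, to concatenating the embeddings with one feature group at a time (e.g., only the length columns) and comparing the resulting scores against the embedding-only baseline.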
Pub Date: 2025-01-01 | Epub Date: 2025-05-14 | DOI: 10.1007/s40593-025-00480-y
Max van der Velde, Wieke Harmsen, Bernard P Veldkamp, Remco Feskens, Jos Keuning, Nicole Swart
Although the ability to comprehend what one is reading is one of the most fundamental necessities for functioning in society, students' reading comprehension skills have recently been declining in many countries. An essential prerequisite for reading comprehension is the ability to read fluently, defined as the ability to read (aloud) with accuracy, speed, automaticity, and prosody. However, current oral reading fluency assessment instruments seldom provide detailed diagnostics and place a heavy testing burden on practitioners. Recent developments in Artificial Intelligence-based assessment methodology might address these issues, but thorough validations of such procedures remain scarce. This study evaluates whether valid word decoding and passage reading measures (accuracy, speed, and automaticity) can be generated for a semi-transparent language using an automatic speech recognition (ASR) based oral reading fluency assessment instrument. A validation study was conducted using the Argument-Based Approach to Validation. The data comprised 176 hours of speech and the results of 569 oral word-reading and 622 passage-reading tests currently administered in primary schools, collected from 653 children attending the second or third grade of Dutch primary education. The results of the validation indicate that it is possible to generate fluency metrics for a semi-transparent language using an ASR-based oral reading fluency assessment instrument. Future researchers are advised to further optimize the ASR, evaluate its errors, and implement a prosody component, completing the envisioned reading fluency assessment instrument and thereby improving reading fluency assessment throughout primary education.
Supplementary information: The online version contains supplementary material available at 10.1007/s40593-025-00480-y.
Title: "Speech Enabled Reading Fluency Assessment: a Validation Study"
Journal: International Journal of Artificial Intelligence in Education, 35(4), pp. 2569-2595
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12686063/pdf/
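The fluency measures named in the abstract above (accuracy, speed, automaticity) can be sketched from time-aligned ASR output. Everything below is a hypothetical illustration, not the validated instrument: the alignment record format, the toy Dutch words, and the use of mean per-word reading time as an automaticity proxy are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class WordResult:
    target: str      # word the child was asked to read
    hypothesis: str  # word recognized by the ASR
    start: float     # onset time in seconds
    end: float       # offset time in seconds

def fluency_metrics(results):
    """Accuracy, speed as words correct per minute (WCPM), and a simple
    automaticity proxy: mean reading time of correctly read words."""
    correct = [r for r in results if r.target == r.hypothesis]
    duration_min = (results[-1].end - results[0].start) / 60.0
    return {
        "accuracy": len(correct) / len(results),
        "wcpm": len(correct) / duration_min,
        "mean_word_time": sum(r.end - r.start for r in correct) / len(correct),
    }

# Toy aligned output: one substitution error ("vis" recognized as "mis").
demo = [
    WordResult("kat", "kat", 0.0, 0.4),
    WordResult("boom", "boom", 0.5, 1.0),
    WordResult("vis", "mis", 1.1, 1.6),
    WordResult("huis", "huis", 1.7, 2.0),
]
print(fluency_metrics(demo))
```

A real pipeline would additionally need the ASR's own word error rate to be evaluated, since recognition errors are otherwise indistinguishable from decoding errors made by the child, which is one reason the authors recommend evaluating the ASR's errors.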
Pub Date: 2024-01-10 | DOI: 10.1007/s40593-023-00388-5
Nesra Yannier, Scott E. Hudson, Henry Chang, K. Koedinger
Title: "AI Adaptivity in a Mixed-Reality System Improves Learning"
Journal: International Journal of Artificial Intelligence in Education
Pub Date: 2024-01-02 | DOI: 10.1007/s40593-023-00378-7
Duong Ngo, Andy Nguyen, Belle Dang, Ha Ngo
Title: "Facial Expression Recognition for Examining Emotional Regulation in Synchronous Online Collaborative Learning"
Journal: International Journal of Artificial Intelligence in Education, pp. 1-20
Pub Date: 2023-12-20 | DOI: 10.1007/s40593-023-00386-7
Robert-Mihai Botarleanu, Micah Watanabe, Mihai Dascalu, S. Crossley, Danielle S. McNamara
Title: "Multilingual Age of Exposure 2.0"
Journal: International Journal of Artificial Intelligence in Education
Pub Date: 2023-12-18 | DOI: 10.1007/s40593-023-00385-8
Kevin C. Haudek, X. Zhai
Title: "Examining the Effect of Assessment Construct Characteristics on Machine Learning Scoring of Scientific Argumentation"
Journal: International Journal of Artificial Intelligence in Education
Pub Date: 2023-12-07 | DOI: 10.1007/s40593-023-00381-y
Yugo Hayashi
Title: "Modeling Synchronization for Detecting Collaborative Learning Process Using a Pedagogical Conversational Agent: Investigation Using Recurrent Indicators of Gaze, Language, and Facial Expression"
Journal: International Journal of Artificial Intelligence in Education
Pub Date: 2023-12-07 | DOI: 10.1007/s40593-023-00383-w
Ulrike Padó, Yunus Eryilmaz, Larissa Kirschner
Title: "Short-Answer Grading for German: Addressing the Challenges"
Journal: International Journal of Artificial Intelligence in Education
Pub Date: 2023-12-01 | DOI: 10.1007/s40593-023-00379-6
Anisha Gupta, Dan Carpenter, Wookhee Min, Jonathan Rowe, Roger Azevedo, James C. Lester
Title: "Detecting and Mitigating Encoded Bias in Deep Learning-Based Stealth Assessment Models for Reflection-Enriched Game-Based Learning Environments"
Journal: International Journal of Artificial Intelligence in Education