Comparing Performance of Feature Extraction Methods and Machine Learning Models in Automatic Essay Scoring

Chinese/English journal of educational measurement and evaluation. Pub Date: 2023-09-01. DOI: 10.59863/dqiz8440

Lihua Yao, Hong Jiao

Citations: 0

Abstract

This study used the ASAP data set from Kaggle and applied natural language processing (NLP) and Bidirectional Encoder Representations from Transformers (BERT) for corpus processing and feature extraction. It then compared different machine learning models, both traditional machine-learning classifiers and neural-network-based approaches. Supervised learning models were used for the scoring system, where six of the eight essay prompts were trained separately and concatenated. Compared with previous studies, we found that adding more features, such as readability scores using Spacy Textsta, improved the prediction results for the essay scoring system. The neural network model, trained on all prompt data and using NLP for corpus processing and feature extraction, performed better than the other models, with an overall test quadratic weighted kappa (QWK) of 0.9724. It achieved the highest QWK score of 0.859 for prompt 1 and an average QWK of 0.771 across all 6 prompts, making it the best-performing machine learning model tested.
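
The abstract reports results as quadratic weighted kappa (QWK), an agreement statistic between human and predicted scores that penalises a disagreement by the square of its distance on the score scale. The following is a minimal sketch of the standard QWK formula in plain Python, not the authors' implementation. The function name, its parameters, and the choice to return 1.0 when both raters assign one identical score to every essay are assumptions made for illustration.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Return QWK for two equal-length lists of integer scores.

    Hypothetical helper for illustration; scores are assumed to lie
    in the inclusive range [min_score, max_score].
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n_levels = max_score - min_score + 1
    if n_levels < 2:
        raise ValueError("the score scale needs at least two levels")

    n_items = len(human)
    observed = Counter(zip(human, predicted))  # observed (human, predicted) pairs
    hist_human = Counter(human)                # marginal counts per score
    hist_pred = Counter(predicted)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty: 0 on the diagonal, 1 at maximum distance.
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            # Expected count if the two raters were independent.
            expected = hist_human[i] * hist_pred[j] / n_items
            weighted_observed += weight * observed[(i, j)]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Assumption: both raters gave every essay the same single score.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A call such as `quadratic_weighted_kappa(human_scores, model_scores, 1, 6)` would score one prompt on a hypothetical 1 to 6 scale. Because score ranges differ by prompt, a per-prompt QWK and a QWK computed over pooled data are not directly comparable.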
