Automated Marking System for Essay Questions

O. Obot, Peter G. Obike, Imaobong James
{"title":"Automated Marking System for Essay Questions","authors":"O. Obot, Peter G. Obike, Imaobong James","doi":"10.9734/jerr/2024/v26i51139","DOIUrl":null,"url":null,"abstract":"The stress of marking assessment scripts of many candidates often results in fatigue that could lead to low productivity and reduced consistency. In most cases, candidates use words, phrases and sentences that are synonyms or related in meaning to those stated in the marking scheme, however, examiners rely solely on the exact words specified in the marking scheme. This often leads to inconsistent grading and in most cases, candidates are disadvantaged. This study seeks to address these inconsistencies during assessment by evaluating the marked answer scripts and the marking scheme of Introduction to File Processing (CSC 221) from the Department of Computer Science, University of Uyo, Nigeria. These were collected and used with the Microsoft Research Paraphrase (MSRP) corpus. After preprocessing the datasets, they were subjected to Logistic Regression (LR), a machine learning technique where the semantic similarity of the answers of the candidates was measured in relation to the marking scheme of the examiner using the MSRP corpus model earlier trained on the Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. Results of the experiment show a strong correlation coefficient of 0.89 and a Mean Relative Error (MRE) of 0.59 compared with the scores awarded by the human marker (examiner). Analysis of the error indicates that block marks were assigned to answers in the marking scheme while the automated marking system breaks the block marks into chunks based on phrases both in the marking scheme and the candidates’ answers. 
It also shows that some semantically related words were ignored by the examiner.","PeriodicalId":508164,"journal":{"name":"Journal of Engineering Research and Reports","volume":"74 7","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Engineering Research and Reports","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.9734/jerr/2024/v26i51139","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The stress of marking the assessment scripts of many candidates often results in fatigue, which can lead to low productivity and reduced consistency. In most cases, candidates use words, phrases and sentences that are synonyms of, or related in meaning to, those stated in the marking scheme; examiners, however, rely solely on the exact words specified in the marking scheme. This often leads to inconsistent grading, and in most cases candidates are disadvantaged. This study seeks to address these inconsistencies by evaluating the marked answer scripts and the marking scheme of Introduction to File Processing (CSC 221) from the Department of Computer Science, University of Uyo, Nigeria. These were collected and used together with the Microsoft Research Paraphrase (MSRP) corpus. After preprocessing, the datasets were subjected to Logistic Regression (LR), a machine learning technique in which the semantic similarity of the candidates' answers was measured against the examiner's marking scheme, using an MSRP corpus model previously trained on Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. Results of the experiment show a strong correlation coefficient of 0.89 and a Mean Relative Error (MRE) of 0.59 compared with the scores awarded by the human marker (examiner). Analysis of the error indicates that block marks were assigned to answers in the marking scheme, while the automated marking system breaks the block marks into chunks based on phrases in both the marking scheme and the candidates' answers. It also shows that some semantically related words were ignored by the examiner.
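The pipeline the abstract describes (TF-IDF vectorization of sentence pairs, then Logistic Regression to judge semantic similarity between a candidate's answer and the marking scheme) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy sentence pairs stand in for the MSRP corpus, and the single cosine-similarity feature is an assumed simplification of whatever feature set the paper actually used.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for MSRP-style training data: (sentence_1, sentence_2, is_paraphrase)
pairs = [
    ("a file is a collection of records", "records grouped together form a file", 1),
    ("sequential files are read in order", "records are accessed one after another", 1),
    ("a key uniquely identifies a record", "indexes speed up searching", 0),
    ("hashing maps a key to an address", "the weather is sunny today", 0),
]

# Fit the TF-IDF vocabulary on every sentence in the corpus.
vectorizer = TfidfVectorizer().fit([s for p in pairs for s in p[:2]])

def pair_features(s1: str, s2: str) -> list[float]:
    """Represent a sentence pair by the cosine similarity of its TF-IDF vectors."""
    v1 = vectorizer.transform([s1]).toarray()[0]
    v2 = vectorizer.transform([s2]).toarray()[0]
    cos = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9))
    return [cos]

# Train Logistic Regression to map similarity features to a paraphrase judgement.
X = [pair_features(s1, s2) for s1, s2, _ in pairs]
y = [label for _, _, label in pairs]
clf = LogisticRegression().fit(X, y)

def similarity_score(candidate_answer: str, scheme_answer: str) -> float:
    """Probability that the candidate's answer paraphrases the marking-scheme answer."""
    return clf.predict_proba([pair_features(candidate_answer, scheme_answer)])[0][1]
```

In a marking context, `similarity_score` would then be used to award a phrase's chunk of marks whenever the score exceeds a chosen threshold, rather than requiring an exact word match against the scheme.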