AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams

Tianyi Liu, Julia Chatain, Laura Kobel-Keller, Gerd Kortemeyer, Thomas Willwacher, Mrinmaya Sachan

arXiv:2408.11728 · arXiv - MATH - History and Overview · Published 2024-08-21
Effective and timely feedback in educational assessments is essential but
labor-intensive, especially for complex tasks. Recent developments in automated
feedback systems, ranging from deterministic response grading to the evaluation
of semi-open and open-ended essays, have been facilitated by advances in
machine learning. The emergence of pre-trained Large Language Models, such as
GPT-4, offers promising new opportunities for efficiently processing diverse
response types with minimal customization. This study evaluates the
effectiveness of a pre-trained GPT-4 model in grading semi-open handwritten
responses in a university-level mathematics exam. Our findings indicate that
GPT-4 provides surprisingly reliable and cost-effective initial grading,
subject to subsequent human verification. Future research should focus on
refining grading rules and enhancing the extraction of handwritten responses to
further leverage these technologies.
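The workflow the abstract describes — an LLM produces an initial grade against a rubric, with a subsequent human-verification pass — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: all names here (`build_grading_prompt`, `needs_human_review`, the confidence threshold) are hypothetical, and the actual model call is omitted.

```python
# Hypothetical sketch of an LLM-assisted grading pipeline with a
# human-verification gate. Illustrative only; names are not from the paper.
from dataclasses import dataclass


@dataclass
class RubricItem:
    description: str
    points: float


def build_grading_prompt(question: str, rubric: list[RubricItem],
                         transcript: str) -> str:
    """Assemble the prompt an LLM grader would receive: the exam question,
    the point-by-point rubric, and the transcribed handwritten response."""
    rubric_text = "\n".join(f"- {r.description} ({r.points} pt)" for r in rubric)
    return (
        f"Question:\n{question}\n\n"
        f"Rubric:\n{rubric_text}\n\n"
        f"Student response (transcribed from handwriting):\n{transcript}\n\n"
        "Award points per rubric item and report a total."
    )


def needs_human_review(llm_score: float, max_points: float,
                       confidence: float, min_confidence: float = 0.8) -> bool:
    """Route a graded response to the human-verification step when the
    model reports low confidence or the score falls outside the valid range."""
    return confidence < min_confidence or not (0.0 <= llm_score <= max_points)
```

In such a setup, every response below the confidence threshold (or with an out-of-range score) is queued for a human grader, so the LLM serves only as a cost-saving first pass rather than the final authority — consistent with the "initial grading, subject to subsequent human verification" framing above.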