NLP-Based Management of Large Multiple-Choice Test Item Repositories

Journal of Learning Analytics | IF 2.9 | JCR Q1 (Education & Educational Research) | Pub Date: 2023-12-12 | DOI: 10.18608/jla.2023.7897
Valentina Albano, D. Firmani, Luigi Laura, Jerin George Mathew, Anna Lucia Paoletti, Irene Torrente
Citations: 0

Abstract

NLP-Based Management of Large Multiple-Choice Test Item Repositories
Multiple-choice questions (MCQs) are widely used in educational assessments and professional certification exams. Managing large repositories of MCQs, however, poses several challenges due to the high volume of questions and the need to maintain their quality and relevance over time. One of these challenges is the presence of questions that duplicate concepts but are formulated differently. Such questions can indeed elude syntactic controls but provide no added value to the repository.

In this paper, we focus on this specific challenge and propose a workflow for the discovery and management of potential duplicate questions in large MCQ repositories. Overall, the workflow comprises three main steps: MCQ preprocessing, similarity computation, and finally a graph-based exploration and analysis of the obtained similarity values. For the preprocessing phase, we consider three main strategies: (i) removing the list of candidate answers from each question, (ii) augmenting each question with the correct answer, or (iii) augmenting each question with all candidate answers. Then, we use deep learning–based natural language processing (NLP) techniques, based on the Transformers architecture, to compute similarities between MCQs based on semantics. Finally, we propose a new approach to graph exploration based on graph communities to analyze the similarities and relationships between MCQs in the graph. We illustrate the approach with a case study of the Competenze Digital program, a large-scale assessment project by the Italian government.
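The three-step workflow described in the abstract (preprocessing, similarity computation, graph-based grouping) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the sample questions, and the 0.8 threshold are hypothetical. To keep the sketch dependency-free, a bag-of-words cosine similarity stands in for the Transformer-based sentence embeddings, and connected components stand in for the graph community analysis the paper proposes.

```python
import math
import re
from collections import Counter
from itertools import combinations


def preprocess(mcq, strategy):
    """Build the text to compare, using one of the three strategies from the abstract."""
    if strategy == "question_only":   # (i) drop the candidate answers
        return mcq["question"]
    if strategy == "with_correct":    # (ii) append the correct answer
        return mcq["question"] + " " + mcq["answers"][mcq["correct"]]
    if strategy == "with_all":        # (iii) append every candidate answer
        return mcq["question"] + " " + " ".join(mcq["answers"])
    raise ValueError(f"unknown strategy: {strategy}")


def embed(text):
    """Stand-in embedding: token counts (an assumption, NOT a Transformer model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_graph(mcqs, strategy, threshold):
    """Return {(i, j): similarity} for every pair at or above the threshold."""
    vectors = [embed(preprocess(m, strategy)) for m in mcqs]
    edges = {}
    for i, j in combinations(range(len(mcqs)), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            edges[(i, j)] = score
    return edges


def groups(n, edges):
    """Connected components via union-find; a simple stand-in for community detection."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    components = {}
    for x in range(n):
        components.setdefault(find(x), []).append(x)
    return sorted(components.values())


# Hypothetical sample repository: the first two items ask the same thing.
mcqs = [
    {"question": "Which protocol is used to send email?",
     "answers": ["SMTP", "FTP", "SSH"], "correct": 0},
    {"question": "Which protocol is used to send email messages?",
     "answers": ["HTTP", "SMTP", "DNS"], "correct": 1},
    {"question": "What does CPU stand for?",
     "answers": ["Central Processing Unit", "Core Power Unit", "Central Program Utility"],
     "correct": 0},
]

edges = similarity_graph(mcqs, "with_correct", threshold=0.8)
candidate_duplicates = groups(len(mcqs), edges)
```

Here `candidate_duplicates` groups the two email questions together and leaves the CPU question on its own. In a realistic setting one would swap `embed` for a sentence-embedding model and `groups` for a proper community detection algorithm, then have a human reviewer inspect each group before removing any item.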
Source journal: Journal of Learning Analytics
Subject area: Social Sciences, Education
CiteScore: 7.40
Self-citation rate: 5.10%
Articles published: 25
Latest articles from this journal:
Generative AI and Learning Analytics
NLP-Based Management of Large Multiple-Choice Test Item Repositories
Session-Based Time-Window Identification in Virtual Learning Environments
Effectiveness of a Learning Analytics Dashboard for Increasing Student Engagement Levels
Bayesian Generative Modelling of Student Results in Course Networks