Keeping humans in the loop efficiently by generating question templates instead of questions using AI: Validity evidence on Hybrid AIG.

Medical Teacher (IF 3.3; Zone 2, Education; JCR Q1, EDUCATION, SCIENTIFIC DISCIPLINES). Pub Date: 2024-11-27. DOI: 10.1080/0142159X.2024.2430360
Yavuz Selim Kıyak, Emre Emekli, Özlem Coşkun, Işıl İrem Budakoğlu
Citations: 0

Abstract

Keeping humans in the loop efficiently by generating question templates instead of questions using AI: Validity evidence on Hybrid AIG.

Background: Manually creating multiple-choice questions (MCQ) is inefficient. Automatic item generation (AIG) offers a scalable solution, with two main approaches: template-based and non-template-based (AI-driven). Template-based AIG ensures accuracy but requires significant expert input to develop templates. In contrast, AI-driven AIG can generate questions quickly but with inaccuracies. The Hybrid AIG combines the strengths of both methods. However, neither have MCQs been generated using the Hybrid AIG approach nor has any validity evidence been provided.

Methods: We generated MCQs using the Hybrid AIG approach and investigated the validity evidence of these questions by determining whether experts could identify the correct answers. We used a custom ChatGPT to develop an item template, which was then fed into Gazitor, a template-based AIG (non-AI) software. A panel of medical doctors identified the answers.
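
The template-based step can be illustrated with a minimal sketch. It is not Gazitor's actual input format or algorithm, which the abstract does not describe; the stem, the slot names, the placeholder findings and diagnoses, and the `generate_items` helper are all hypothetical. The idea shown is only that one reviewed template expands deterministically into many questions, each with a known answer key.

```python
from itertools import product

# Hypothetical item template: a stem with named slots (not Gazitor's real format).
STEM = "A {age}-year-old {sex} presents with {finding}. What is the most likely diagnosis?"

# Hypothetical slot values; in a Hybrid AIG workflow an expert would review these.
SLOTS = {
    "age": [25, 60],
    "sex": ["woman", "man"],
    "finding": ["finding A", "finding B"],
}

# Placeholder answer key: in this toy template the finding alone decides the answer.
ANSWER_KEY = {"finding A": "diagnosis A", "finding B": "diagnosis B"}
OPTIONS = ["diagnosis A", "diagnosis B", "diagnosis C", "diagnosis D"]


def generate_items(stem, slots, answer_key, options):
    """Expand every combination of slot values into one multiple-choice item."""
    names = list(slots)
    items = []
    for values in product(*(slots[name] for name in names)):
        filled = dict(zip(names, values))
        items.append({
            "stem": stem.format(**filled),
            "options": list(options),
            "answer": answer_key[filled["finding"]],
        })
    return items


items = generate_items(STEM, SLOTS, ANSWER_KEY, OPTIONS)
print(len(items))        # 2 * 2 * 2 = 8 combinations
print(items[0]["stem"])
```

Because the expansion is rule-based, any error would have to come from the template itself, which is the part a human reviews once.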

Results: Of 105 decisions, 101 (96.2%) matched the software's correct answer. In all MCQs (100%), the experts reached a consensus on the correct answer. The evidence corresponds to the 'Relations to Other Variables' in Messick's validity framework.
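
As a quick check of the reported agreement rate, the arithmetic can be reproduced as below. Only the two counts come from the abstract; the variable names are illustrative, and how the 105 decisions break down by expert and question is not stated, so it is not modeled here.

```python
# Counts reported in the abstract.
matched, total = 101, 105

agreement = matched / total      # fraction of expert decisions matching the key
mismatched = total - matched     # decisions that differed from the software's answer

print(f"{agreement:.1%}")        # 96.2%
print(mismatched)                # 4
```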

Conclusions: The Hybrid AIG approach can enhance the efficiency of MCQ generation while maintaining accuracy. It mitigates concerns about hallucinations while benefiting from AI.

Source journal: Medical Teacher (Medicine: Health Care)
CiteScore: 7.80
Self-citation rate: 8.50%
Articles published: 396
Review time: 3-6 weeks
Journal description: Medical Teacher provides accounts of new teaching methods, guidance on structuring courses and assessing achievement, and serves as a forum for communication between medical teachers and those involved in general education. In particular, the journal recognizes the problems teachers have in keeping up to date with the developments in educational methods that lead to more effective teaching and learning at a time when the content of the curriculum (from medical procedures to policy changes in health care provision) is also changing. The journal features reports of innovation and research in medical education, case studies, survey articles, practical guidelines, reviews of current literature and book reviews. All articles are peer reviewed.