Chatbot responses suggest that hypothetical biology questions are harder than realistic ones

Journal of Microbiology & Biology Education (IF 1.6, Q2, Education, Scientific Disciplines). Pub Date: 2023-11-07. DOI: 10.1128/jmbe.00153-23
Gregory J. Crowther, Usha Sankar, Leena S. Knight, Deborah L. Myers, Kevin T. Patton, Lekelia D. Jenkins, Thomas A. Knight
Citation count: 0

Abstract

The biology education literature includes compelling assertions that unfamiliar problems are especially useful for revealing students’ true understanding of biology. However, there is only limited evidence that such novel problems have different cognitive requirements than more familiar problems. Here, we sought additional evidence by using chatbots based on large language models as models of biology students. For human physiology and cell biology, we developed sets of realistic and hypothetical problems matched to the same lesson learning objectives (LLOs). Problems were considered hypothetical if (i) known biological entities (molecules and organs) were given atypical or counterfactual properties (redefinition) or (ii) fictitious biological entities were introduced (invention). Several chatbots scored significantly worse on hypothetical problems than on realistic problems, with scores declining by an average of 13%. Among hypothetical questions, redefinition questions appeared especially difficult, with many chatbots scoring as if guessing randomly. These results suggest that, for a given LLO, hypothetical problems may have different cognitive demands than realistic problems and may more accurately reveal students’ ability to apply biology core concepts to diverse contexts. The Test Question Templates (TQT) framework, which explicitly connects LLOs with examples of assessment questions, can help educators generate problems that are challenging (due to their novelty), yet fair (due to their alignment with pre-specified LLOs). Finally, ChatGPT’s rapid improvement toward expert-level answers suggests that future educators cannot reasonably expect to ignore or outwit chatbots but must do what we can to make assessments fair and equitable.
Source journal
Journal of Microbiology & Biology Education (Education, Scientific Disciplines)
CiteScore: 3.00
Self-citation rate: 26.30%
Articles published: 95
Review time: 22 weeks