Rubric Development and Validation for Assessing Tasks' Solving via AI Chatbots

IF 2.4 · Q1 · Education & Educational Research · Electronic Journal of e-Learning · Pub Date: 2024-05-17 · DOI: 10.34190/ejel.22.6.3292
Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez
{"title":"通过人工智能聊天机器人评估任务解决情况的评分标准开发与验证","authors":"Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez","doi":"10.34190/ejel.22.6.3292","DOIUrl":null,"url":null,"abstract":"This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. This investigation involved a rigorous process including expert involvement to ensure content validity, as well as the application of statistical tests for assessing internal consistency and reliability. Factor analysis also revealed two significant domains, \"Quality of Content\" and \"Quality of Expression\", which further enhanced the construct validity of the evaluation scale. The results from this investigation robustly affirm the reliability and validity of the developed rubric, thus marking a significant advancement in the sphere of AI chatbot performance evaluation within educational contexts. Nonetheless, the study simultaneously emphasizes the requirement for additional validation research, specifically those entailing a variety of tasks and diverse AI chatbots, to further corroborate these findings. The ramifications of this research are profound, offering both researchers and practitioners engaged in chatbot development and evaluation a comprehensive and validated framework for the assessment of chatbot performance.","PeriodicalId":46105,"journal":{"name":"Electronic Journal of e-Learning","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Rubric Development and Validation for Assessing Tasks' Solving via AI Chatbots\",\"authors\":\"Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez\",\"doi\":\"10.34190/ejel.22.6.3292\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. This investigation involved a rigorous process including expert involvement to ensure content validity, as well as the application of statistical tests for assessing internal consistency and reliability. Factor analysis also revealed two significant domains, \\\"Quality of Content\\\" and \\\"Quality of Expression\\\", which further enhanced the construct validity of the evaluation scale. The results from this investigation robustly affirm the reliability and validity of the developed rubric, thus marking a significant advancement in the sphere of AI chatbot performance evaluation within educational contexts. Nonetheless, the study simultaneously emphasizes the requirement for additional validation research, specifically those entailing a variety of tasks and diverse AI chatbots, to further corroborate these findings. 
The ramifications of this research are profound, offering both researchers and practitioners engaged in chatbot development and evaluation a comprehensive and validated framework for the assessment of chatbot performance.\",\"PeriodicalId\":46105,\"journal\":{\"name\":\"Electronic Journal of e-Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2024-05-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Electronic Journal of e-Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34190/ejel.22.6.3292\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Electronic Journal of e-Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34190/ejel.22.6.3292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0

Abstract

This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. This investigation involved a rigorous process including expert involvement to ensure content validity, as well as the application of statistical tests for assessing internal consistency and reliability. Factor analysis also revealed two significant domains, "Quality of Content" and "Quality of Expression", which further enhanced the construct validity of the evaluation scale. The results from this investigation robustly affirm the reliability and validity of the developed rubric, thus marking a significant advancement in the sphere of AI chatbot performance evaluation within educational contexts. Nonetheless, the study simultaneously emphasizes the requirement for additional validation research, specifically those entailing a variety of tasks and diverse AI chatbots, to further corroborate these findings. The ramifications of this research are profound, offering both researchers and practitioners engaged in chatbot development and evaluation a comprehensive and validated framework for the assessment of chatbot performance.
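The abstract names the statistical checks (internal consistency and a two-factor solution) but gives no computational detail. Below is a minimal, hypothetical Python sketch of how such checks are commonly run: Cronbach's alpha for internal consistency and a two-factor exploratory factor analysis over a response-by-criterion rating matrix. The simulated ratings, the number of rubric criteria (8), and the use of scikit-learn's FactorAnalysis are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch (not the authors' code): internal consistency and a
# two-factor exploratory factor analysis on rubric ratings.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Placeholder data: 60 rated chatbot responses x 8 rubric criteria (scores 1-4).
ratings = rng.integers(1, 5, size=(60, 8)).astype(float)

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

alpha = cronbach_alpha(ratings)

# Two-factor solution, mirroring the two reported domains
# ("Quality of Content" and "Quality of Expression").
fa = FactorAnalysis(n_components=2, random_state=0).fit(ratings)
loadings = fa.components_.T  # one row per rubric criterion, one column per factor

print(f"Cronbach's alpha: {alpha:.2f}")
print("Factor loadings (criteria x 2 factors):")
print(np.round(loadings, 2))
```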
Source journal: Electronic Journal of e-Learning (Education & Educational Research)
CiteScore: 5.90
Self-citation rate: 18.20%
Articles published per year: 34
Review time: 20 weeks