Race with the machines: Assessing the capability of generative AI in solving authentic assessments

Binh Nguyen Thanh, Diem Thi-Ngoc Vo, Minh Nguyen Nhat, Thi Thu Tra Pham, Hieu Thai Trung, Son Ha Xuan

Australasian Journal of Educational Technology. Published 2023-12-22. DOI: https://doi.org/10.14742/ajet.8902
In this study, we introduce a framework designed to help educators assess the effectiveness of popular generative artificial intelligence (AI) tools in solving authentic assessments. We employed Bloom’s taxonomy as a guiding principle to create authentic assessments that evaluate the capabilities of generative AI tools. We applied this framework to assess the abilities of ChatGPT-4, ChatGPT-3.5, Google Bard and Microsoft Bing in solving authentic assessments in economics. We found that generative AI tools perform very well at the lower levels of Bloom's taxonomy while still maintaining a decent level of performance at the higher levels, with “create” being the weakest level of performance. Interestingly, these tools are better able to address numeric-based questions than text-based ones. Moreover, all the generative AI tools exhibit weaknesses in building arguments based on theoretical frameworks, maintaining the coherence of different arguments and providing appropriate references. Our study provides educators with a framework to assess the capabilities of generative AI tools, enabling them to make more informed decisions regarding assessments and learning activities. Our findings demand a strategic reimagining of educational goals and assessments, emphasising higher cognitive skills and calling for a concerted effort to enhance the capabilities of educators in preparing students for a rapidly transforming professional environment.
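The framework's core comparison — scoring each tool's answers at each level of Bloom's taxonomy and locating the weakest level — can be sketched in a few lines. The scores below are illustrative placeholders, not the paper's data, and the function names are our own:

```python
# Hypothetical rubric scores (0-10) for illustration only; not the paper's data.
# Each list holds one tool's scores on authentic-assessment tasks at that level.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyse", "evaluate", "create"]

scores = {
    "remember":   [9, 9, 8, 8],
    "understand": [9, 8, 8, 7],
    "apply":      [8, 8, 7, 7],
    "analyse":    [7, 7, 6, 6],
    "evaluate":   [7, 6, 6, 5],
    "create":     [5, 5, 4, 4],
}

def mean_by_level(scores):
    """Average score per Bloom level across all tools."""
    return {level: sum(vals) / len(vals) for level, vals in scores.items()}

def weakest_level(scores):
    """Bloom level with the lowest mean score."""
    means = mean_by_level(scores)
    return min(means, key=means.get)

print(weakest_level(scores))  # with these illustrative numbers: create
```

With numbers shaped like the paper's findings (strong at lower levels, declining toward the top), the weakest level surfaces as "create", mirroring the reported result.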
Implications for practice or policy
Our proposed framework enables educators to systematically evaluate the capabilities of widely used generative AI tools in assessments and assists them in the assessment design process.
Tertiary institutions should re-evaluate and redesign programme and course learning outcomes. The new focus on learning outcomes should target the higher levels of Bloom’s taxonomy, specifically the “create” level.