Pub Date: 2024-09-17 | DOI: 10.1016/j.caeai.2024.100295
Di Wu, Meng Chen, Xu Chen, Xing Liu
There is growing recognition among researchers and stakeholders of the significant impact of artificial intelligence (AI) technology on classroom instruction. As a crucial element in developing AI literacy, AI education in K-12 schools is gaining increasing attention. However, most existing research on K-12 AI education relies on experiential methodologies and lacks quantitative analysis based on extensive classroom data, hindering a comprehensive depiction of the current state of AI education at these levels. To address this gap, this article employs the advanced semantic understanding capabilities of large language models (LLMs) to create an intelligent analysis framework that identifies learning theories, pedagogical approaches, learning tools, and levels of AI literacy in AI classroom instruction. Compared with the results of manual analysis, the LLM-based analysis achieves more than 90% consistency. Our findings, based on an analysis of 98 classroom instruction videos from central Chinese cities, reveal that current AI classroom instruction insufficiently fosters AI literacy: only 35.71% of lessons address higher-level skills such as evaluating and creating AI, and AI ethics is addressed even less often, featuring in just 5.1% of lessons. We classified AI classroom instruction into three categories: conceptual (50%), heuristic (18.37%), and experimental (31.63%). Correlation analysis suggests a significant relationship between the pedagogical approaches adopted and the development of advanced AI literacy. Specifically, integrating Project-based/Problem-based learning (PBL) with Collaborative learning appears effective in cultivating the capacity to evaluate and create AI.
Title: Analyzing K-12 AI education: A large language model study of classroom instruction on learning theories, pedagogy, tools, and AI literacy
Computers and Education Artificial Intelligence, Volume 7, Article 100295
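The correlation analysis this abstract reports can be illustrated with a small sketch. The coding scheme and toy data below are invented for illustration (the paper's exact statistical procedure is not given here); with each lesson coded as two binary indicators, the phi coefficient is one standard measure of association:

```python
# Hypothetical sketch: each lesson is coded 1/0 for "uses PBL + collaborative
# learning" and 1/0 for "reaches the evaluate/create AI-literacy levels".
# The phi coefficient measures association between the two binary codes.
# The toy data below is illustrative, not the paper's dataset of 98 videos.
import math

def phi_coefficient(x, y):
    """Phi coefficient for two equal-length 0/1 sequences."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

pbl_collab = [1, 1, 1, 0, 0, 0, 1, 0]   # invented lesson codings
eval_create = [1, 1, 0, 0, 0, 0, 1, 0]
print(round(phi_coefficient(pbl_collab, eval_create), 3))
```

A value near 1 would indicate that lessons combining PBL with collaborative learning tend to be the same lessons that reach the evaluate/create levels, matching the association the abstract describes.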
Pub Date: 2024-09-17 | DOI: 10.1016/j.caeai.2024.100302
Jerry Huang, Atsushi Mizumoto
Since the introduction of the L2 Motivational Self System (L2MSS), numerous studies worldwide have highlighted its effectiveness in elucidating second language acquisition. However, the influence of generative artificial intelligence (GenAI) technology on this model remains largely unexplored. The Technology Acceptance Model (TAM) is a widely employed framework for examining the impact of a new technology, and this study explores the intercorrelations when the two models are considered together. Conducted with 35 second-year university English as a foreign language (EFL) students in the humanities, the study involved two sessions of instructor-led ChatGPT writing workshops, followed by the collection of survey responses. Data analysis revealed a notable correlation between the L2 Motivational Self System and the Technology Acceptance Model. Particularly noteworthy is the finding that Ought-to L2 Self positively predicts Actual Usage. The study discusses pedagogical and theoretical implications, along with future research directions.
Title: Examining the relationship between the L2 motivational self system and technology acceptance model post ChatGPT introduction and utilization
Computers and Education Artificial Intelligence, Volume 7, Article 100302
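The scale-level correlation this study reports (e.g. Ought-to L2 Self with Actual Usage) is typically computed as a Pearson coefficient between per-student mean scores. A minimal sketch, with invented 5-point-scale values rather than the study's data:

```python
# Toy sketch of correlating two survey scales. The per-student scores below
# are invented for illustration; variable names echo the constructs in the
# abstract but are not the study's dataset.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

ought_to_l2_self = [3.2, 4.1, 2.8, 4.5, 3.9]   # invented scale means
actual_usage     = [2.9, 4.3, 2.5, 4.8, 3.6]
print(round(pearson_r(ought_to_l2_self, actual_usage), 2))
```

A positive r of this kind is what would underlie the finding that Ought-to L2 Self predicts Actual Usage, though the study may also have used regression rather than simple correlation.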
Pub Date: 2024-09-17 | DOI: 10.1016/j.caeai.2024.100293
Helia Farhood, Ibrahim Joudah, Amin Beheshti, Samuel Muller
Predicting student outcomes is essential in educational analytics for creating personalised learning experiences. The effectiveness of these predictive models relies on having access to sufficient and accurate data. However, privacy concerns and the lack of student consent often restrict data collection, limiting the applicability of predictive models. To tackle this obstacle, we employ Generative Adversarial Networks, a type of generative AI, to generate tabular data replicating and enlarging the dimensions of two distinct publicly available student datasets. The ‘Math dataset’ has 395 observations and 33 features, whereas the ‘Exam dataset’ has 1000 observations and 8 features. Using advanced Python libraries implementing Conditional Tabular Generative Adversarial Networks and Copula Generative Adversarial Networks, our methodology consists of two phases. First, a mirroring approach produces synthetic data matching the volume of the real datasets, focusing on privacy and evaluating predictive accuracy. Second, an augmenting approach enlarges the real datasets with newly created synthetic observations to fill gaps in datasets that lack student data. Before employing these approaches, we validate the synthetic data using Correlation Analysis, Density Analysis, Correlation Heatmaps, and Principal Component Analysis. We then compare the predictive accuracy of whether students will pass or fail their exams across original, synthetic, and augmented datasets. Employing Feedforward Neural Networks, Convolutional Neural Networks, and Gradient-boosted Neural Networks, and using Bayesian optimisation for hyperparameter tuning, this research methodically examines the impact of synthetic data on prediction accuracy. We implement and optimise these models in Python. Our mirroring approach aims to achieve accuracy rates that closely align with those from the original data, while our augmenting approach seeks a slightly higher accuracy level than learning from the original data alone.
Our findings provide actionable insights into leveraging advanced Generative AI techniques to enhance educational outcomes and meet our objectives successfully.
Title: Advancing student outcome predictions through generative adversarial networks
Computers and Education Artificial Intelligence, Volume 7, Article 100293
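The two evaluation phases described above (mirroring vs. augmenting) can be sketched as a small protocol. This is an illustration only: a real pipeline would synthesize rows with a GAN library such as the `ctgan` package, whereas the `synthesize` stub here just resamples real rows, and the "model" is a trivial majority-label predictor, so the code shows the protocol's shape rather than an actual GAN:

```python
# Hedged sketch of the mirroring/augmenting evaluation protocol. All data,
# the synthesizer stub, and the toy classifier are assumptions for
# illustration; none of this is the paper's implementation.
import random

def synthesize(rows, n, seed=0):
    """Stand-in for a tabular GAN: sample n 'synthetic' rows from real ones."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(n)]

def train_and_score(train_rows, test_rows):
    """Toy 'model': predict the majority pass/fail label seen in training."""
    majority = sum(lbl for _, lbl in train_rows) * 2 >= len(train_rows)
    return sum(lbl == majority for _, lbl in test_rows) / len(test_rows)

real = [({"hours": h}, h >= 5) for h in range(10)]   # (features, passed?)
test = [({"hours": h}, h >= 5) for h in range(3, 9)]

mirror = synthesize(real, len(real))                 # phase 1: same volume
augmented = real + synthesize(real, 5, seed=1)       # phase 2: enlarged
print(train_and_score(mirror, test), train_and_score(augmented, test))
```

The mirroring phase asks whether a model trained only on synthetic rows scores close to one trained on the real rows; the augmenting phase asks whether adding synthetic rows nudges accuracy above the real-only baseline.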
Pub Date: 2024-09-14 | DOI: 10.1016/j.caeai.2024.100296
Judit Martínez-Moreno, Dominik Petko
Artificial Intelligence in Education (AIEd) is reshaping not only the educational landscape but also, potentially, the motivations of aspiring teachers. This paper explores whether considerations related to AIEd play a role in student teachers' decision to become teachers. To this end, the study introduces a new AI subscale within the (D)FIT-Choice scale's Social Utility Value (SUV) factor and validates it with a sample of 183 student teachers. Descriptive statistics reveal high mean scores for traditional motivators like Intrinsic Value Teaching, while AI-related factors, although considered, exhibit lower influence. A noticeable disconnect exists between digital motivations and the aspiration to shape the future, suggesting a potential gap in student teachers' understanding of digitalization's future impact. An extreme group analysis reveals a subset of student teachers who give AI significant consideration. This group also values the Job Security and Make a Social Contribution factors, suggesting an awareness of AI's societal and professional impacts. Based on these findings, we recommend that teacher education programs ensure student teachers understand the impact of AI on education and society.
Title: What motivates future teachers? The influence of artificial intelligence on student teachers' career choice
Computers and Education Artificial Intelligence, Volume 7, Article 100296
Pub Date: 2024-09-13 | DOI: 10.1016/j.caeai.2024.100297
Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim
Despite the development of various AI systems to support learning across domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, often perceived as an unfamiliar and challenging endeavor for most students, can be made more accessible by a generative-AI-enabled conversation partner that provides tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach involved design and development research, iteratively refining the application into a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we established a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. The performance of LLaVA-Docent was evaluated by benchmarking it against alternative settings, revealing its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability for a novel approach in which art appreciation is taught and experienced.
Title: LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
Computers and Education Artificial Intelligence, Volume 7, Article 100297
Pub Date: 2024-09-12 | DOI: 10.1016/j.caeai.2024.100298
Said Al Faraby, Ade Romadhony, Adiwijaya
Large language models (LLMs) like ChatGPT have shown promise in generating educational content, including questions. This study evaluates the effectiveness of LLMs in classifying and generating educational-type questions. We assessed ChatGPT's performance using a dataset of 4,959 user-generated questions labeled into ten categories, employing various prompting techniques and aggregating results with a voting method to enhance robustness. Additionally, we evaluated ChatGPT's accuracy in generating type-specific questions from 100 reading sections sourced from five online textbooks, which were manually reviewed by human evaluators. We also generated questions based on learning objectives and compared their quality to those crafted by human experts, with evaluations by experts and crowdsourced participants.
Our findings reveal that ChatGPT achieved a macro-average F1-score of 0.57 in zero-shot classification, improving to 0.70 when combined with a Random Forest classifier using embeddings. The most effective prompting technique was zero-shot with added definitions, while few-shot and few-shot + Chain of Thought approaches underperformed. The voting method enhanced robustness in classification. In generating type-specific questions, ChatGPT's accuracy was lower than anticipated. However, quality differences between ChatGPT-generated and human-generated questions were not statistically significant, indicating ChatGPT's potential for educational content creation. This study underscores the transformative potential of LLMs in educational practices. By effectively classifying and generating high-quality educational questions, LLMs can reduce the workload on educators and enable personalized learning experiences.
Title: Analysis of LLMs for educational question classification and generation
Computers and Education Artificial Intelligence, Volume 7, Article 100298
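The voting step and the macro-averaged F1 metric in this abstract can be sketched concisely. The label names and runs below are invented; the sketch only assumes what the abstract states: several prompted runs classify each question, a majority vote picks the final label, and macro-F1 averages per-class F1 so that rare question types count equally:

```python
# Sketch (assumed, not the paper's exact procedure) of vote aggregation
# across prompting runs and macro-F1 scoring. Toy labels for illustration.
from collections import Counter

def majority_vote(runs):
    """runs: one label list per prompting run, aligned by question."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*runs)]

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    f1s = []
    for lab in set(gold) | set(pred):
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

runs = [["recall", "why", "recall"],   # run 1
        ["recall", "why", "why"],      # run 2
        ["recall", "how", "recall"]]   # run 3
gold = ["recall", "why", "recall"]
pred = majority_vote(runs)
print(pred, round(macro_f1(gold, pred), 2))
```

Because macro-F1 weights every class equally, a classifier that ignores rare question types is penalized more than plain accuracy would suggest, which is why the abstract reports macro rather than micro averages.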
Pub Date: 2024-09-12 | DOI: 10.1016/j.caeai.2024.100294
Clare Baek, Tamara Tate, Mark Warschauer
This study investigates how U.S. college students (N = 1001) perceive and use ChatGPT, exploring its relationship with societal structures and student characteristics. Regression results show that gender, age, major, institution type, and institutional policy significantly influenced ChatGPT use for general, writing, and programming tasks. Students in their 30s–40s were more likely to use ChatGPT frequently than younger students. Non-native English speakers were more likely than native speakers to use ChatGPT frequently for writing, suggesting its potential as a support tool for language learners. Institutional policies allowing ChatGPT use predicted higher use of ChatGPT. Thematic analysis and natural language processing of open-ended responses revealed varied attitudes towards ChatGPT, with some fearing institutional punishment for using ChatGPT and others confident in their appropriate use of ChatGPT. Computer science majors expressed concerns about job displacement due to the advent of generative AI. Higher-income students generally viewed ChatGPT more positively than their lower-income counterparts. Our research underscores how technology can both empower and marginalize within educational settings; we advocate for equitable integration of AI in academic environments for diverse students.
Title: “ChatGPT seems too good to be true”: College students’ use and perceptions of generative AI
Computers and Education Artificial Intelligence, Volume 7, Article 100294
Pub Date: 2024-09-11 | DOI: 10.1016/j.caeai.2024.100289
Stanislav Pozdniakov, Jonathan Brazil, Solmaz Abdi, Aneesha Bakharia, Shazia Sadiq, Dragan Gašević, Paul Denny, Hassan Khosravi
Incorporating Generative Artificial Intelligence (GenAI), especially Large Language Models (LLMs), into educational settings presents valuable opportunities to boost the efficiency of educators and enrich the learning experiences of students. A significant portion of educators' current use of LLMs has involved conversational user interfaces (CUIs), such as chat windows, for functions like generating educational materials or offering feedback to learners. The ability to engage in real-time conversations with LLMs, which can enhance educators' domain knowledge across various subjects, has been of high value. However, it also presents challenges to the widespread, ethical, and effective adoption of LLMs. Firstly, educators must have a degree of expertise, including tool familiarity, AI literacy, and prompting skills, to use CUIs effectively, which can be a barrier to adoption. Secondly, the open-ended design of CUIs makes them exceptionally powerful, which raises ethical concerns, particularly when used for high-stakes decisions like grading. Additionally, there are risks related to privacy and intellectual property, stemming from the potential unauthorised sharing of sensitive information. Finally, CUIs are designed for short, synchronous interactions and often struggle and hallucinate when given complex, multi-step tasks (e.g., providing individual feedback based on a rubric at scale). To address these challenges, we explored the benefits of transitioning away from employing LLMs via CUIs to creating applications with user-friendly interfaces that leverage LLMs through API calls. We first propose a framework for pedagogically sound and ethically responsible incorporation of GenAI into educational tools, emphasizing a human-centred design.
We then illustrate the application of our framework to the design and implementation of a novel tool called Feedback Copilot, which enables instructors to provide students with personalized qualitative feedback on their assignments in classes of any size. An evaluation involving the generation of feedback from two distinct variations of the Feedback Copilot tool, using numerically graded assignments from 338 students, demonstrates the viability and effectiveness of our approach. Our findings have significant implications for GenAI application researchers, educators seeking to leverage accessible GenAI tools, and educational technologists aiming to transcend the limitations of conversational AI interfaces, thereby charting a course for the future of GenAI in education.
"Large language models meet user interfaces: The case of provisioning feedback" — Stanislav Pozdniakov, Jonathan Brazil, Solmaz Abdi, Aneesha Bakharia, Shazia Sadiq, Dragan Gašević, Paul Denny, Hassan Khosravi. DOI: 10.1016/j.caeai.2024.100289. Computers and Education: Artificial Intelligence, vol. 7, Article 100289 (open access).
Pub Date: 2024-09-11 | DOI: 10.1016/j.caeai.2024.100299
Wei Dai , Yi-Shan Tsai , Jionghao Lin , Ahmad Aldino , Hua Jin , Tongguang Li , Dragan Gašević , Guanliang Chen
Assessment feedback is important to student learning. Learning analytics (LA) powered by artificial intelligence exhibits profound potential in helping instructors with the laborious provision of feedback. Inspired by the recent advancements made by Generative Pre-trained Transformer (GPT) models, we conducted a study to examine the extent to which GPT models hold the potential to advance the existing knowledge of LA-supported feedback systems towards improving the efficiency of feedback provision. To this end, our study explored the ability of two versions of GPT models – i.e., GPT-3.5 (ChatGPT) and GPT-4 – to generate assessment feedback on students' writing assessment tasks, common in higher education, with open-ended topics for a data science-related course. We compared the feedback generated by GPT models (namely GPT-3.5 and GPT-4) with the feedback provided by human instructors in terms of readability, effectiveness (content containing effective feedback components), and reliability (correct assessment of student performance). Results showed that (1) both GPT-3.5 and GPT-4 were able to generate more readable feedback with greater consistency than human instructors, (2) GPT-4 outperformed GPT-3.5 and human instructors in providing feedback containing information about effective feedback dimensions, including feeding-up, feeding-forward, process level, and self-regulation level, and (3) GPT-4 demonstrated higher reliability of feedback compared to GPT-3.5. Based on our findings, we discussed the potential opportunities and challenges of utilising GPT models in assessment feedback generation.
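The abstract does not state which readability measure was used, but a common proxy for comparing feedback texts is the Flesch Reading Ease score; the sketch below is an assumed illustration of that kind of comparison, with a crude vowel-group heuristic standing in for proper syllable counting.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: count groups of consecutive vowels (minimum 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.
    FRE = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n_sent, n_words = max(1, len(sentences)), max(1, len(words))
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (syllables / n_words)


# Two hypothetical feedback snippets: a plain one and a jargon-heavy one.
simple = "The essay is clear. The claims are well supported."
dense = "The argumentation demonstrates insufficient epistemological sophistication."
```

Scoring each model's feedback with a formula like this, and checking the spread of scores across students, is one way to operationalise the "more readable with greater consistency" comparison.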
"Assessing the proficiency of large language models in automatic feedback generation: An evaluation study" — Wei Dai, Yi-Shan Tsai, Jionghao Lin, Ahmad Aldino, Hua Jin, Tongguang Li, Dragan Gašević, Guanliang Chen. DOI: 10.1016/j.caeai.2024.100299. Computers and Education: Artificial Intelligence, vol. 7, Article 100299 (open access).
Pub Date: 2024-09-06 | DOI: 10.1016/j.caeai.2024.100292
Cheng Ning Loong, Chih-Chen Chang
A student's learning system is a system that guides the student's knowledge acquisition process using available learning resources to produce certain learning outcomes that can be evaluated based on the scores of questions in an assessment. Such a learning system is analogous to a control system, which regulates the process of a plant through a controller in order to generate a desired response that can be inferred from sensor measurements. Inspired by this analogy, this study proposes to model the monitoring of students' knowledge acquisition process from a control-theory viewpoint, which is referred to as control knowledge tracing (CtrKT). The proposed CtrKT comprises a dynamic equation that characterizes the temporal variation of students' knowledge states in response to the effects of learning resources and an observation equation that maps their knowledge states to question scores. With this formulation, CtrKT enables tracking students' knowledge states, predicting their assessment performance, and teaching planning. The insights and accuracy of CtrKT in postulating the knowledge acquisition process are analyzed and validated using experimental data from psychology literature and two naturalistic datasets collected from a civil engineering undergraduate course. Results verify the feasibility of using CtrKT to estimate the overall assessment performance of the participants in the psychology experiments and the students in the naturalistic datasets. Lastly, this study explores the use of CtrKT for teaching scheduling and optimization, discusses its modeling issues, and compares it with other knowledge-tracing approaches.
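The control-system analogy described here, a dynamic equation for the knowledge state plus an observation equation mapping states to question scores, corresponds to a discrete linear state-space model. The sketch below is illustrative only: the matrices and values are assumptions for demonstration, not the paper's calibrated CtrKT model.

```python
# Minimal discrete linear state-space sketch of the two equations:
#   x[t+1] = A x[t] + B u[t]   -- dynamic equation: the knowledge state
#                                 evolves under learning-resource input u[t]
#   y[t]   = C x[t]            -- observation equation: question scores
#                                 measure the hidden knowledge state
# All matrix values below are assumed for illustration.

A = [[0.90, 0.00],   # retention (diagonal < 1 models gradual forgetting)
     [0.00, 0.85]]
B = [[0.30, 0.00],   # resource 1 trains component 1; resource 2, component 2
     [0.00, 0.40]]
C = [[1.00, 0.00],   # question 1 loads only on knowledge component 1
     [0.50, 0.50]]   # question 2 mixes both components


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate(x0, inputs):
    """Roll the knowledge state forward, recording predicted question scores."""
    x, scores = list(x0), []
    for u in inputs:
        x = [a + b for a, b in zip(mat_vec(A, x), mat_vec(B, u))]
        scores.append(mat_vec(C, x))
    return x, scores


# Three study sessions using learning resource 1 only, then one with resource 2.
x_final, y = simulate([0.0, 0.0], [[1, 0], [1, 0], [1, 0], [0, 1]])
```

In this framing, teaching planning becomes a control problem: choose the input sequence u[t] (which resources to deploy, and when) that drives the predicted scores y[t] toward a target.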
"Control knowledge tracing: Modeling students' learning dynamics from a control-theory perspective" — Cheng Ning Loong, Chih-Chen Chang. DOI: 10.1016/j.caeai.2024.100292. Computers and Education: Artificial Intelligence, vol. 7, Article 100292 (open access).