Evaluating the potential of ChatGPT-reformulated essays as written feedback in L2 writing
Pub Date: 2025-11-17 | DOI: 10.1016/j.caeai.2025.100500
Yingzhao Chen
Reformulation is a form of written corrective feedback to help second language (L2) learners improve their writing. This study examined whether ChatGPT could produce reformulations that (1) retain the meanings of the original essays and (2) are linguistically more developed than learners’ original essays. In addition, three types of ChatGPT prompts were compared to see which type yielded better reformulations. One thousand two hundred argumentative essays written for the TOEFL iBT® independent writing task were submitted to ChatGPT. ROUGE-L scores, used as a proxy for meaning retention, showed that ChatGPT reformulations largely retained the meaning of the original essays. A qualitative analysis was also conducted to examine the major types of changes ChatGPT made. For linguistic features, the ChatGPT reformulations were compared with the original essays for syntactic complexity, lexical sophistication, lexical diversity, and cohesion. Results showed that while ChatGPT reformulations were more developed for most linguistic features than the original essays, the reformulations did worse in cohesion. ChatGPT prompts with specific instructions produced reformulations with more developed linguistic features than a generic prompt. Findings are discussed in terms of how to use ChatGPT to generate reformulations and how to use the reformulations to improve L2 writing.
{"title":"Evaluating the potential of ChatGPT-reformulated essays as written feedback in L2 writing","authors":"Yingzhao Chen","doi":"10.1016/j.caeai.2025.100500","DOIUrl":"10.1016/j.caeai.2025.100500","url":null,"abstract":"<div><div>Reformulation is a form of written corrective feedback to help second language (L2) learners improve their writing. This study examined whether ChatGPT could produce reformulations that (1) retain the meanings of the original essays and (2) are linguistically more developed than learners’ original essays. In addition, three types of ChatGPT prompts were compared to see which type yielded better reformulations. One thousand two hundred argumentative essays written for the TOEFL iBT® independent writing task were submitted to ChatGPT. ROUGE-L scores, used as a proxy for meaning retention, showed that ChatGPT reformulations largely retained the meaning of the original essays. A qualitative examination was conducted to examine the major types of changes ChatGPT made. For linguistic features, the ChatGPT reformulations were compared with the original essays for syntactic complexity, lexical sophistication, lexical diversity, and cohesion. Results showed that while ChatGPT reformulations were more developed for most linguistic features than the original essays, the reformulations did worse in cohesion. ChatGPT prompts with specific instructions produced reformulations with more developed linguistic features than a generic prompt. Findings were discussed in terms of how to use ChatGPT to generate reformulations and how to use the reformulations to improve L2 writing.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100500"},"PeriodicalIF":0.0,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145568214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An exploration of the role of generative AI in fostering creativity in architectural learning environments
Pub Date: 2025-11-17 | DOI: 10.1016/j.caeai.2025.100501
Carlos Medel-Vera , Sandy Britton , William Francis Gates
This paper explores the role of generative AI (GenAI) in supporting creativity within architectural education through the lens of a student-led AI drawing competition. The research addresses two questions: (1) how creative are students' text prompts and the resulting AI-generated images, and is there a relationship between them? and (2) to what extent do students perceive GenAI as a supportive tool in their creative process? Drawing on a mixed-methods approach, the study combines semantic analysis of text prompts, aesthetic evaluation of AI-generated images, and a Creativity Support Index (CSI) survey, complemented by sentiment analysis of student feedback. The semantic analysis reveals varying levels of conceptual richness across prompts, with higher divergence correlating to more open-ended and expressive image results. The CSI data indicates strong support for exploratory and goal-directed creativity, with high scores in exploration and results-worth-effort dimensions. These findings suggest that GenAI can function as both a collaborator and provocateur in design pedagogy, facilitating creative ideation while inviting new pedagogical strategies centred on prompt literacy and reflective design. The study concludes by discussing implications for integrating AI tools into design education, emphasising the pedagogical value of prompt literacy, and calling for further research on creative agency and authorship in hybrid human–AI workflows.
{"title":"An exploration of the role of generative AI in fostering creativity in architectural learning environments","authors":"Carlos Medel-Vera , Sandy Britton , William Francis Gates","doi":"10.1016/j.caeai.2025.100501","DOIUrl":"10.1016/j.caeai.2025.100501","url":null,"abstract":"<div><div>This paper explores the role of generative AI (GenAI) in supporting creativity within architectural education through the lens of a student-led AI drawing competition. The research addresses two questions: (1) how creative are students' text prompts and the resulting AI-generated images, and is there a relationship between them? and (2) to what extent do students perceive GenAI as a supportive tool in their creative process? Drawing on a mixed-methods approach, the study combines semantic analysis of text prompts, aesthetic evaluation of AI-generated images, and a Creativity Support Index (CSI) survey, complemented by sentiment analysis of student feedback. The semantic analysis reveals varying levels of conceptual richness across prompts, with higher divergence correlating to more open-ended and expressive image results. The CSI data indicates strong support for exploratory and goal-directed creativity, with high scores in exploration and results-worth-effort dimensions. These findings suggest that GenAI can function as both a collaborator and provocateur in design pedagogy, facilitating creative ideation while inviting new pedagogical strategies centred on prompt literacy and reflective design. The study concludes by discussing implications for integrating AI tools into design education, emphasising the pedagogical value of prompt literacy, and calling for further research on creative agency and authorship in hybrid human–AI workflows.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100501"},"PeriodicalIF":0.0,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145568751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trajectories of AI policy in higher education: Interpretations, discourses, and enactments of students and teachers
Pub Date: 2025-11-15 | DOI: 10.1016/j.caeai.2025.100496
Jack Tsao
Generative artificial intelligence (GenAI) in higher education has introduced a spectrum of ethical challenges, significantly impacting learning outcomes, pedagogies, and assessments. The study draws on qualitative interviews conducted in early 2025 with 58 undergraduate and graduate students and 12 teachers at a research-intensive university in Hong Kong. Through the concept of policy trajectories (Ball, 1993; Ball et al., 2012), the research analyses the interconnections between material contexts and discursive constructions in how AI policies (and their absence) are framed, interpreted, enacted, and resisted. The findings reveal general concerns about academic integrity, fairness, equity, privacy, and data security, including, in particular, the invisible labour in dealing with ambiguous policies, uneven enforcement strategies, loopholes to avoid detection, disparities in access to state-of-the-art tools, and the cognitive and other developmental impacts due to overreliance on GenAI tools. Institutional ambiguity in policy supported experimentation and the appearance of progress, but risked individualising failure onto teachers and students. Actionable insights for university leaders and policymakers, teaching development centres, and individual teachers and programme coordinators include clearer messaging, the need for adaptive policies and guidelines with ongoing student and teacher participation, the availability of digital libraries of toolkits, case studies and other resources, building in early “failure experiences”, and exposing students to authentic real-world applications and encounters to cultivate awareness of the limitations of GenAI. Ultimately, policy responses need to be both contextually and pragmatically sensitive, requiring on-the-ground experimentation and care by teachers.
{"title":"Trajectories of AI policy in higher education: Interpretations, discourses, and enactments of students and teachers","authors":"Jack Tsao","doi":"10.1016/j.caeai.2025.100496","DOIUrl":"10.1016/j.caeai.2025.100496","url":null,"abstract":"<div><div>Generative artificial intelligence (GenAI) in higher education has introduced a spectrum of ethical challenges, significantly impacting learning outcomes, pedagogies, and assessments. Based on the experiences and perspectives of students and teachers at a research-intensive university in Hong Kong, the study draws on qualitative interview data with 58 undergraduate and graduate students and 12 teachers conducted in early 2025. Through the concept of policy trajectories (Ball, 1993; Ball et al., 2012), the research analyses the interconnections between material contexts and discursive constructions in how AI policies (and their absence) are framed, interpreted, enacted, and resisted. The findings reveal general concerns about academic integrity, fairness, equity, privacy, and data security, including specifically the invisible labour in dealing with ambiguous policies, uneven enforcement strategies, loopholes to avoid detection, disparities in access to state-of-the-art tools, and the cognitive and other developmental impacts due to overreliance on GenAI tools. Institutional ambiguity in policy supported experimentation and the appearance of progress, but risked individualising failure on teachers and students. Some actionable insights for university leaders and policymakers, teaching development centres, and individual teachers and programme coordinators include clearer messaging, the need for adaptive policies and guidelines with ongoing student and teacher participation, availability of digital libraries of toolkits, case studies and other resources, building in early “failure experiences”, and exposing students to authentic real-world applications and encounters to cultivate awareness on the limitations of GenAI. Ultimately, policy responses need to be both contextually and pragmatically sensitive, requiring on-the-ground experimentation and care by teachers.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100496"},"PeriodicalIF":0.0,"publicationDate":"2025-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145568745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessing students’ DRIVE: A framework to evaluate learning through interactions with generative AI
Pub Date: 2025-11-13 | DOI: 10.1016/j.caeai.2025.100497
Manuel Oliveira, Carlos Zednik, Gunter Bombaerts, Bert Sadowski, Rianne Conijn
As generative AI (GenAI) transforms how students learn and work, higher education must rethink its assessment strategies. This paper introduces a conceptual framework, DRIVE, and a taxonomy to help educators evaluate student learning based on their interactions with GenAI chatbots. Although existing research maps student-GenAI interactions to writing outcomes, practice-oriented tools for assessing evidence of domain-specific learning beyond general AI literacy skills or general writing skills remain underexplored. We propose that GenAI interactions can serve as a valid indicator of learning by revealing how students steer the interaction (Directive Reasoning Interaction) and articulate acquired knowledge in the dialogue with AI (Visible Expertise). We conducted a multi-methods analysis of GenAI interaction annotations (n = 1450) from graded essays (n = 70) in STEM writing-intensive courses. A strong positive correlation was found between the quality of GenAI interactions and final essay scores, validating the feasibility of this assessment approach. Furthermore, our taxonomy revealed distinct GenAI interaction profiles: High essay scores were connected to a “targeted improvement partnership” focused on text refinement, whereas high interaction scores were linked to a “collaborative intellectual partnership” centered on idea development. In contrast, below-average scores were associated with “basic information retrieval” or “passive task delegation” profiles. These findings demonstrate how the assessment method (output vs. process focus) may shape students’ GenAI usage. Traditional assessment can reinforce text optimization, while process-focused evaluation may reward an exploratory partnership with AI. The DRIVE framework and the taxonomy offer educators and researchers a practical tool to design assessments that capture learning in AI-integrated classrooms.
{"title":"Assessing students’ DRIVE: A framework to evaluate learning through interactions with generative AI","authors":"Manuel Oliveira, Carlos Zednik, Gunter Bombaerts, Bert Sadowski, Rianne Conijn","doi":"10.1016/j.caeai.2025.100497","DOIUrl":"10.1016/j.caeai.2025.100497","url":null,"abstract":"<div><div>As generative AI (GenAI) transforms how students learn and work, higher education must rethink its assessment strategies. This paper introduces a conceptual framework, DRIVE, and a taxonomy to help educators evaluate student learning based on their interactions with GenAI chatbots. Although existing research maps student-GenAI interactions to writing outcomes, practice-oriented tools for assessing evidence of domain-specific learning beyond general AI literacy skills or general writing skills remain underexplored. We propose that GenAI interactions can serve as a valid indicator of learning by revealing how students steer the interaction (Directive Reasoning Interaction) and articulate acquired knowledge into the dialogue with AI (Visible Expertise). We conducted a multi-methods analysis of GenAI interaction annotations (<em>n</em> = 1450) from graded essays (<em>n</em> = 70) in STEM writing-intensive courses. A strong positive correlation was found between the quality GenAI interactions and final essay scores, validating the feasibility of this assessment approach. Furthermore, our taxonomy revealed distinct GenAI interaction profiles: High essay scores were connected to a ”targeted improvement partnership” focused on text refinement, whereas high interaction scores were linked to a ”collaborative intellectual partnership” centered on idea development. In contrast, below-average scores were associated with ”basic information retrieval” or ”passive task delegation” profiles. These findings demonstrate how the assessment method (output vs. process focus) may shape students’ GenAI usage. Traditional assessment can reinforce text optimization, while process-focused evaluation may reward an exploratory partnership with AI. The DRIVE framework and the taxonomy offer educators and researchers a practical tool to design assessments that capture learning in AI-integrated classrooms.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100497"},"PeriodicalIF":0.0,"publicationDate":"2025-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145519443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Digital equity and computational thinking privilege: The case of first-year engineering and computing students' attitudes towards artificial intelligence
Pub Date: 2025-11-01 | DOI: 10.1016/j.caeai.2025.100495
Noemi V. Mendoza Diaz , So Yoon Yoon , Nancy Gertrudiz Salvador
Attitudes can constitute barriers to engineering, computing, and artificial intelligence (AI) enculturation, contributing to and resulting from digital inequity. Building upon research on computational thinking privilege, we explored first-year students' (a) perceived future impact of AI on their career prospects and (b) backgrounds (e.g., gender, underrepresented minority (URM) status, and First-Generation status) associated with their attitudes toward AI, computational thinking, and course performance. Computational thinking was measured using our newly validated Engineering Computational Thinking Diagnostic (ECTD), while course performance was assessed based on final grades in an introductory computing course at a Southwestern institution—the first coding experience for many students. For the fall 2021 participant cohort of 163 first-year engineering and computing students, 40.9 % expressed positive attitudes toward AI in their career prospects, with 48.9 % of them having prior computer science course experience. Regarding their backgrounds, the number of CS courses taken before college significantly correlated with their attitudes toward AI, ECTD scores, and course grades—irrespective of gender, URM status, residence, First-Generation, or First-Time-in-College status. These findings support the notion that computational thinking privilege, shaped by prior exposure and access to resources, contributes to digital inequity and influences attitudes. Specifically, students' cognitive attitudes toward AI have the potential to shape AI literacy and education, potentially perpetuating inequities in an increasingly AI-driven world.
{"title":"Digital equity and computational thinking privilege: The case of first-year engineering and computing students' attitudes towards artificial intelligence","authors":"Noemi V. Mendoza Diaz , So Yoon Yoon , Nancy Gertrudiz Salvador","doi":"10.1016/j.caeai.2025.100495","DOIUrl":"10.1016/j.caeai.2025.100495","url":null,"abstract":"<div><div>Attitudes can constitute barriers to engineering, computing, and artificial intelligence (AI) enculturation, contributing to and resulting from digital inequity. Building upon research on computational thinking privilege, we explored first-year students' (a) perceived future impact of AI on their career prospects and (b) backgrounds (e.g., gender, underrepresented minority (URM) status, and First-Generation status) associated with their attitudes toward AI, computational thinking, and course performance. Computational thinking was measured using our newly validated Engineering Computational Thinking Diagnostic (ECTD), while course performance was assessed based on final grades in an introductory computing course at a Southwestern institution—the first coding experience for many students. For the fall 2021 participant cohort of 163 first-year engineering and computing students, 40.9 % expressed positive attitudes toward AI in their career prospects, with 48.9 % of them having prior computer science course experience. Regarding their backgrounds, the number of CS courses taken before college significantly correlated with their attitudes toward AI, ECTD scores, and course grades—irrespective of gender, URM status, residence, First-Generation, or First-Time-in-College status. These findings support the notion that computational thinking privilege, shaped by prior exposure and access to resources, contributes to digital inequity and influences attitudes. Specifically, students' cognitive attitudes toward AI have the potential to shape AI literacy and education, potentially perpetuating inequities in an increasingly AI-driven world.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100495"},"PeriodicalIF":0.0,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145568752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conceptualizing AI literacies for children and youth: A systematic review on the design of AI literacy educational programs
Pub Date: 2025-10-30 | DOI: 10.1016/j.caeai.2025.100491
Osnat Atias, Areej Mawasi
The growing presence of Artificial Intelligence (AI) in society increases the exposure of children and youth to these technologies. In response, recent research has introduced educational programs that foster AI knowledge and competencies, collectively comprising AI literacy. This study presents a systematic review of 23 articles published up to 2023 describing AI literacy programs for children and youth. We examined: (1) motivations for teaching AI literacy, (2) conceptualizations of AI literacy that informed program design, and (3) learning theories and pedagogical methods employed. The analysis identified five motivational themes: workforce, informed users, purposeful creators, advocacy, and social good. Seventeen AI literacy frameworks and conceptual models were identified and grouped into four themes: competency-based, computational, sociotechnical, and practice-based. Application of a three-dimensional model of literacy (operational, sociocultural, and critical) shows that the operational dimension predominates in both frameworks and program designs, the sociocultural dimension is less accentuated, and the critical dimension is least evident. Cognitive constructivism emerged as the dominant learning theory guiding program design, often supported by hands-on activities and project-based learning methods. This systematic review advances understanding of the conceptual drivers shaping AI literacy programs for children and youth. The findings highlight the need for stronger conceptualizations of sociocultural and critical AI literacies and for their more balanced integration into educational programs. Addressing these gaps would better support broad motivations for teaching AI to children and youth, such as fostering social and ethical understanding and agency, and guide future research towards more comprehensive and critically informed frameworks.
{"title":"Conceptualizing AI literacies for children and youth: A systematic review on the design of AI literacy educational programs","authors":"Osnat Atias, Areej Mawasi","doi":"10.1016/j.caeai.2025.100491","DOIUrl":"10.1016/j.caeai.2025.100491","url":null,"abstract":"<div><div>The growing presence of Artificial Intelligence (AI) in society increases the exposure of children and youth to these technologies. In response, recent research introduced educational programs that foster AI knowledge and competencies, collectively comprising AI literacy. This study presents a systematic review of 23 articles published up to 2023 describing AI literacy programs for children and youth. We examined: (1) motivations for teaching AI literacy, (2) conceptualizations of AI literacy that informed program design, and (3) learning theories and pedagogical methods employed. The analysis identified five motivational themes: workforce, informed users, purposeful creators, advocacy, and social good. Seventeen AI literacy frameworks and conceptual models were identified and grouped into four themes: competency-based, computational, sociotechnical, and practice-based. Application of a three-dimensional model of literacy (operational, sociocultural, and critical), shows that the operational dimension predominates in both frameworks and program designs, the sociocultural dimension is less accentuated, and the critical dimension is least evident. Cognitive constructivism emerged as the dominant learning theory guiding program design, often supported by hands-on activities and project-based learning methods. This systematic review advances understanding of the conceptual drivers shaping AI literacy programs for children and youth. The findings highlight the need for stronger conceptualizations of sociocultural and critical AI literacies and for their more balanced integration into educational programs. Addressing these gaps would better support broad motivations for teaching AI to children and youth, such as fostering social and ethical understanding and agency, and guide future research towards more comprehensive and critically informed frameworks.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100491"},"PeriodicalIF":0.0,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145465709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The promise and limits of LLMs in constructing proofs and hints for logic problems in intelligent tutoring systems
Pub Date: 2025-10-28 | DOI: 10.1016/j.caeai.2025.100490
Sutapa Dey Tithi, Arun Kumar Ramesh, Clara DiMarco, Xiaoyi Tian, Nazia Alam, Kimia Fazeli, Tiffany Barnes
Intelligent tutoring systems have demonstrated effectiveness in teaching formal propositional logic proofs, but their reliance on template-based explanations limits their ability to provide personalized student feedback. While large language models (LLMs) offer promising capabilities for dynamic feedback generation, they risk producing hallucinations or pedagogically unsound explanations. We evaluated the stepwise accuracy of LLMs in constructing multi-step symbolic logic proofs, comparing six prompting techniques across four state-of-the-art LLMs on 358 propositional logic problems. Results show that DeepSeek-V3 achieved superior performance with up to 86.7 % accuracy on stepwise proof construction and excelled particularly in simpler rules. We further used the best-performing LLM to generate explanatory hints for 1050 unique student problem-solving states from a logic ITS and evaluated them on four criteria with both an LLM grader and human expert ratings on a 20 % sample. Our analysis finds that LLM-generated hints were 75 % accurate and rated highly by human evaluators on consistency and clarity, but did not perform as well in explaining why the hint was provided or its larger context. Our results demonstrate that LLMs may be used to augment tutoring systems with logic tutoring hints, but those hints require additional modifications to ensure accuracy and pedagogical appropriateness.
{"title":"The promise and limits of LLMs in constructing proofs and hints for logic problems in intelligent tutoring systems","authors":"Sutapa Dey Tithi, Arun Kumar Ramesh, Clara DiMarco, Xiaoyi Tian, Nazia Alam, Kimia Fazeli, Tiffany Barnes","doi":"10.1016/j.caeai.2025.100490","DOIUrl":"10.1016/j.caeai.2025.100490","url":null,"abstract":"<div><div>Intelligent tutoring systems have demonstrated effectiveness in teaching formal propositional logic proofs, but their reliance on template-based explanations limits their ability to provide personalized student feedback. While large language models (LLMs) offer promising capabilities for dynamic feedback generation, they risk producing hallucinations or pedagogically unsound explanations. We evaluated the stepwise accuracy of LLMs in constructing multi-step symbolic logic proofs, comparing six prompting techniques across four state-of-the-art LLMs on 358 propositional logic problems. Results show that DeepSeek-V3 achieved superior performance with upto 86.7 % accuracy on stepwise proof construction and excelled particularly in simpler rules. We further used the best-performing LLM to generate explanatory hints for 1050 unique student problem-solving states from a logic ITS and evaluated them on 4 criteria with both an LLM grader and human expert ratings on a 20 % sample. Our analysis finds that LLM-generated hints were 75 % accurate and rated highly by human evaluators on consistency and clarity, but did not perform as well in explaining why the hint was provided or its larger context. Our results demonstrate that LLMs may be used to augment tutoring systems with logic tutoring hints, but those hints require additional modifications to ensure accuracy and pedagogical appropriateness.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100490"},"PeriodicalIF":0.0,"publicationDate":"2025-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145519441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How AI literacy correlates with affective, behavioral, cognitive and contextual variables: A systematic review
Pub Date: 2025-10-28 | DOI: 10.1016/j.caeai.2025.100493
Arne Bewersdorff , Claudia Nerdel , Xiaoming Zhai
This systematic review maps the empirical landscape of AI literacy by examining its correlations with a diverse array of affective, behavioral, cognitive and contextual variables. Building on the review of AI literacy scales by Lintner (2024), we analyzed 31 empirical studies that applied six of those AI literacy scales, covering 14 countries and a range of participant groups. Our findings reveal robust correlations of AI literacy with AI self-efficacy, positive AI attitudes, motivation, and digital competencies, and negative correlations with AI anxiety and negative AI attitudes. Personal factors such as age appear largely uncorrelated with AI literacy. The review reveals measurement challenges regarding AI literacy: discrepancies between self-assessment scales and performance-based tests suggest that metacognitive biases like the Dunning–Kruger effect may inflate certain correlations with self-assessment AI literacy scales. Despite these challenges, the robust findings provide a solid foundation for future research.
{"title":"How AI literacy correlates with affective, behavioral, cognitive and contextual variables: A systematic review","authors":"Arne Bewersdorff , Claudia Nerdel , Xiaoming Zhai","doi":"10.1016/j.caeai.2025.100493","DOIUrl":"10.1016/j.caeai.2025.100493","url":null,"abstract":"<div><div>This systematic review maps the empirical landscape of AI literacy by examining its correlations with a diverse array of affective, behavioral, cognitive and contextual variables. Building on the review of AI literacy scales by Lintner (2024), we analyzed 31 empirical studies that applied six of those AI literacy scales, covering 14 countries and a range of participant groups. Our findings reveal robust correlations of AI literacy with AI self-efficacy, positive AI attitudes, motivation, and digital competencies, and negative correlations with AI anxiety and negative AI attitudes. Personal factors such as age appear largely uncorrelated with AI literacy. The review reveals measurement challenges regarding AI literacy: discrepancies between self-assessment scales and performance-based tests suggest that metacognitive biases like the Dunning Kruger effect may inflate certain correlations with self-assessment AI literacy scales. Despite these challenges, the robust findings provide a solid foundation for future research.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100493"},"PeriodicalIF":0.0,"publicationDate":"2025-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145415151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unmasking the impacts of self-evaluation in AI-supported writing instruction on EFL learners’ emotion regulation, self-competence, motivation, and writing achievement
Pub Date: 2025-10-28 | DOI: 10.1016/j.caeai.2025.100494
Tahereh Heydarnejad
This study explores the impact of embedding self-evaluation within AI-supported writing instruction on learners’ cognitive emotion regulation, self-competence, motivation, and writing achievement. Conducted at a high school in Iran, the research utilized a quantitative quasi-experimental pretest-posttest design involving two intact pre-intermediate writing classes randomly assigned to an experimental group and a control group. The experimental group received instruction that combined AI tools with structured self-evaluation activities, whereas the control group followed a traditional teaching approach without AI integration or self-evaluation. Data were collected using the Cognitive Emotion Regulation Questionnaire, the Self-Competence Scale, the Academic Motivation Scale, and standardized writing assessments. Statistical analyses, including Chi-square tests and t-tests, indicated that the experimental group significantly outperformed the control group across all measured variables, demonstrating improvements in cognitive emotion regulation, self-competence, motivation, and writing achievement. These results underscore the value of integrating self-evaluation practices alongside AI tools to enhance learner outcomes in EFL writing contexts.
{"title":"Unmasking the impacts of self-evaluation in AI-supported writing instruction on EFL learners’ emotion regulation, self-competence, motivation, and writing achievement","authors":"Tahereh Heydarnejad","doi":"10.1016/j.caeai.2025.100494","DOIUrl":"10.1016/j.caeai.2025.100494","url":null,"abstract":"<div><div>This study explores the impact of embedding self-evaluation within AI-supported writing instruction on learners’ cognitive emotion regulation, self-competence, motivation, and writing achievement. Conducted at a high school in Iran, the research utilized a quantitative quasi-experimental pretest-posttest design involving two intact pre-intermediate writing classes randomly assigned to an experimental group and a control group. The experimental group received instruction that combined AI tools with structured self-evaluation activities, whereas the control group followed a traditional teaching approach without AI integration or self-evaluation. Data were collected using the Cognitive Emotion Regulation Questionnaire, the Self-Competence Scale, the Academic Motivation Scale, and standardized writing assessments. Statistical analyses, including Chi-square tests and t-tests, indicated that the experimental group significantly outperformed the control group across all measured variables, demonstrating improvements in cognitive emotion regulation, self-competence, motivation, and writing achievement. These results underscore the value of integrating self-evaluation practices alongside AI tools to enhance learner outcomes in EFL writing contexts.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100494"},"PeriodicalIF":0.0,"publicationDate":"2025-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145415153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimization method for academic English content based on generative adversarial networks and data augmentation
Pub Date: 2025-10-24 | DOI: 10.1016/j.caeai.2025.100492
Hui Gao
With the globalization of academic exchanges, the importance of academic English writing quality has become increasingly prominent. Especially for non-native speakers, grammar and language quality in academic English writing significantly affect the readability and academic value of articles. Therefore, this study proposes an academic English content optimization method based on generative adversarial networks and data augmentation. The method uses a Transformer as the generator, combines generative adversarial networks with data augmentation techniques to generate high-quality pseudo error correction sentence pairs, and optimizes model performance through policy gradient methods. Although academic English is used as the application context in this study, the architecture can be adapted to other English writing genres given appropriate training corpora. In the results, when training reached 500 iterations, the precision was 0.98 and the recall was 0.10. The accuracy-2, F1 score, mean absolute error, correlation coefficient index, and accuracy-7 values of the proposed academic English content optimization model were 87.8, 89.2, 0.05, 0.69, and 97.6, respectively. The proposed model achieves higher accuracy and efficiency on multiple datasets and can effectively correct various types of English grammar errors, providing new solutions for content optimization in academic English writing.
{"title":"Optimization method for academic English content based on generative adversarial networks and data augmentation","authors":"Hui Gao","doi":"10.1016/j.caeai.2025.100492","DOIUrl":"10.1016/j.caeai.2025.100492","url":null,"abstract":"<div><div>With the globalization of academic exchanges, the importance of academic English writing quality has become increasingly prominent. Especially for non-native speakers, grammar and language quality in academic English writing significantly affect the readability and academic value of articles. Therefore, this study proposes an academic English content optimization method based on generative adversarial networks and data augmentation. The method uses Transformer as the generator, combines generative adversarial networks with data augmentation techniques to generate high-quality pseudo error correction sentence pairs, and optimizes model performance through policy gradient methods. Although academic English is used as the application context in this study, the architecture can be adapted to other English writing genres given appropriate training corpora. From the results, when the iteration reached 500, the precision was 0.98 and the recall was 0.10. The accuracy-2, F1 score, mean absolute error, correlation coefficient index, and accuracy-7 values of the proposed academic English content optimization model were 87.8, 89.2, 0.05, 0.69, and 97.6. The proposed model has higher accuracy and efficiency on multiple datasets, which can effectively optimize various types of English grammar errors, providing new solutions for content optimization in academic English writing.</div></div>","PeriodicalId":34469,"journal":{"name":"Computers and Education Artificial Intelligence","volume":"9 ","pages":"Article 100492"},"PeriodicalIF":0.0,"publicationDate":"2025-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145415150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}