Leveraging Generative AI Agent to Promote Teaching Reflection in a K–12 AI Course: Effects on Teachers’ Reflection Self-Efficacy, Instructional Design, and Reflective Thinking
Pub Date : 2026-02-25 DOI: 10.1109/TLT.2026.3668051
Xiaoming Cao;Shiting Xu;Xinyue Chen;Yujie Song;Qiurui Luo;Tao He
As generative artificial intelligence (GenAI) rapidly evolves, K–12 education is introducing artificial intelligence (AI) courses whose interdisciplinary complexity exposes teachers’ limited readiness. Research suggests that teachers can harness GenAI tools not only to enhance instruction but also to scaffold reflection that drives further teaching improvement. This exploratory quasi-experimental study therefore developed a customized GenAI agent system to support teachers’ reflection on AI-course teaching and to enhance their reflection self-efficacy, instructional design, and reflective thinking. A total of 60 in-service teachers were recruited and divided into two experimental groups, the self-reflection group (SRG) and the peer-reflection group (PRG), both supported by the customized GenAI agent system, and one control group (CG) using conventional technology-based reflection, for a four-week reflective practice experiment in an AI course. The results revealed that the GenAI-agent-supported reflection approach significantly promoted the self-reflection efficacy of teachers in both experimental groups compared with CG teachers, although no significant difference in self-reflection efficacy was found between the SRG and PRG. Moreover, SRG and PRG teachers also outperformed CG teachers in mutual-reflection self-efficacy. Furthermore, the GenAI approach significantly boosted SRG and PRG teachers’ instructional design reflection on methods and behavior, whereas no significant differences were observed in instructional objectives or content among the three groups. To further explore the effects of GenAI agent support, epistemic network analysis was applied to the coded results of teachers’ reflective journals. The findings indicated that SRG and PRG teachers demonstrated broader and higher-order reflective thinking, integrating more dialogic and critical elements, whereas CG networks were predominantly descriptive. Overall, the study confirms that a customized GenAI agent can effectively deepen reflective thinking and practice, offering new insights into fostering teachers’ professional development within K–12 AI education.
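For readers unfamiliar with epistemic network analysis, the sketch below illustrates the code co-occurrence accumulation at its core: each coded journal segment contributes pairwise co-occurrences of reflection codes, and the normalized vectors can then be compared across groups. The code labels and data are hypothetical, and the study itself used dedicated ENA tooling rather than this hand-rolled version.

```python
from itertools import combinations
from collections import Counter

# Illustrative reflection codes (hypothetical labels, not the study's codebook).
CODES = ["descriptive", "dialogic", "critical", "methods", "behavior"]

def cooccurrence_vector(journal_entries):
    """Accumulate pairwise code co-occurrences over a teacher's journal entries.

    journal_entries: list of sets of codes, one set per coded journal segment.
    Returns a normalized vector over all unordered code pairs.
    """
    pairs = list(combinations(sorted(CODES), 2))
    counts = Counter()
    for segment_codes in journal_entries:
        for a, b in combinations(sorted(segment_codes), 2):
            counts[(a, b)] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[p] / total for p in pairs]

# Example: an SRG-style teacher mixing dialogic/critical elements vs. a
# CG-style teacher whose segments stay mostly descriptive.
srg_teacher = [{"descriptive", "dialogic"}, {"dialogic", "critical"}, {"critical", "methods"}]
cg_teacher = [{"descriptive"}, {"descriptive", "methods"}, {"descriptive"}]
print(cooccurrence_vector(srg_teacher))
print(cooccurrence_vector(cg_teacher))
```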
{"title":"Leveraging Generative AI Agent to Promote Teaching Reflection in a K–12 AI Course: Effects on Teachers’ Reflection Self-Efficacy, Instructional Design, and Reflective Thinking","authors":"Xiaoming Cao;Shiting Xu;Xinyue Chen;Yujie Song;Qiurui Luo;Tao He","doi":"10.1109/TLT.2026.3668051","DOIUrl":"https://doi.org/10.1109/TLT.2026.3668051","url":null,"abstract":"As generative artificial intelligence (GenAI) rapidly evolves, K–12 education is introducing artificial intelligence (AI) courses whose interdisciplinary complexity exposes teachers’ limited readiness. Research suggests that teachers can harness GenAI tools not only to enhance instruction but also to scaffold reflection that drives further teaching improvement. Therefore, this exploratory quasi-experimental study developed a customized GenAI agent system to support teachers’ reflection on AI-course teaching and to enhance reflection self-efficacy, instructional design, and reflective thinking. A total of 60 in-service teachers were recruited and divided into two experimental groups, the self-reflection group (SRG) and the peer-reflection group (PRG), both supported by the customized GenAI agent system, and one control group (CG) with conventional technology-based reflection for a four-week AI course reflective practice experiment. The results revealed that the GenAI-agent-supported reflection approach could significantly promote teachers’ self-reflection efficacy of two experimental groups compared with CG teachers. However, no significant differences between the SRG and PRG teachers’ self-reflections efficacy could be found. Moreover, SRG and PRG teachers also outperformed the CG teachers on mutual-reflection self-efficacy. Furthermore, GenAI approach could significantly boost SRG and PRG teachers’ instructional design reflection on methods and behavior; however, no significant differences were observed in instructional objectives or content among the three groups. To further explore the effects of GenAI agent support, epistemic network analysis was applied to examine the coded results of teachers’ reflective journals. The findings indicated that SRG and PRG teachers demonstrated broader and higher order reflective thinking, integrating more dialogic and critical elements, whereas CG networks were predominantly descriptive. Overall, the study confirms that a customized GenAI agent can effectively deepen reflective thinking and practice, offering new insights into fostering teachers’ professional development within K–12 AI education.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"127-140"},"PeriodicalIF":4.9,"publicationDate":"2026-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing Aesthetic Perception Through Human–AI Teaming for Subtle Dimension Identification in Art Annotation
Pub Date : 2026-02-12 DOI: 10.1109/TLT.2026.3664309
Mo Wang;Ye Zhang;Jinlong He;Yupeng Zhou;Niantong Li;Jianan Wang;Yifei Sun;Minghao Yin
Aesthetic perception, the cognitive process through which individuals interpret and evaluate the expressive and emotional qualities of visual art, is fundamental to students' creative and emotional development. Recent progress in artificial intelligence has enabled computational models to assist in aesthetic analysis by identifying patterns in visual composition and affective expression. However, such models often struggle to recognize abstract or context-dependent aesthetic dimensions, and improving these aspects through comprehensive annotation remains costly and time-consuming. This study presents a human–AI teaming framework designed to identify the aesthetic perception dimensions that AI models find most difficult to interpret and to allocate these dimensions to human experts for annotation. The framework employs a multiagent reinforcement learning mechanism, where each agent is assigned to a specific aesthetic dimension and learns a policy for determining whether expert annotation is required. Two complementary state representation strategies are introduced: a statistical representation that captures the model's predictive distribution across dimensions, and a graph-based attention module that models interdependencies among aesthetic attributes. A reward mechanism further guides agents to balance the improvement of model perception with the minimization of human annotation effort. Experiments conducted on two real-world datasets demonstrate that the proposed framework effectively identifies the challenging dimensions for AI models and strategically delegates them for human evaluation. This targeted collaboration significantly enhances annotation efficiency and model interpretability, providing a scalable approach for improving human–AI synergy in aesthetic perception analysis.
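As a concrete illustration of the delegation policy described above, here is a minimal sketch of a single dimension's agent as an epsilon-greedy bandit whose reward trades accuracy gain against annotation cost. All names and numbers are hypothetical; the paper's multiagent setup additionally uses richer state representations (predictive distributions and a graph-based attention module).

```python
import random

class DimensionAgent:
    """Toy epsilon-greedy agent for one aesthetic dimension.

    Action 0 = trust the model's label, action 1 = delegate to a human expert.
    Schematic stand-in for the paper's multiagent RL policy, with a
    hypothetical reward of (accuracy gain) - (annotation cost) for delegation.
    """
    def __init__(self, epsilon=0.1, lr=0.1):
        self.epsilon, self.lr = epsilon, lr
        self.q = [0.0, 0.0]  # value estimates for the two actions

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(2)
        return max(range(2), key=lambda a: self.q[a])

    def update(self, action, reward):
        self.q[action] += self.lr * (reward - self.q[action])

# Hypothetical training loop: delegation pays off on a "subtle" dimension
# where the model alone is only right 55% of the time.
ANNOTATION_COST = 0.3
agent = DimensionAgent()
for _ in range(2000):
    a = agent.act()
    model_correct = random.random() < 0.55
    gain = 1.0 if a == 1 else (1.0 if model_correct else 0.0)
    reward = gain - (ANNOTATION_COST if a == 1 else 0.0)
    agent.update(a, reward)
print(agent.q)  # delegation (index 1) should dominate: ~0.7 vs ~0.55
```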
{"title":"Optimizing Aesthetic Perception Through Human–AI Teaming for Subtle Dimension Identification in Art Annotation","authors":"Mo Wang;Ye Zhang;Jinlong He;Yupeng Zhou;Niantong Li;Jianan Wang;Yifei Sun;Minghao Yin","doi":"10.1109/TLT.2026.3664309","DOIUrl":"https://doi.org/10.1109/TLT.2026.3664309","url":null,"abstract":"Aesthetic perception, the cognitive process through which individuals interpret and evaluate the expressive and emotional qualities of visual art, is fundamental to students' creative and emotional development. Recent progress in artificial intelligence has enabled computational models to assist in aesthetic analysis by identifying patterns in visual composition and affective expression. However, such models often struggle to recognize abstract or context-dependent aesthetic dimensions, and improving these aspects through comprehensive annotation remains costly and time-consuming. This study presents a human–AI teaming framework designed to identify the aesthetic perception dimensions that AI models find most difficult to interpret and to allocate these dimensions to human experts for annotation. The framework employs a multiagent reinforcement learning mechanism, where each agent is assigned to a specific aesthetic dimension and learns a policy for determining whether expert annotation is required. Two complementary state representation strategies are introduced: a statistical representation that captures the model's predictive distribution across dimensions, and a graph-based attention module that models interdependencies among aesthetic attributes. A reward mechanism further guides agents to balance the improvement of model perception with the minimization of human annotation effort. Experiments conducted on two real-world datasets demonstrate that the proposed framework effectively identifies the challenging dimensions for AI models and strategically delegates them for human evaluation. This targeted collaboration significantly enhances annotation efficiency and model interpretability, providing a scalable approach for improving human–AI synergy in aesthetic perception analysis.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"105-116"},"PeriodicalIF":4.9,"publicationDate":"2026-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing Automated Text Coding in Online Learning Research: A Systematic Calibration Framework for Large Language Models
Pub Date : 2026-02-12 DOI: 10.1109/TLT.2026.3661363
Xiaojie Niu;Jingjing Zhang
Automated coding is particularly crucial in online learning research, where vast amounts of text provide valuable insights into cognitive engagement, emotional expression, and social interaction, yet manual analysis of such large-scale discourse remains time-consuming. While large language models (LLMs) offer promising solutions, existing approaches suffer from inconsistent coding performance and suboptimal prompt engineering, limiting their reliability across diverse educational frameworks. This study introduces the optimized LLM coding (OLLM-C) framework, a systematic approach that enhances automated content analysis through calibrated prompting strategies. We evaluated OLLM-C on a comprehensive dataset of 8671 comments collected from five online courses. The framework was validated against 15 widely used coding frameworks spanning cognitive, social-emotional, and behavioral dimensions. Comparative analysis of five LLMs [generative pre-trained transformer (GPT)-3.5, GPT-4o, Gemini, Claude, and Llama] revealed GPT-4o’s superior performance in initial coding tasks. The OLLM-C calibration process significantly improved GPT-4o’s reliability, with Cohen’s kappa coefficients increasing from an average of 0.45 to 0.57 across frameworks. Notably, the model demonstrated stronger performance in social-emotional coding than in cognitive skills frameworks, achieving substantial agreement with human coders in emotion recognition and social interaction analysis while showing limitations in complex cognitive reasoning tasks. These findings establish OLLM-C as a systematic calibration framework that enhances the reliability, efficiency, and practical applicability of LLM-assisted qualitative analysis.
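The calibration loop reduces to a simple measure-and-refine cycle around Cohen's kappa. The sketch below shows that cycle under the assumption of a hypothetical code_with_llm(prompt, comment) wrapper around the model API; the paper's actual prompting strategies are more elaborate than picking the best of a fixed variant list.

```python
from sklearn.metrics import cohen_kappa_score

def evaluate_prompt(code_with_llm, prompt, comments, human_labels):
    """Measure LLM-human agreement for one coding framework.

    code_with_llm(prompt, comment) -> category label is a hypothetical
    wrapper around the model API.
    """
    llm_labels = [code_with_llm(prompt, c) for c in comments]
    return cohen_kappa_score(human_labels, llm_labels)

def calibrate(code_with_llm, prompt_variants, comments, human_labels):
    """Keep the prompt variant (e.g., with added code definitions and
    counterexamples) that yields the highest kappa on a labeled subset."""
    best_prompt, best_kappa = None, -1.0
    for prompt in prompt_variants:
        kappa = evaluate_prompt(code_with_llm, prompt, comments, human_labels)
        if kappa > best_kappa:
            best_prompt, best_kappa = prompt, kappa
    return best_prompt, best_kappa
```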
{"title":"Enhancing Automated Text Coding in Online Learning Research: A Systematic Calibration Framework for Large Language Models","authors":"Xiaojie Niu;Jingjing Zhang","doi":"10.1109/TLT.2026.3661363","DOIUrl":"https://doi.org/10.1109/TLT.2026.3661363","url":null,"abstract":"Automated coding is particularly crucial in online learning research, where vast amounts of text provide valuable insights into cognitive engagement, emotional expression, and social interaction, yet manual analysis of such large-scale discourse remains time-consuming. While large language models (LLMs) offer promising solutions, existing approaches suffer from inconsistent coding performance and suboptimal prompt engineering, limiting their reliability across diverse educational frameworks. This study introduces the optimized LLM coding (OLLM-C) framework, a systematic approach that enhances automated content analysis through calibrated prompting strategies. We evaluated OLLM-C using a comprehensive dataset of 8671 comments collected from five online courses. The framework was validated against 15 widely used coding frameworks spanning cognitive, social-emotional, and behavioral dimensions. Comparative analysis of five LLMs [generative pre-trained transformer (GPT)-3.5, GPT-4o, Gemini, Claude, and Llama] revealed GPT-4o’s superior performance in initial coding tasks. The OLLM-C calibration process significantly improved GPT-4o’s reliability, with Cohen’s kappa coefficients increasing from an average of 0.45–0.57 across frameworks. Notably, the model demonstrated stronger performance in social-emotional coding compared to cognitive skills frameworks, achieving substantial agreement with human coders in emotion recognition and social interaction analysis while showing limitations in complex cognitive reasoning tasks. These findings establish OLLM-C as a systematic calibration framework that enhances the reliability, efficiency, and practical applicability of LLM-assisted qualitative analysis.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"63-74"},"PeriodicalIF":4.9,"publicationDate":"2026-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LLM-Simulated Nonequivalent Groups With Anchor Test: A Novel Approach for Test Equating in the Absence of Traditional Anchor Items
Pub Date : 2026-02-04 DOI: 10.1109/TLT.2026.3661154
Junlei Du;Yishen Song;Qinhua Zheng
Nonanchor equating presents a significant challenge in educational assessment when test forms lack common items, requiring innovative solutions to ensure score comparability across different test administrations. This study proposes a novel large language model-simulated nonequivalent groups with anchor test (LLM-SNGAT) method that leverages large language models (LLMs) to simulate test-taking samples and generate common item sets for equating purposes. The approach eliminates traditional dependencies on specialized test design and extensive demographic data collection by utilizing the inherent capabilities of LLMs to simulate diverse response patterns. We evaluated the method using Tucker and Levine equating approaches across multiple LLMs, including generative pre-trained transformer 4o (GPT-4o), O1-preview, and DeepSeek-R1. Results demonstrated the feasibility of the proposed approach, with the Tucker method showing superior performance and consistent improvements as common item coverage increased. Sensitivity analysis confirmed that model performance rankings remained consistent across varying prompt formulations. The study revealed the characteristic pattern that standard errors were smallest near the mean and grew larger farther from it, and identified optimal common item proportions of 30%–50% for stable equating performance. While the approach is currently limited by the ability of LLMs to accurately simulate human cognitive and behavioral diversity, this proof-of-concept study provides preliminary evidence for the feasibility of the LLM-SNGAT methodology. The approach represents a paradigm shift from resource-intensive traditional methods to computationally driven solutions, offering promising prospects for addressing nonanchor equating challenges in the digital age.
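For reference, the Tucker method used in the evaluation has a standard closed form: synthetic-population means and variances followed by linear equating (as in Kolen and Brennan's treatment). The sketch below follows those textbook formulas; in the LLM-SNGAT setting, the anchor scores v1 and v2 would come from the LLM-generated common item set.

```python
import numpy as np

def tucker_equate(x, v1, y, v2, w1=0.5):
    """Tucker linear equating of form X onto form Y via anchor V.

    x, v1: group 1 scores on form X and the anchor; y, v2: group 2 scores
    on form Y and the anchor; w1: synthetic-population weight of group 1.
    Returns a function mapping X scores onto the Y scale.
    """
    w2 = 1.0 - w1
    g1 = np.cov(x, v1)[0, 1] / np.var(v1, ddof=1)   # regression slope of X on V
    g2 = np.cov(y, v2)[0, 1] / np.var(v2, ddof=1)   # regression slope of Y on V
    dmu = np.mean(v1) - np.mean(v2)
    dvar = np.var(v1, ddof=1) - np.var(v2, ddof=1)

    # Synthetic-population moments.
    mu_x = np.mean(x) - w2 * g1 * dmu
    mu_y = np.mean(y) + w1 * g2 * dmu
    var_x = np.var(x, ddof=1) - w2 * g1**2 * dvar + w1 * w2 * g1**2 * dmu**2
    var_y = np.var(y, ddof=1) + w1 * g2**2 * dvar + w1 * w2 * g2**2 * dmu**2

    slope = np.sqrt(var_y / var_x)
    return lambda score: mu_y + slope * (score - mu_x)
```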
{"title":"LLM-Simulated Nonequivalent Groups With Anchor Test: A Novel Approach for Test Equating in the Absence of Traditional Anchor Items","authors":"Junlei Du;Yishen Song;Qinhua Zheng","doi":"10.1109/TLT.2026.3661154","DOIUrl":"https://doi.org/10.1109/TLT.2026.3661154","url":null,"abstract":"Nonanchor equating presents a significant challenge in educational assessment when test forms lack common items, requiring innovative solutions to ensure score comparability across different test administrations. This study proposes a novel large language model-simulated nonequivalent groups with anchor test (LLM-SNGAT) method that leverages large language models (LLMs) to simulate test-taking samples and generate common item sets for equating purposes. The approach eliminates traditional dependencies on specialized test design and extensive demographic data collection by utilizing the inherent capabilities of LLMs to simulate diverse response patterns. We evaluated the method using Tucker and Levine equating approaches across multiple LLMs, including generative pre-trained transformer 4o (GPT-4o), O1-preview, and DeepSeek-R1. Results demonstrated the feasibility of the proposed approach, with the Tucker method showing superior performance and consistent improvements as common item coverage increased. Sensitivity analysis confirmed that model performance rankings remained consistent across varying prompt formulations. The study revealed characteristic that standard errors were smallest near the mean and became larger farther away from the mean, and identified optimal common item proportions of 30%–50% for stable equating performance. While current limitations include the capacity of LLMs to accurately simulate human cognitive and behavioral diversity, this proof-of-concept study provides preliminary evidence for the feasibility of the LLM-SNGAT methodology. The approach represents a paradigm shift from resource-intensive traditional methods to computationally driven solutions, offering promising prospects for addressing nonanchor equating challenges in the digital age.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"75-86"},"PeriodicalIF":4.9,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrating Blended Learning Behaviors via Multimodal Fusion for Student Performance Prediction
Pub Date : 2026-02-03 DOI: 10.1109/TLT.2026.3660059
Han Wan;Shiyang Yue;Mengying Li;Xin Luo;Yaofeng Hu;Baoliang Che;Jingyuan Wang
Blended learning enriches students’ experiences across diverse environments while generating multimodal data related to learning activities. However, it presents challenges in the appropriate use of multimodal data to track students’ performance development. Previous models with fixed-length inputs or static fusion mechanisms inadequately model temporal dependencies across behavioral modalities. In this article, we integrate variable-length time series over weeks to forecast performance for the subsequent week. As the main contribution, we propose a two-stage training model that relies on a transformer for temporal attention-based multimodal fusion. We conducted experiments on two real-world datasets, FC2023 and CS2023, derived from hybrid-mode courses involving 439 and 199 students, respectively. The results demonstrate that multimodal fusion yields more accurate weekly prediction than the unimodal approach. Ultimately, in predicting the week-by-week development of student performance, the proposed model achieves an area under the receiver operating characteristic curve of 81.02% on FC2023 and 82.65% on CS2023. This approach, which leverages multimodal learning analytics, helps educators track each student’s learning progress more effectively, enabling the timely implementation of instructional interventions and enhancing educational outcomes.
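A minimal sketch of the padding-masked temporal-attention fusion idea appears below. It is not the authors' two-stage model: the modality names, feature dimensions, and simple sum-fusion are assumptions chosen only to show how variable-length weekly histories can be handled in a single transformer encoder.

```python
import torch
import torch.nn as nn

class WeeklyFusionTransformer(nn.Module):
    """Sketch: project each modality's weekly features, sum them into one
    token per week, and run a padding-masked transformer encoder so that
    students with different numbers of observed weeks share one batch."""
    def __init__(self, dims, d_model=32):
        super().__init__()
        self.proj = nn.ModuleDict({m: nn.Linear(d, d_model) for m, d in dims.items()})
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)  # P(at-risk next week)

    def forward(self, feats, lengths):
        # feats: dict of (batch, weeks, dim) tensors; lengths: (batch,) valid weeks
        x = sum(self.proj[m](t) for m, t in feats.items())
        weeks = x.size(1)
        pad_mask = torch.arange(weeks)[None, :] >= lengths[:, None]  # True = padding
        h = self.encoder(x, src_key_padding_mask=pad_mask)
        last = h[torch.arange(x.size(0)), lengths - 1]  # final valid week per student
        return torch.sigmoid(self.head(last)).squeeze(-1)

# Hypothetical modalities: clickstream, video-watching, and forum features.
dims = {"clicks": 8, "video": 6, "forum": 4}
model = WeeklyFusionTransformer(dims)
feats = {m: torch.randn(2, 5, d) for m, d in dims.items()}
print(model(feats, torch.tensor([5, 3])))  # one risk score per student
```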
{"title":"Integrating Blended Learning Behaviors via Multimodal Fusion for Student Performance Prediction","authors":"Han Wan;Shiyang Yue;Mengying Li;Xin Luo;Yaofeng Hu;Baoliang Che;Jingyuan Wang","doi":"10.1109/TLT.2026.3660059","DOIUrl":"https://doi.org/10.1109/TLT.2026.3660059","url":null,"abstract":"Blended learning enriches students’ experiences across diverse environments while generating multimodal data related to learning activities. However, it presents challenges in the appropriate use of multimodal data to track students’ performance development. Previous models with fixed-length inputs or static fusion mechanisms inadequately model temporal dependencies across behavioral modalities. In this article, we integrate variable-length time series over weeks to forecast performance for the subsequent week. As the main contribution, we propose a two-stage training model that relies on a transformer for temporal attention-based multimodal fusion. We conducted experiments on two real-world datasets, FC2023 and CS2023, derived from hybrid mode courses involving 439 and 199 students, respectively. The results demonstrate that multimodal fusion yields better periodical prediction compared to the unimodal approach. Ultimately, aiming at predicting the week-by-week development of student performance, the proposed model achieves the area under the curve of receiver operating characteristic of 81.02% on FC2023 and 82.65% on CS2023. This approach, which leverages multimodal learning analytics, helps educators track each student’s learning progress more effectively, enabling the timely implementation of instructional interventions and enhancing educational outcomes.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"87-104"},"PeriodicalIF":4.9,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Job-Seeking Assistance System by Integrating Interview Training With Job Recommendation
Pub Date : 2026-01-30 DOI: 10.1109/TLT.2026.3659507
Junmei Feng;Yaomin Zhao;Xiongwen Zhong;Lin Zhang;Xianlin Peng;Qiguang Miao
University graduates often encounter challenges during their initial job search, such as interview anxiety and insufficient interview experience. To address these issues, this article designs a job-seeking assistance system that integrates interview training with personalized job recommendation. The system employs virtual reality technology and a large language model to create immersive interview scenarios for experiential learning. This approach provides a realistic interview simulation environment that facilitates the accumulation of interview experience and helps alleviate interview anxiety. Based on the interview dialogue records, we further propose a novel job recommendation model with a dual-view-enhanced hybrid expert module to improve recommendation accuracy. Experimental results on a real-world dataset demonstrate the superior performance of the proposed model. After the interview, the system automatically generates interview analysis reports, including a competency analysis of the user, an evaluation of the target position, and a personalized job recommendation. Although the system is developed and evaluated within the context of the Chinese job market, it offers potential for extension to other cultural and labor market settings.
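The dual-view-enhanced hybrid expert module is not specified in detail in the abstract; the sketch below is one hypothetical reading of it as a two-expert mixture with a learned gate over a user view (derived from interview dialogue) and a job-description view.

```python
import torch
import torch.nn as nn

class DualViewHybridExpert(nn.Module):
    """Hypothetical two-expert gate: one expert scores a user-job pair from
    the interview-dialogue view, one from the job-description view, and a
    learned softmax gate mixes them into a single match score."""
    def __init__(self, user_dim=64, job_dim=64, hidden=32):
        super().__init__()
        pair_dim = user_dim + job_dim
        self.user_expert = nn.Sequential(nn.Linear(pair_dim, hidden),
                                         nn.ReLU(), nn.Linear(hidden, 1))
        self.job_expert = nn.Sequential(nn.Linear(pair_dim, hidden),
                                        nn.ReLU(), nn.Linear(hidden, 1))
        self.gate = nn.Sequential(nn.Linear(pair_dim, 2), nn.Softmax(dim=-1))

    def forward(self, user_vec, job_vec):
        pair = torch.cat([user_vec, job_vec], dim=-1)
        scores = torch.cat([self.user_expert(pair), self.job_expert(pair)], dim=-1)
        weights = self.gate(pair)
        return (weights * scores).sum(dim=-1)  # higher = better match

model = DualViewHybridExpert()
print(model(torch.randn(4, 64), torch.randn(4, 64)))  # scores for 4 candidate jobs
```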
{"title":"A Job-Seeking Assistance System by Integrating Interview Training With Job Recommendation","authors":"Junmei Feng;Yaomin Zhao;Xiongwen Zhong;Lin Zhang;Xianlin Peng;Qiguang Miao","doi":"10.1109/TLT.2026.3659507","DOIUrl":"https://doi.org/10.1109/TLT.2026.3659507","url":null,"abstract":"University graduates often encounter challenges during their initial job-seeking, such as interview anxiety and insufficient interview experience. To address these issues, this article designs a job-seeking assistance system that integrates interview training and personalized job recommendation. The system employs virtual reality technology and large language model to create immersive interview scenarios for experiential learning. This approach provides a realistic interview simulation environment that facilitates the accumulation of interview experience and helps alleviate interview anxiety. Based on the interview dialogue records, we further propose a novel job recommendation with dual-view-enhanced hybrid expert module to improve recommendation accuracy. Experimental results on a real-world dataset demonstrate the optimal performance of the proposed model. After the interview, the system automatically generates interview analysis reports, including a competency analysis of the user, an evaluation of the target position, and a personalized job recommendation. Although the system is developed and evaluated within the context of the Chinese job market, it offers potential for extension to other cultural and labor market settings.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"51-62"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtual Robots in E-Learning: A Pathway to Enhanced Academic Self-Esteem, Math Performance, and Engagement in Children
Pub Date : 2026-01-29 DOI: 10.1109/TLT.2026.3658939
Devasena Pasupuleti;Hamed Mahzoon;Sakai Kazuki;Hiroshi Ishiguro;Yaswanth Bangi;Rajeevlochana G. Chittawadigi;Yuichiro Yoshikawa
Academic self-esteem (ASE) plays a significant role in children's learning outcomes by influencing their engagement and performance in subjects such as math. Although existing studies emphasize the importance of self-esteem in academic achievement, the potential of interactive technologies, particularly virtual robots, in enhancing ASE remains underexplored. This study investigates the impact of a virtual robot on children's ASE, math performance, concentration, and engagement in an e-learning environment. The study involved an experimental group (n = 12) interacting with a virtual robot integrated into the math e-learning platform, and a control group (n = 12) working in a traditional e-learning setting without robot interaction. The results demonstrated that the experimental group, which interacted with the virtual robot, exhibited significant improvements in math performance, concentration, and engagement across the three experimental sessions (sessions 1–3) compared with the control group, as indicated by both quantitative measures and qualitative feedback from participants. ASE, as well as the quantity and quality of friendships, was assessed pre- and posttest, with findings indicating greater improvements in the experimental group after the intervention. The correlation between improved math performance and higher ASE was moderate to strong. These findings underscore the potential of virtual robots as tools that positively influence the ASE and learning outcomes of children, and highlight their value for future educational settings where such technology could address achievement gaps in mathematics learning through improved self-perception.
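The reported statistics (within-group pre/post gains and the performance-ASE correlation) follow from standard tests, sketched below on illustrative numbers; the study's actual data are not reproduced here.

```python
from scipy import stats

# Hypothetical pre/post ASE scores for the experimental group (n = 12);
# these numbers are illustrative only, not the study's data.
pre = [2.8, 3.1, 2.5, 3.0, 2.9, 2.7, 3.2, 2.6, 3.0, 2.8, 2.9, 3.1]
post = [3.4, 3.6, 3.0, 3.5, 3.3, 3.2, 3.8, 3.1, 3.6, 3.3, 3.4, 3.7]

# Paired t-test for the within-group pre/post ASE gain.
t, p = stats.ttest_rel(post, pre)

# Pearson correlation between math-score gains and ASE gains; the study's
# "moderate to strong" corresponds roughly to r in the 0.4-0.7 range.
math_gain = [5, 8, 2, 7, 6, 4, 9, 3, 7, 5, 6, 8]  # also illustrative
ase_gain = [round(b - a, 2) for a, b in zip(pre, post)]
r, pr = stats.pearsonr(math_gain, ase_gain)
print(f"paired t = {t:.2f} (p = {p:.4f}), r = {r:.2f} (p = {pr:.4f})")
```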
{"title":"Virtual Robots in E-Learning: A Pathway to Enhanced Academic Self-Esteem, Math Performance, and Engagement in Children","authors":"Devasena Pasupuleti;Hamed Mahzoon;Sakai Kazuki;Hiroshi Ishiguro;Yaswanth Bangi;Rajeevlochana G. Chittawadigi;Yuichiro Yoshikawa","doi":"10.1109/TLT.2026.3658939","DOIUrl":"https://doi.org/10.1109/TLT.2026.3658939","url":null,"abstract":"Academic self-esteem (ASE) plays a significant role in children's learning outcomes by influencing their engagement and performance in subjects, such as math. Although existing studies emphasize the importance of self-esteem in academic achievement, the potential of interactive technologies, particularly virtual robots, in enhancing ASE remains underexplored. This study investigates the impact of a virtual robot on children's ASE, math performance, concentration, and engagement in an e-learning environment. The study involved an experimental group (<italic>n</i> = 12) interacting with a virtual robot integrated into the math e-learning platform, and a control group (<italic>n</i> = 12) working in a traditional e-learning setting without robot interaction. The results demonstrated that the experimental group, which interacted with the virtual robot, exhibited significant improvements in math performance, concentration, and engagement across the three experimental sessions (sessions 1–3) compared with the control group, as indicated by both quantitative measures and qualitative feedback from participants. ASE, as well as the quantity and quality of friendships, was assessed pre- and posttest, with findings indicating greater improvements in the experimental group after the intervention. The correlation between improved math performance and higher ASE was moderate to strong. These findings underscore the potential of virtual robots as tools that positively influence the ASE and learning outcomes of children, and highlight their value for future educational settings where such technology could address achievement gaps in mathematics learning through improved self-perception.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"35-50"},"PeriodicalIF":4.9,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11367484","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FC-RAG: Enhancing Football Coaching With Multimodal Retrieval-Augmented Generation
Pub Date : 2026-01-21 DOI: 10.1109/TLT.2026.3656190
Zhiyuan Wang;Hengding Wang;Guoyue Xiong;Ruofei Lin
In recent years, the integration of large language models with retrieval-augmented generation (RAG) has significantly advanced the development of question-answering systems in educational settings. While RAG has shown promising potential in open-domain question handling, it still faces numerous challenges in structured and specialized scenarios, such as football coaching. These challenges include fragmented knowledge, lack of structural awareness, and inability to support multimodal outputs. To address these issues, we propose a RAG-based multimodal intelligent football coaching question-answering framework, referred to as FC-RAG, with three core innovations: 1) graph-guided knowledge base retrieval, leveraging a hierarchical community structure built on a football knowledge graph to organize and aggregate structured knowledge; 2) a fine-grained two-level question-answering mechanism that decomposes complex problems into atomic questions, enhancing retrieval accuracy and answer coherence; and 3) a multimodal answer generation module that combines text, tactical diagrams, and action illustrations to enhance the intuitiveness and interactivity of the teaching process. Based on research in educational question-answering and RAG, this article explores the application potential of structured and multimodal generation technologies in skill-based educational contexts, providing a feasible intelligent design paradigm for sports coaching and other specialized education systems.
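The fine-grained two-level question-answering mechanism can be sketched compactly: decompose the question into atomic sub-questions, answer each against retrieved context, then synthesize. The llm and retrieve callables below are hypothetical stand-ins; the real system adds graph-guided community retrieval and multimodal (diagram and illustration) outputs.

```python
# Minimal sketch of FC-RAG's two-level question answering. llm(prompt) -> str
# and retrieve(query) -> str are hypothetical wrappers, not the paper's API.

def answer(question, llm, retrieve):
    # Level 1: decompose a complex coaching question into atomic sub-questions.
    atoms = llm(
        "Split this football-coaching question into independent atomic "
        f"sub-questions, one per line:\n{question}"
    ).splitlines()

    # Level 2: answer each atom against its retrieved knowledge context
    # (graph-community-guided retrieval in the paper).
    partial = []
    for atom in atoms:
        context = retrieve(atom)
        partial.append(llm(f"Context:\n{context}\n\nAnswer concisely: {atom}"))

    # Synthesis: merge the atomic answers into one coherent coaching response.
    joined = "\n".join(f"- {a}" for a in partial)
    return llm(f"Combine these partial answers into one reply to "
               f"'{question}':\n{joined}")
```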
{"title":"FC-RAG: Enhancing Football Coaching With Multimodal Retrieval-Augmented Generation","authors":"Zhiyuan Wang;Hengding Wang;Guoyue Xiong;Ruofei Lin","doi":"10.1109/TLT.2026.3656190","DOIUrl":"https://doi.org/10.1109/TLT.2026.3656190","url":null,"abstract":"In recent years, the integration of large language models with retrieval-augmented generation (RAG) has significantly advanced the development of question-answering systems in educational settings. While RAG has shown promising potential in open-domain question handling, it still faces numerous challenges in structured and specialized scenarios, such as football coaching. These challenges include fragmented knowledge, lack of structural awareness, and inability to support multimodal outputs. To address these issues, we propose a RAG-based multimodal intelligent football coaching question-answering framework, referred to as FC-RAG, with three core innovations: 1) graph-guided knowledge base retrieval, leveraging a hierarchical community structure built on a football knowledge graph to organize and aggregate structured knowledge; 2) a fine-grained two-level question-answering mechanism that decomposes complex problems into atomic questions, enhancing retrieval accuracy and answer coherence; and 3) a multimodal answer generation module that combines text, tactical diagrams, and action illustrations to enhance the intuitiveness and interactivity of the teaching process. Based on research in educational question-answering and RAG, this article explores the application potential of structured and multimodal generation technologies in skill-based educational contexts, providing a feasible intelligent design paradigm for sports coaching and other specialized education systems.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"117-126"},"PeriodicalIF":4.9,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal Latent Temporal Modeling for Continuous Engagement Assessment in Online Education
Pub Date : 2026-01-21 DOI: 10.1109/TLT.2026.3656606
Congcong Xie;Di Wang;Quan Wang;Xiao Liang;Ruyi Liu;Qiguang Miao
Real-time feedback in online education relies on the automated assessment of student engagement. Current research primarily evaluates engagement by analyzing students’ behavioral performance in classroom videos. Although existing methods consider temporal and multimodal cues, they predominantly rely on visual information and insufficiently model the cognitive rhythm of attention. Moreover, several approaches are limited to engagement classification and fail to offer continuous quantitative assessment. To address these limitations, we propose the M-LATTE method, short for multimodal latent attention trends and time-cyclic engagement modeling. The model extracts features from visual, audio, and textual modalities and employs a cross-modal attention mechanism to achieve effective multimodal fusion, thus avoiding sole reliance on visual information. The fused temporal data are decomposed into long-term trends and time-cyclic fluctuations to capture cognitive rhythm characteristics. A smoothness constraint and a variational lower bound are introduced to suppress transient disturbances, ensuring stable evaluation. Experimental results show significant performance improvements over the baseline: on the RoomReader dataset, the mean squared error decreases from 0.1912 to 0.0969. Furthermore, our method achieves a competitive classification accuracy of 61.37% on the dataset for affective states in e-environments (DAiSEE).
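The trend/time-cyclic decomposition admits a simple moving-average illustration, shown below together with a squared-difference smoothness penalty. This is a simplified stand-in: M-LATTE's actual decomposition, cross-modal attention fusion, and variational objective are richer.

```python
import torch
import torch.nn.functional as F

def decompose(series, kernel=5):
    """Split a fused engagement sequence (batch, time) into a long-term
    trend (moving average) and a time-cyclic residual."""
    pad = kernel // 2
    padded = F.pad(series.unsqueeze(1), (pad, pad), mode="replicate")
    trend = F.avg_pool1d(padded, kernel, stride=1).squeeze(1)
    return trend, series - trend

def smoothness_penalty(trend):
    """Mean squared first difference; penalizes transient jumps in the
    engagement trend (the paper's constraint may differ in exact form)."""
    return ((trend[:, 1:] - trend[:, :-1]) ** 2).mean()

x = torch.randn(2, 20).cumsum(dim=1)  # toy fused engagement signal
trend, cyclic = decompose(x)
print(trend.shape, cyclic.shape, smoothness_penalty(trend).item())
```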
{"title":"Multimodal Latent Temporal Modeling for Continuous Engagement Assessment in Online Education","authors":"Congcong Xie;Di Wang;Quan Wang;Xiao Liang;Ruyi Liu;Qiguang Miao","doi":"10.1109/TLT.2026.3656606","DOIUrl":"https://doi.org/10.1109/TLT.2026.3656606","url":null,"abstract":"Real-time feedback in online education relies on the automated assessment of student engagement. Current research primarily evaluates engagement through the analysis of students’ behavioral performance in classroom videos. Although existing methods consider temporal and multimodal cues, they predominantly rely on visual information and insufficiently model the cognitive rhythm of attention. Moreover, several approaches are limited to engagement classification and fail to offer continuous quantitative assessment. To address these limitations, we propose <italic>M-LATTE</i> method, short for multimodal latent attention trends and time-cyclic engagement modeling. The model extracts features from visual, audio, and textual modalities and employs a cross-modal attention mechanism to achieve effective multimodal fusion, thus avoiding solely relying on visual information. The fused temporal data are decomposed into long-term trends and time-cyclic fluctuations to capture cognitive rhythm characteristics. A smoothness constraint and a variational lower bound are introduced to suppress transient disturbances, ensuring stable evaluation. Experimental results show significant performance improvements over the baseline: on the RoomReader dataset, mean squared error decreases from 0.1912 to 0.0969. Furthermore, our method achieves a competitive classification accuracy of 61.37% on the dataset for affective states in e-environments (DAiSEE) dataset.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"19 ","pages":"21-34"},"PeriodicalIF":4.9,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}