Unsupervised techniques for generating a standard sample self-explanation answer with knowledge components in a math quiz
Ryosuke Nakamoto, B. Flanagan, Yiling Dai, Kyosuke Takami, H. Ogata
Research and Practice in Technology Enhanced Learning. Published 2023-08-16. DOI: 10.58459/rptel.2024.19016 (https://doi.org/10.58459/rptel.2024.19016)
Citations: 2
Abstract
Self-explanation is a widely recognized and effective pedagogical method. Previous research has indicated that self-explanation can be used to evaluate students’ comprehension and identify their areas of difficulty on mathematical quizzes. However, most analytical techniques require pre-labeled materials, which limits the potential for large-scale studies. Meanwhile, using collected self-explanations without supervision is challenging, as little research exists on this topic. Therefore, this study investigates the feasibility of automatically generating a standardized self-explanation sample answer from self-explanations collected without supervision. The proposed model involves preprocessing and three machine learning steps: vectorization, clustering, and extraction. Experiments on 1,434 self-explanation answers from 25 quizzes indicate that 72% of the quizzes yield sample answers containing all the necessary knowledge components. The similarity between human-generated and machine-generated sentences was significant, with a moderate positive correlation, r(23) = .48, p < .05. The best-performing generative model also achieved a high BERTScore of 0.715. Regarding readability, the average score of the human-generated sentences was superior to that of the machine-generated ones. These results suggest that the proposed model can generate sample answers containing critical knowledge components and can be further improved with BERTScore. This study is expected to have numerous applications, including identifying students’ areas of difficulty, scoring self-explanations, presenting students with reference materials for learning, and automatically generating scaffolding templates to train self-explanation skills.
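To make the three-step pipeline in the abstract concrete, here is a minimal sketch in Python. The specific choices below (TF-IDF vectors, k-means clustering, centroid-nearest extraction, the `generate_sample_answer` helper, and the number of knowledge components) are illustrative assumptions, not the paper's confirmed implementation, which the abstract does not specify.

```python
# Sketch of the pipeline described in the abstract:
# vectorization -> clustering -> extraction.
# All algorithm choices here are assumptions for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min


def generate_sample_answer(explanations, n_components=4):
    """Build a standard sample answer from unlabeled self-explanations.

    explanations: list of student self-explanation strings for one quiz.
    n_components: assumed number of knowledge components (clusters).
    """
    # Step 1: vectorization -- embed each self-explanation as a TF-IDF vector.
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(explanations)

    # Step 2: clustering -- group explanations that express the same
    # knowledge component.
    kmeans = KMeans(n_clusters=n_components, n_init=10, random_state=0)
    kmeans.fit(vectors)

    # Step 3: extraction -- take the explanation nearest each cluster
    # centroid as the representative sentence for that component.
    closest, _ = pairwise_distances_argmin_min(kmeans.cluster_centers_, vectors)
    representatives = [explanations[i] for i in sorted(set(closest))]

    # Concatenate the representatives into one standard sample answer.
    return " ".join(representatives)


if __name__ == "__main__":
    answers = [
        "First I factored the quadratic into (x-2)(x-3).",
        "I set each factor to zero to get x = 2 and x = 3.",
        "Factoring gives (x-2)(x-3) = 0.",
        "So the solutions are x = 2 and x = 3.",
    ]
    print(generate_sample_answer(answers, n_components=2))
```

In the same spirit, a generated answer could be scored against a human-written reference with the `bert-score` package (`from bert_score import score`), whose `score()` function returns precision, recall, and F1; the abstract's reported BERTScore of 0.715 would correspond to such a comparison.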