An Enhanced Topic Modeling Method in Educational Domain by Integrating LDA with Semantic
Ruofei Ding, Pucheng Huang, Shumin Chen, Jiale Zhang, Jingxiu Huang, Yunxiang Zheng
2024 26th International Conference on Advanced Communications Technology (ICACT), vol. 45, pp. 01-06. Published 2024-02-04.
DOI: https://doi.org/10.23919/icact60172.2024.10471952
Citations: 0
Abstract
As online courses grow, students produce an increasing volume of discussion text in online forums and communication groups. Teachers can use these texts to monitor student learning and adapt the pace of instruction accordingly. Topics carry key information about a text and can be extracted by topic modeling. Latent Dirichlet Allocation (LDA) is currently used to identify the main topics that students discuss. However, LDA relies on word frequency and ignores semantic information. In this study, we propose a model that fuses semantic information into LDA. To validate the model, we collected two MOOC datasets and conducted an ablation study, using the Silhouette Coefficient and the Calinski-Harabasz score as evaluation criteria. The results show that our method is feasible and models topics more accurately than LDA in the educational domain. Teachers can use it to automatically analyze large amounts of student discussion data and guide personalized learning paths.
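The abstract names its two evaluation criteria but does not give formulas or an implementation. The sketch below shows one way such scores could be computed, using the standard textbook definitions of the Silhouette Coefficient and the Calinski-Harabasz score with Euclidean distance. Everything in it is an assumption for illustration: the function names, the toy document vectors, and the two label assignments are hypothetical and are not the authors' data or code.

```python
import math
from collections import defaultdict


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _centroid(points):
    """Component-wise mean of a list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _group(vectors, labels):
    """Map each cluster label to the vectors assigned to it."""
    clusters = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        clusters[lab].append(vec)
    return clusters


def silhouette_coefficient(vectors, labels):
    """Mean of (b - a) / max(a, b) over all points; needs >= 2 clusters."""
    clusters = _group(vectors, labels)
    scores = []
    for vec, lab in zip(vectors, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(_dist(vec, o) for o in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(_dist(vec, o) for o in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def calinski_harabasz(vectors, labels):
    """Between-cluster dispersion over within-cluster dispersion."""
    clusters = _group(vectors, labels)
    n, k = len(vectors), len(clusters)
    overall = _centroid(vectors)
    between = within = 0.0
    for members in clusters.values():
        c = _centroid(members)
        between += len(members) * _dist(c, overall) ** 2
        within += sum(_dist(m, c) ** 2 for m in members)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical document-topic vectors (illustrative only, not from the paper).
doc_vectors = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
coherent_labels = [0, 0, 0, 1, 1, 1]  # each topic groups nearby documents
mixed_labels = [0, 1, 0, 1, 0, 1]     # each topic mixes distant documents

coherent_sil = silhouette_coefficient(doc_vectors, coherent_labels)
mixed_sil = silhouette_coefficient(doc_vectors, mixed_labels)
coherent_ch = calinski_harabasz(doc_vectors, coherent_labels)
mixed_ch = calinski_harabasz(doc_vectors, mixed_labels)

print(f"silhouette: coherent={coherent_sil:.3f} mixed={mixed_sil:.3f}")
print(f"calinski-harabasz: coherent={coherent_ch:.3f} mixed={mixed_ch:.3f}")
```

Both scores are higher when topic assignments group similar documents together, which is why they can serve as criteria for comparing topic models. How the authors derive the document vectors and cluster labels from their model is not stated in the abstract.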