Efficient topic identification for urgent MOOC Forum posts using BERTopic and traditional topic modeling techniques

Education and Information Technologies (IF 4.8; Zone 2, Education; JCR Q1, Education & Educational Research) | Pub Date: 2024-09-17 | DOI: 10.1007/s10639-024-13003-4
Nabila Khodeir, Fatma Elghannam

Abstract

MOOC platforms provide a means of communication through forums, allowing learners to express the difficulties and challenges they face while studying various courses. Within these forums, some posts require urgent attention from instructors. Failing to respond promptly to these posts can contribute to higher dropout rates and lower course completion rates. While existing research primarily focuses on identifying urgent posts through various classification techniques, it has not adequately addressed the underlying reasons behind them. This research aims to examine these reasons and assess the extent to which they vary. By understanding the root causes of urgency, instructors can address these issues effectively and provide appropriate support and solutions. BERTopic leverages the language capabilities of transformer models and represents an advanced approach to topic modeling. In this study, BERTopic was compared with traditional topic models, namely LDA, LSI, and NMF, on topic modeling of MOOC discussion forums. The experimental results revealed that the NMF and BERTopic models outperformed the other models. Specifically, the NMF model performed best when a smaller number of topics was required, whereas the BERTopic model excelled at generating topics with higher coherence when a larger number of topics was needed. Considering all urgent posts in the dataset, the results were as follows: the optimal number of topics was 6 for NMF and 50 for BERTopic; the coherence score was 0.66 for NMF and 0.616 for BERTopic; and the IRBO score was 1 for both models. This highlights the BERTopic model's capability to distinguish and extract diverse topics comprehensively and coherently, aiding in the identification of the various reasons behind MOOC forum posts.
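
To make the reported IRBO score more concrete, the following is a minimal sketch of an inverted rank-biased overlap computation over topic word lists. It is not the authors' implementation: it assumes a truncated, weight-normalized form of rank-biased overlap (the abstract does not state which variant or persistence parameter was used), and the function names `rbo_truncated` and `irbo`, the default `p=0.9`, and the example topic words are all hypothetical. Under this sketch, a score of 1 means that no two topics share any of their top words.

```python
from itertools import combinations


def rbo_truncated(a, b, p=0.9):
    """Rank-biased overlap of two ranked word lists, truncated at their depth.

    Assumption: a simplified variant, normalized by the sum of the weights so
    that two identical lists score 1.0. Each list must not repeat words.
    """
    depth = min(len(a), len(b))
    if depth == 0:
        return 0.0
    seen_a, seen_b = set(), set()
    weighted_agreement = 0.0
    weight_total = 0.0
    for d in range(1, depth + 1):
        seen_a.add(a[d - 1])
        seen_b.add(b[d - 1])
        # Fraction of the top-d words that the two lists have in common.
        agreement = len(seen_a & seen_b) / d
        # Geometric weight: agreement near the top of the lists counts more.
        weight = p ** (d - 1)
        weighted_agreement += weight * agreement
        weight_total += weight
    return weighted_agreement / weight_total


def irbo(topics, p=0.9):
    """Inverted RBO: 1 minus the mean pairwise overlap between topics."""
    pairs = list(combinations(topics, 2))
    if not pairs:
        return 1.0  # a single topic has nothing to overlap with
    mean_overlap = sum(rbo_truncated(x, y, p) for x, y in pairs) / len(pairs)
    return 1.0 - mean_overlap


# Made-up top words for three topics (illustrative only, not from the paper).
example_topics = [
    ["deadline", "submit", "assignment"],
    ["video", "audio", "play"],
    ["certificate", "grade", "score"],
]
print(irbo(example_topics))  # prints 1.0: the topics share no top words
```

Because the score depends only on the top-word lists, it measures how distinct the topics are from one another and says nothing about whether each topic is internally coherent, which is why the abstract reports it alongside a separate coherence score.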
