Karen Reina Sánchez, Gonzalo Vaca Serrano, Juan Pedro Arbáizar Gómez, Alfonso Duran-Heras
Artificial Intelligence Review, 58(1). Published 2024-11-04. DOI: 10.1007/s10462-024-10997-8. https://link.springer.com/article/10.1007/s10462-024-10997-8 (Impact factor 10.7; JCR Q1, Computer Science, Artificial Intelligence.)
Uncovering suggestions in MOOC discussion forums: a transformer-based approach
The field of natural language processing has experienced significant advances in recent years, but these advances have not yet resulted in improved analytics for instructors on MOOC platforms. Valuable information, such as suggestions, is generated in the comment forums of these courses, but due to their volume, manual processing is often impractical. This study examines the feasibility of fine-tuning and effectively utilizing state-of-the-art deep learning models to identify comments that contain suggestions in MOOC forums. The main challenges encountered are the lack of labeled datasets from the MOOC context for fine-tuning classification models and the soaring computational cost of this training. For this study, we manually collected and labeled 2228 comments in Spanish and English from 5 MOOCs and scraped 1.4 million MOOC reviews from 3 platforms. We fine-tuned and evaluated 4 pretrained models based on the transformer architecture and 3 traditional machine learning models to compare their effectiveness in the suggestion mining task in this domain. Transformer-based models proved to be highly effective in this task/domain combination, achieving performance levels that matched or exceeded those deemed appropriate in other contexts and were significantly greater than those achieved by traditional models. Domain adaptation led to improved linguistic understanding of the target domain; however, in this project, this approach did not translate into an observable improvement in suggestion mining. The automated identification of comments that can be labeled as suggestions can result in considerable time savings for instructors, especially considering that less than a quarter of the analyzed comments contain suggestions.
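Because fewer than a quarter of the comments contain suggestions, plain accuracy can make a model that never flags anything look good. The abstract does not say which metrics the authors report, so the following is only a minimal sketch, assuming binary labels (1 = suggestion, 0 = other) and using invented toy data. The helper name `binary_prf` is hypothetical and not taken from the paper.

```python
from typing import Dict, Sequence


def binary_prf(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1 (for class 1 = suggestion) and overall accuracy."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # suggestions found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # suggestions missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored

    # Guard every division so degenerate predictions yield 0.0, not an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels, invented for illustration: 2 of 10 comments are suggestions.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# A trivial baseline that never predicts "suggestion".
always_negative = [0] * len(y_true)

# A hypothetical classifier: one hit, one false alarm, one miss.
model_like = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

baseline = binary_prf(y_true, always_negative)
candidate = binary_prf(y_true, model_like)

print("baseline :", baseline)
print("candidate:", candidate)
```

On this toy data both prediction sets have the same accuracy, but only the second has a non-zero F1 for the suggestion class. This is why positive-class metrics are usually the more informative comparison on an imbalanced task like this one.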
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.