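K-Fold Cross-Validation through Identification of the Opinion Classification Algorithm for the Satisfaction of University Students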

International Journal of Online and Biomedical Engineering (IF 1.7; JCR Q3, Computer Science, Interdisciplinary Applications). Pub Date: 2023-08-16. DOI: 10.3991/ijoe.v19i11.39887

Omar Chamorro-Atalaya, J. Arévalo-Tuesta, Denisse Balarezo-Mares, Anthony Gonzáles-Pacheco, Olga Mendoza-León, Manuel Quipuscoa-Silvestre, Gregorio Tomás-Quispe, Raul Suarez-Bazalar

Citations: 0

Abstract

K-Fold Cross-Validation through Identification of the Opinion Classification Algorithm for the Satisfaction of University Students

When machine-learning techniques are used to determine algorithms or ranking models that identify student satisfaction, the algorithms are often trained and tested on a single data set, which biases their performance metrics. This article aims to identify the best algorithm for classifying the satisfaction of university students by applying the K-fold cross-validation technique and comparing the error rates of the performance metrics before and after its application. The method began with the collection of student opinions on teaching performance from the social network Twitter during an academic semester. Sentiment analysis was then used to process the data, categorizing the students’ opinions as “satisfied” or “dissatisfied.” The results showed that the algorithm with the lowest error rate in its performance metric was the support vector machine (SVM), which reached a classification accuracy of 91.76%. The authors conclude that SVM classification using K-fold cross-validation will help determine which factors associated with the teacher’s didactic strategies should be improved in each class session, since traditional surveying techniques have shortcomings.
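The abstract does not give the value of K, the features, or the implementation, so the following is only a minimal sketch of how K-fold cross-validation averages a classifier's error rate over K held-out folds. Everything in it is an assumption made for illustration: the function names (`k_fold_indices`, `cross_val_error`), the synthetic two-feature data, and the nearest-centroid classifier, which stands in for the SVM purely so the example runs on the Python standard library alone.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and split them into k test folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    splits = []
    for i, test_idx in enumerate(folds):
        # Training set = every fold except the one held out for testing.
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def fit_centroids(X, y):
    """Toy stand-in classifier (not an SVM): per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def cross_val_error(X, y, k=5, seed=0):
    """Return the mean error rate over k folds and the per-fold error rates."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        wrong = sum(predict(model, X[i]) != y[i] for i in test_idx)
        errors.append(wrong / len(test_idx))
    return mean(errors), errors


# Synthetic, hypothetical data: two numeric features per opinion
# (e.g. counts of positive and negative words), not the paper's data set.
rng = random.Random(42)
X, y = [], []
for _ in range(50):
    X.append([rng.gauss(3, 1), rng.gauss(1, 1)])
    y.append("satisfied")
    X.append([rng.gauss(1, 1), rng.gauss(3, 1)])
    y.append("dissatisfied")

mean_err, fold_errs = cross_val_error(X, y, k=5)
print(f"per-fold error rates: {fold_errs}")
print(f"cross-validated error rate: {mean_err:.3f}")
```

Because every sample is used for testing exactly once, the averaged error depends less on one particular train/test split than a single hold-out estimate does, which is the bias the abstract refers to.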

Source journal