Quantifying data sensitivity: precise demonstration of care when building student prediction models
Charles Lang, Charlotte Woo, Jeanne Sinclair
DOI: 10.1145/3375462.3375506
Until recently, an assumption within the predictive modelling community has been that collecting more student data is always better. But in reaction to recent high-profile data privacy scandals, many educators, scholars, students and administrators have been questioning the ethics of such a strategy. Suggestions are growing that only the minimum amount of data needed for the purpose of a prediction should be collected. Yet machine learning algorithms are primarily judged on metrics derived from prediction accuracy or on whether they meet probabilistic criteria for significance. They are not routinely judged on whether they use the minimum number of the least sensitive features, preserving what we name here data collection parsimony. We believe the ability to assess data collection parsimony would be a valuable addition to the suite of evaluations for any prediction strategy. To that end, this paper introduces data collection parsimony, describes a novel method for quantifying the concept using empirical Bayes estimates, and then tests the metric on real-world data. Both theoretical and empirical benefits and limitations of this method are discussed. We conclude that, for the purpose of model building, this metric is superior to others in several ways, but that there are some hurdles to effective implementation.
Towards automated analysis of cognitive presence in MOOC discussions: a manual classification study
Yuanyuan Hu, C. Donald, Nasser Giacaman, Zexuan Zhu
DOI: 10.1145/3375462.3375473
This paper reports on the early stages of a machine learning research project in which phases of cognitive presence in MOOC discussions were manually coded in preparation for training automated cognitive classifiers. We present a manual classification rubric combining Garrison, Anderson and Archer's [11] coding scheme with Park's [25] revised version for a target MOOC. Inter-rater reliability between two raters reached 95.4% agreement with a Cohen's weighted kappa of 0.96, demonstrating that our classification rubric is suitable for the target MOOC dataset. The classification rubric, originally intended for for-credit undergraduate courses, can thus be applied in a MOOC context. We found that the main disagreements between the two raters lay in adjacent cognitive phases, implying that additional categories may exist between cognitive phases in such MOOC discussion messages. Overall, our results suggest a reliable rubric for classifying cognitive phases in discussion messages of the target MOOC by two raters. This indicates we are in a position to apply machine learning algorithms that can also cater for data with inter-rater disagreements in future automated classification studies.
Testing the reliability of inter-rater reliability
Brendan R. Eagan, Jais Brohinsky, Jingyi Wang, D. Shaffer
DOI: 10.1145/3375462.3375508
Analyses of learning often rely on coded data, and one important aspect of coding is establishing reliability. Previous research has shown that the common approach to establishing coding reliability is seriously flawed in that it produces unacceptably high Type I error rates. This paper tests whether these error rates are tied to specific reliability metrics or reflect a broader methodological problem. Our results show that the problem is not metric specific, and we suggest the adoption of new practices to control the Type I error rates associated with establishing coding reliability.
The automated grading of student open responses in mathematics
John A. Erickson, Anthony F. Botelho, Steven McAteer, A. Varatharaj, N. Heffernan
DOI: 10.1145/3375462.3375523
The use of computer-based systems in classrooms has provided teachers with new opportunities for delivering content to students, supplementing instruction, and assessing student knowledge and comprehension. Among the largest benefits of these systems is their ability to give students feedback on their work and to report student performance and progress to their teacher. While computer-based systems can automatically assess student answers to a range of question types, many systems are limited when it comes to open-ended problems: they either provide no support for them, relying on the teacher to grade them manually, or avoid such question types entirely. Thanks to recent advances in natural language processing, automated essay grading has made notable strides. However, much of that research lies in domains outside mathematics, where open-ended problems allow teachers to assess students' understanding of mathematical concepts beyond what is possible with other problem types. This research explores the viability and challenges of developing automated graders of open-ended student responses in mathematics, and further examines how the scale of available data affects model performance. Focusing on content delivered through the ASSISTments online learning platform, we present a set of analyses pertaining to the development and evaluation of models that predict teacher-assigned grades for student open responses.
What college students say, and what they do: aligning self-regulated learning theory with behavioral logs
Joshua Quick, Benjamin A. Motz, Jamie Israel, Jason Kaetzel
DOI: 10.1145/3375462.3375516
A central concern in learning analytics specifically, and in educational research more generally, is the alignment of robust, coherent measures with well-developed conceptual and theoretical frameworks. Capturing and representing processes of learning remains an ongoing challenge in all areas of educational inquiry and raises substantive questions about the nature of learning, knowledge, assessment, and measurement that have been continuously refined across areas of education and pedagogical practice. Learning analytics, as a still-developing method of inquiry, has yet to substantively address the alignment of the measurement, capture, and representation of learning with theoretical frameworks, despite being used to address practical concerns such as identifying at-risk students. This study addresses these concerns by comparing behavioral measurements from learning management systems with established measurements of components of learning as understood through self-regulated learning frameworks. Using several prominent and robustly supported self-report survey measures designed to identify dimensions of self-regulated learning, together with typical behavioral features extracted from a learning management system, we conducted descriptive and exploratory analyses of the relational structures of these data. With the exception of learners' self-reported time management strategies and level of motivation, the results indicate that behavioral measures were not well correlated with survey measurements. Possibilities and recommendations for learning analytics as measurements of self-regulated learning are discussed.
Design of a curriculum analytics tool to support continuous improvement processes in higher education
Isabel Hilliger, Camila Aguirre, Constanza Miranda, S. Celis, M. Pérez-Sanagustín
DOI: 10.1145/3375462.3375489
Curriculum analytics (CA) emerged as a sub-field of learning analytics that aims to use evidence to drive curriculum decision-making and program improvement. However, its overall impact on program outcomes remains unknown. In this context, this paper presents work in progress from a large research project on how CA could support continuous improvement processes at the program level. We followed an approach based on design-based research, the Integrative Learning Design Framework, to develop a CA tool. This paper describes three of the four phases of this framework and their main results, including an evaluation of the local impact of the CA tool. The evaluation consisted of an instrumental case study of the tool's use to support 124 teaching staff in a three-year continuous improvement process at a Latin American university. Lessons learned indicate that the tool helped staff to collect information for curriculum discussions, making evidence about student competency attainment more readily available. To generalize these lessons, future work will evaluate the tool in different university settings.
Evaluating teachers' perceptions of students' questions organization
Fatima Harrak, François Bouchet, Vanda Luengo, P. Gillois
DOI: 10.1145/3375462.3375509
Students' questions are essential in helping teachers assess their understanding and adapt their pedagogy. However, in a flipped classroom context where many questions are asked online to be addressed in class, selecting questions can be difficult for teachers. To help them in this task, we present three alternative ways of organizing questions: one based on pedagogical needs, one based on estimated student profiles, and one mixing both approaches. Results of a survey completed by 37 teachers practising flipped classroom pedagogy show no consensus around a single organization. A cluster analysis based on teachers' flipped classroom experience allowed us to distinguish two profiles, but neither was associated with a particular preference for question organization. Qualitative results suggest that the need for different organizations may depend more on pedagogical philosophy, and they argue for differentiated dashboards.
Analyzing the consistency in within-activity learning patterns in blended learning
Varshita Sher, M. Hatala, D. Gašević
DOI: 10.1145/3375462.3375470
Performance and consistency play a large role in learning. This study analyzes the relation between consistency in students' online work habits and academic performance in a blended course. We use log data recorded by a learning management system (LMS) in two information technology courses, which required the completion of monthly asynchronous online discussion tasks and weekly assignments, respectively. We measure consistency using the Dynamic Time Warping (DTW) distance between two successive tasks (assignments or discussions), an appropriate measure of similarity between time series, over an 11-day timeline starting 10 days before and running up to the submission deadline. We found meaningful clusters of students exhibiting similar behavior and used them to identify three distinct consistency patterns: highly consistent, incrementally consistent, and inconsistent users. We also found evidence of significant associations between these patterns and learners' academic performance.
Constructing and predicting school advice for academic achievement: a comparison of item response theory and machine learning techniques
Koen Niemeijer, R. Feskens, G. Krempl, J. Koops, Matthieu J. S. Brinkhuis
DOI: 10.1145/3375462.3375486
Educational tests can be used to estimate pupils' abilities and thereby indicate whether their school type is suitable for them. However, tests in education are usually conducted for each content area separately, which makes it difficult to combine the results into a single school advice. To help with school advice, we compare domain-specific and domain-agnostic methods for predicting school types. Both use data from a pupil monitoring system in the Netherlands, which tracks pupils' educational progress over several years through a series of tests measuring multiple skills. First, a domain-specific item response theory (IRT) model is calibrated, from which an ability score is extracted and subsequently plugged into a multinomial log-linear regression model. Second, we train domain-agnostic machine learning (ML) models: a random forest (RF) and a shallow neural network (NN). Furthermore, we apply case weighting to give extra attention to pupils who switched between school types. When considering the performance for all pupils, RFs provided the most accurate predictions, followed by NNs and IRT respectively. When looking only at pupils who switched school type, IRT performed best, followed by NNs and RFs. Case weighting proved to provide a major improvement for this group. Lastly, IRT was found to be much easier to explain than the other models. Thus, while ML provided more accurate results, this comes at the cost of lower explainability in comparison to IRT.
Let's shine together!: a comparative study between learning analytics and educational data mining
Guanliang Chen, V. Rolim, R. F. Mello, D. Gašević
DOI: 10.1145/3375462.3375500
Learning Analytics and Knowledge (LAK) and Educational Data Mining (EDM) are two of the most popular venues for researchers and practitioners to report and disseminate discoveries in data-intensive research on technology-enhanced education. After about a decade of development, it is time to scrutinize and compare these two venues. By doing so, we aim to give relevant stakeholders a better understanding of the past development of LAK and EDM and to provide suggestions for their future development. Specifically, we conducted an extensive comparative analysis between LAK and EDM from four perspectives: (i) the topics investigated; (ii) community development; (iii) community diversity; and (iv) research impact. Furthermore, we applied one of the most widely used language modeling techniques (Word2Vec) to capture the words researchers use most frequently to describe future work that could build on suggestions made in published papers, shedding light on potential directions for future research.