Exploring the whole sequence of steps a student takes to produce work, and the patterns that emerge from thousands of such sequences, is fertile ground for a richer understanding of learning. In this paper we autonomously generate hints for the Code.org 'Hour of Code' (to the best of our knowledge the largest online course to date) using historical student data. We first develop a family of algorithms that can predict how an expert teacher would encourage a student to make forward progress. Such predictions can form the basis for effective hint-generation systems. The algorithms are more accurate than current state-of-the-art methods at recreating expert suggestions, are easy to implement, and scale well. We then show that the same framework that motivated the hint-generation algorithms suggests a sequence-based statistic that can be measured for each learner. We discover that this statistic is highly predictive of a student's future success.
"Autonomously Generating Hints by Inferring Problem Solving Policies." C. Piech, M. Sahami, Jonathan Huang, L. Guibas. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2724668
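The policy-prediction idea can be illustrated with a minimal baseline (our own sketch, not the paper's algorithm; all names are hypothetical): from historical solution paths that reached the goal, count which next state students most often moved to from each state, and suggest the most frequent choice as the hint target.

```python
from collections import defaultdict

def build_policy(state_sequences):
    """From historical solution paths, count the next-state choices
    observed at each state and pick the most frequent one as the hint
    target. A crude frequency baseline, not the paper's method."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in state_sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    # Map each observed state to its most common successor.
    return {s: max(nxts, key=nxts.get) for s, nxts in counts.items()}
```

A hint system would then point a stuck student at `policy[current_state]`; the paper's contribution lies in doing better than this raw-frequency baseline at matching expert suggestions.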
In this work, we track the interaction of students across multiple Massive Open Online Courses (MOOCs) on edX. Leveraging the "burstiness" factor of three of the most commonly exhibited interaction forms made possible by online learning (i.e., video lecture viewing, coursework access, and discussion forum posting), we take on the task of predicting student performance (operationalized as grade) across these courses. Specifically, we utilize the probabilistic framework of Conditional Random Fields (CRFs) to formalize the problem of predicting the sequence of grades achieved by a student in different MOOCs, taking into account the contextual dependency of this outcome measure on students' general interaction trends across courses. Based on a comparative analysis of combinations of interaction features, our best CRF model achieves a precision of 0.581, a recall of 0.660, and a weighted F-score of 0.560, outperforming several baseline discriminative classifiers applied at each sequence position. These findings have implications for initiating early instructor intervention, so as to engage students along less active interaction dimensions that could be associated with low grades.
"Connecting the Dots: Predicting Student Grade Sequences from Bursty MOOC Interactions over Time." Tanmay Sinha, Justine Cassell. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728669
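The weighted F-score reported above is the per-class F1 averaged with class-support weights. A minimal stdlib-only sketch of that metric (our illustration, not the authors' evaluation code):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, weighted by each class's support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        pred_n = sum(1 for p in y_pred if p == label)
        prec = tp / pred_n if pred_n else 0.0
        rec = tp / n
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score
```

Unlike macro-F1, this weighting lets frequent grade classes dominate, which matters when grade distributions are skewed.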
In this paper, we address issues of transparency, modularity, and privacy with the introduction of an open-source, web-based data repository and analysis tool tailored to the Massive Open Online Course community. The tool integrates data request/authorization and distribution workflows as well as a simple analytics module upload format to enable reuse and replication of analytics results among instructors and researchers. We survey the evolving landscape of competing data models, all of which can be accommodated in the platform. Data model descriptions are provided to analytics authors, who choose, much as with smartphone app stores, to write for any number of data models depending on their needs and the proliferation of the particular data model. Two case study examples of analytics and interactive visualizations are described in the paper. The result is a simple but effective approach to learning analytics immediately applicable to X consortium institutions and beyond.
"moocRP: An Open-source Analytics Platform." Z. Pardos, Kevin Kao. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2724683
A challenge in introductory and intermediate programming courses is understanding how students approached solving a particular programming problem, in order to provide feedback on how they might improve. In both Massive Open Online Courses (MOOCs) and large residential courses, such feedback is difficult to provide for each student individually. To multiply the instructor's leverage, we would like to group student submissions according to the general problem-solving strategy they used, as the first stage of a "feedback pipeline". We describe ongoing explorations of a variety of clustering algorithms and similarity metrics using a corpus of over 800 student submissions to a simple programming assignment from a programming MOOC. We find that for a majority of submissions, it is possible to automatically create clusters such that an instructor "eyeballing" some representative submissions from each cluster can readily describe qualitatively what the common elements are in student submissions in that cluster. This information can be the basis for feedback to the students or for comparing one group of students' approach with another's.
"Clustering Student Programming Assignments to Multiply Instructor Leverage." Hezheng Yin, J. Moghadam, A. Fox. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728695
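A minimal sketch of the kind of similarity-based grouping the paper explores (our own illustration with a crude token-Jaccard metric and greedy assignment, not the authors' clustering pipeline):

```python
import re

def tokens(code):
    # Identifier/keyword tokens as a coarse fingerprint of the submission.
    return set(re.findall(r"[A-Za-z_]\w*", code))

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

def cluster_submissions(subs, threshold=0.5):
    """Greedy single-pass clustering: a submission joins the first
    cluster whose representative is similar enough, else starts a new
    cluster with itself as representative."""
    clusters = []  # list of (representative_tokens, [submissions])
    for s in subs:
        t = tokens(s)
        for rep, members in clusters:
            if jaccard(t, rep) >= threshold:
                members.append(s)
                break
        else:
            clusters.append((t, [s]))
    return [members for _, members in clusters]
```

Real systems would use structure-aware metrics (e.g., AST edit distance) rather than token overlap, but the instructor-facing output is the same: a handful of clusters, each summarized by a representative submission.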
The study of Operating Systems and Systems Programming provides invaluable software engineering experience and crucial conceptual understanding that make it an essential component of an undergraduate computer science curriculum. It is also imperative that classroom course material and infrastructure keep pace with rapidly evolving technology. A "modern" course will provide an accurate software engineering experience and prevent the study of outdated concepts. With the recent increase in size and popularity of computer science courses, all course material must also be appropriately scalable. In order to create such a "modern" systems course, we redesigned UC Berkeley's CS 162, a 300-student Introduction to Operating Systems & Systems Programming course. In this paper we detail our unique curriculum layout, our advanced infrastructure support for students, and future work on extending our infrastructure to other large computer science courses.
"A Modern Student Experience in Systems Programming." Vaishaal Shankar, D. Culler. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728665
Steven Tang, Elizabeth A. McBride, H. Gogel, Z. Pardos
Online computer adaptive learning is increasingly being used in classrooms as a way to provide guided learning for students. Such tutors have the potential to provide tailored feedback based on specific student needs and misunderstandings. Bayesian knowledge tracing (BKT) is used to model student knowledge when knowledge is assumed to be changing throughout a single assessment period; in contrast, traditional Item Response Theory (IRT) models assume student knowledge to be constant within an assessment period. The basic BKT model assumes that the chance a student transitions from "not knowing" to "knowing" is the same after each item, and problems are considered learning opportunities. It could be the case, however, that learning is actually context-sensitive, where students' learning might be improved when the items and their associated tutoring content are delivered to the student in a particular order. In this paper, we use BKT models to find such context-sensitive transition probabilities from real data delivered by an online tutoring system, ASSISTments. After empirically deriving orderings that lead to better learning, we qualitatively analyze the items and their tutoring content to uncover any mechanisms that might explain why such orderings are modeled to have higher learning potential.
"Item Ordering Effects with Qualitative Explanations using Online Adaptive Tutoring Data." Steven Tang, Elizabeth A. McBride, H. Gogel, Z. Pardos. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728682
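The standard BKT update the abstract refers to fits in a few lines. This is the textbook model with illustrative parameter values (our sketch, not the paper's code); the paper's context-sensitive variant would let the transition probability depend on which item, in which order, was just practiced.

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_transit=0.15):
    """One Bayesian Knowledge Tracing step: Bayes-update the knowledge
    estimate from the observed response, then apply the learning
    transition. Parameter values here are illustrative only."""
    if correct:
        evidence = p_know * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        evidence = p_know * p_slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    # Basic BKT uses one p_transit for every item; a context-sensitive
    # variant would pass a per-item (or per-ordering) value instead.
    return posterior + (1 - posterior) * p_transit
```

Fitting a separate `p_transit` per item ordering, and comparing the fitted values, is one way to operationalize the abstract's question of whether some orderings carry higher learning potential.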
Jaye Clarkes-Nias, Juliet Mutahi, Andrew Kinai, Oliver E. Bent, Komminist Weldemariam, Saurabh Srivastava
We report on the motivation and qualitative studies that examine the design of a sentiment and context collection tool in a mobile-enabled blended learning technology. The tool concept emerged from field studies with teachers and students from two primary schools in Kenya. In this paper, we discuss the background and motivation for capturing learners' sentiment and context. Next, we present the overall design of the proposed module and its prototype implementation in a blended learning environment. Detailed discussions of the algorithms underlying the tool are beyond the scope of this paper.
"Towards Capturing Learners Sentiment and Context." Jaye Clarkes-Nias, Juliet Mutahi, Andrew Kinai, Oliver E. Bent, Komminist Weldemariam, Saurabh Srivastava. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728662
Attrition in online learning is generally higher than in traditional settings, especially in large-scale online learning environments. A systematic analysis of individual differences in attrition and performance in 20 massive open online courses (N > 67,000) revealed a geographic achievement gap and a gender achievement gap. Online learners in Africa, Asia, and Latin America scored substantially lower grades and were only half as likely to persist as those in Europe, Oceania, and Northern America. Women also exhibited lower persistence and performance than men. Yet more persistent learners were only marginally more satisfied with their achievement. The primary obstacle for most learners was finding time for the course, which was partly related to low levels of volitional control. Self-ascribed successful learners reported higher levels of goal striving, growth mindset, and feelings of social belonging than unsuccessful ones. Insights into why learners leave online courses inform models of attrition and targeted interventions to support learners in achieving their goals.
"Attrition and Achievement Gaps in Online Learning." René F. Kizilcec, Sherif A. Halawa. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2724680
Teaching computer architecture as a hands-on engineering course to approximately 250 MIT students per semester requires a large, dedicated teaching staff. This Spring, a shortened version of the course will be deployed on edX to a potentially far larger cohort of students, without additional teaching staff. To better support students, we have deployed developmental versions of three learner-sourcing systems to as many as 500 students. These systems harvest and organize students' collective knowledge about debugging and optimizing solutions. We plan to deploy and study the next iteration of these systems on edX this Spring.
"Learner-Sourcing in an Engineering Class at Scale." Elena L. Glassman, C. Terman, Rob Miller. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2728694
We report on an experiment testing the effects of releasing all of the content in a Massive Open Online Course (MOOC) at launch versus in a staggered release. In 2013, HarvardX offered two "runs" of the HeroesX course: In the first, content was released weekly over four months; in the second, all content was released at once. We develop three operationalizations of "ontrackness" to measure how students participated in sync with the recommended syllabus. Ontrackness in both versions was low, though in the second, mean ontrackness was approximately one-half of levels in the first HeroesX. We find few differences in persistence, participation, and completion between the two runs. Controlling for a student's number of active weeks, we estimate modest positive effects of ontrackness on certification. The revealed preferences of students for flexibility and the minimal benefits of ontrackness suggest that releasing content all at once may be a viable strategy for MOOC designers.
"Staggered Versus All-At-Once Content Release in Massive Open Online Courses: Evaluating a Natural Experiment." Tommy Mullaney, J. Reich. Proceedings of the Second (2015) ACM Conference on Learning @ Scale. DOI: https://doi.org/10.1145/2724660.2724663