Conceptual Mutation Testing for Student Programming Misconceptions

Siddhartha Prasad, Ben Greenman, Tim Nelson, S. Krishnamurthi
{"title":"Conceptual Mutation Testing for Student Programming Misconceptions","authors":"Siddhartha Prasad, Ben Greenman, Tim Nelson, S. Krishnamurthi","doi":"10.22152/programming-journal.org/2024/8/7","DOIUrl":null,"url":null,"abstract":"Context: Students often misunderstand programming problem descriptions. This can lead them to solve the wrong problem, which creates frustration, obstructs learning, and imperils grades. Researchers have found that students can be made to better understand the problem by writing examples before they start programming. These examples are checked against correct and wrong implementations -- analogous to mutation testing -- provided by course staff. Doing so results in better student understanding of the problem as well as better test suites to accompany the program, both of which are desirable educational outcomes. Inquiry: Producing mutant implementations requires care. If there are too many, or they are too obscure, students will end up spending a lot of time on an unproductive task and also become frustrated. Instead, we want a small number of mutants that each correspond to common problem misconceptions. This paper presents a workflow with partial automation to produce mutants of this form which, notably, are not those produced by mutation-testing tools. Approach: We comb through student tests that fail a correct implementation. The student misconceptions are embedded in these failures. We then use methods to semantically cluster these failures. These clusters are then translated into conceptual mutants. These can then be run against student data to determine whether we they are better than prior methods. Some of these processes also enjoy automation. Knowledge: We find that student misconceptions illustrated by failing tests can be operationalized by the above process. The resulting mutants do much better at identifying student misconceptions. 
Grounding: Our findings are grounded in a manual analysis of student examples and a quantitative evaluation of both our clustering techniques and our process for making conceptual mutants. The clustering evaluation compares against a ground truth using standard cluster-correspondence measures, while the mutant evaluation examines how conceptual mutants perform against student data. Importance: Our work contributes a workflow, with some automation, to reduce the cost and increase the effectiveness of generating conceptually interesting mutants. Such mutants can both improve learning outcomes and reduce student frustration, leading to better educational outcomes. In the process, we also identify a variation of mutation testing not commonly discussed in the software literature.","PeriodicalId":142220,"journal":{"name":"The Art, Science, and Engineering of Programming","volume":"58 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Art, Science, and Engineering of Programming","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22152/programming-journal.org/2024/8/7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Context: Students often misunderstand programming problem descriptions. This can lead them to solve the wrong problem, which creates frustration, obstructs learning, and imperils grades. Researchers have found that students understand a problem better when they write examples before they start programming. These examples are checked against correct and wrong implementations (analogous to mutation testing) provided by course staff. Doing so results in better student understanding of the problem as well as better test suites to accompany the program, both of which are desirable educational outcomes.

Inquiry: Producing mutant implementations requires care. If there are too many, or they are too obscure, students will spend a lot of time on an unproductive task and become frustrated. Instead, we want a small number of mutants, each corresponding to a common misconception about the problem. This paper presents a partially automated workflow to produce mutants of this form, which, notably, are not those produced by mutation-testing tools.

Approach: We comb through student tests that fail a correct implementation; the student misconceptions are embedded in these failures. We then semantically cluster the failures and translate the clusters into conceptual mutants. These mutants can be run against student data to determine whether they outperform those from prior methods. Some of these steps are automated.

Knowledge: We find that the student misconceptions illustrated by failing tests can be operationalized by the above process. The resulting mutants do much better at identifying student misconceptions.

Grounding: Our findings are grounded in a manual analysis of student examples and a quantitative evaluation of both our clustering techniques and our process for making conceptual mutants. The clustering evaluation compares against a ground truth using standard cluster-correspondence measures, while the mutant evaluation examines how conceptual mutants perform against student data.

Importance: Our work contributes a workflow, with some automation, that reduces the cost and increases the effectiveness of generating conceptually interesting mutants. Such mutants can both improve learning outcomes and reduce student frustration. In the process, we also identify a variation of mutation testing not commonly discussed in the software literature.
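The checking described in the abstract can be sketched as follows: student examples are input/expected-output pairs, run against a correct implementation (the "wheat") and against conceptual mutants (the "chaffs"). A valid example must agree with the wheat; a useful example suite makes each chaff fail at least once. This is a minimal illustrative sketch, assuming a hypothetical "median of a list" assignment; the function names and the specific misconception (computing the mean instead of the median) are invented for illustration and are not taken from the paper.

```python
def wheat(xs):
    """Correct implementation: the median of a non-empty list."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 == 1 else (s[mid - 1] + s[mid]) / 2

def chaff_mean(xs):
    """Hypothetical conceptual mutant encoding the misconception
    that the median is the arithmetic mean."""
    return sum(xs) / len(xs)

def failing_examples(examples, implementation):
    """Return the (input, expected) examples the implementation fails."""
    return [(xs, want) for xs, want in examples if implementation(xs) != want]

# Student-written examples: (input list, expected median).
examples = [([1, 2, 3], 2), ([1, 2, 3, 100], 2.5)]

# A valid example suite agrees with the wheat on every example...
assert failing_examples(examples, wheat) == []
# ...and catches the chaff: [1, 2, 3, 100] exposes the mean/median confusion.
assert failing_examples(examples, chaff_mean) != []
```

Note that the symmetric example `[1, 2, 3]` alone would not catch this chaff (its mean and median coincide), which is exactly why a small set of well-chosen conceptual mutants pushes students toward more discriminating examples.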