SmokeOut: An Approach for Testing Clustering Implementations

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST). Publication date: 2019-04-01. DOI: 10.1109/ICST.2019.00057

Vincenzo Musco, Xin Yin, Iulian Neamtiu


Citations: 7

Abstract

Clustering is a key Machine Learning technique, used in many high-stakes domains from medicine to self-driving cars. Many clustering algorithms have been proposed, and these algorithms have been implemented in many toolkits. Clustering users assume that clustering implementations are correct, reliable, and for a given algorithm, interchangeable. We challenge these assumptions. We introduce SmokeOut, an approach and tool that pits clustering implementations against each other (and against themselves) while controlling for algorithm and dataset, to find datasets where clustering outcomes differ when they shouldn't, and measure this difference. We ran SmokeOut on 7 clustering algorithms (3 deterministic and 4 nondeterministic) implemented in 7 widely-used toolkits, and run in a variety of scenarios on the Penn Machine Learning Benchmark (162 datasets). SmokeOut has revealed that clustering implementations are fragile: on a given input dataset and using a given clustering algorithm, clustering outcomes and accuracy vary widely between (1) successive runs of the same toolkit; (2) different input parameters for that tool; (3) different toolkits.
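To make the idea of "measuring the difference" between clustering outcomes concrete, here is a minimal sketch in plain Python. It assumes the adjusted Rand index (ARI) as the agreement measure; the abstract does not name the metric SmokeOut uses, so this choice, along with the helper names `adjusted_rand_index` and `min_pairwise_agreement`, is illustrative and not the authors' implementation. Each labeling stands for the cluster assignment produced by one run or one toolkit on the same dataset.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two labelings of the same points (1.0 = identical partitions)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same points")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs of points to disagree on

    # Count pairs of points grouped together in both, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (together_both - expected) / (max_index - expected)


def min_pairwise_agreement(runs):
    """Worst-case ARI over all pairs of runs (hypothetical helper; needs >= 2 runs)."""
    return min(adjusted_rand_index(a, b) for a, b in combinations(runs, 2))


if __name__ == "__main__":
    # Hypothetical outputs of three runs of one algorithm on one four-point dataset.
    runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
    print(min_pairwise_agreement(runs))
```

The first two runs differ only in how clusters are named, so ARI treats them as identical; the third run splits the points differently and pulls the worst-case agreement down. A low value on a fixed dataset and algorithm is the kind of discrepancy the abstract describes, whether it arises between successive runs, parameter settings, or toolkits.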

