Robust Hierarchical Clustering for Directed Networks: An Axiomatic Approach

SIAM Journal on Applied Algebra and Geometry | Impact factor: 1.6 | Zone 2 (Mathematics) | JCR Q2 (Mathematics, Applied) | Pub Date: 2021-01-01 | DOI: 10.1137/20m1359201

G. Carlsson, Facundo Mémoli, Santiago Segarra

Publication type: Journal Article. Pages: 675-700. Open access: no. Source: Semantic Scholar.

Citations: 1

Abstract




We provide a complete taxonomic characterization of robust hierarchical clustering methods for directed networks following an axiomatic approach. We begin by introducing three practical properties associated with the notion of robustness in hierarchical clustering: linear scale preservation, stability, and excisiveness. Linear scale preservation enforces imperviousness to change in units of measure whereas stability ensures that a bounded perturbation in the input network entails a bounded perturbation in the clustering output. Excisiveness refers to the local consistency of the clustering outcome. Algorithmically, excisiveness implies that we can reduce computational complexity by only clustering a subset of our data while theoretically guaranteeing that the same hierarchical outcome would be observed when clustering the whole dataset. In parallel to these three properties, we introduce the concept of representability, a generative model for describing clustering methods through the specification of their action on a collection of networks. Our main result is to leverage this generative model to give a precise characterization of all robust -- i.e., excisive, linear scale preserving, and stable -- hierarchical clustering methods for directed networks. We also address the implementation of our methods and describe an application to real data.
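To make two of the properties above concrete, here is a minimal sketch in Python. It uses reciprocal clustering as an illustrative stand-in: a method for directed networks in which two nodes merge at the smallest resolution at which a chain joins them with every link cheap in both directions. That choice is an assumption, since the abstract does not name a specific method. The function name `reciprocal_ultrametric`, the example dissimilarity matrix, and the entrywise perturbation check are likewise hypothetical; in particular, the stability check compares matrices entry by entry, which is a simplification and not necessarily the distance used in the paper.

```python
def reciprocal_ultrametric(A):
    """Return the min-max chain cost between all node pairs.

    A[i][j] is the (possibly asymmetric) dissimilarity from i to j.
    Each link costs max(A[i][j], A[j][i]); a chain costs its worst link;
    the output is the cheapest chain cost for every pair of nodes.
    """
    n = len(A)
    # Symmetrize: a link is only as cheap as its more expensive direction.
    U = [[0.0 if i == j else float(max(A[i][j], A[j][i])) for j in range(n)]
         for i in range(n)]
    # Floyd-Warshall with (min, max) in place of (min, +).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                U[i][j] = min(U[i][j], max(U[i][k], U[k][j]))
    return U


# Hypothetical 3-node directed network (asymmetric dissimilarities).
A = [[0, 1, 4],
     [2, 0, 1],
     [5, 3, 0]]
U = reciprocal_ultrametric(A)

# Linear scale preservation: changing units rescales the output equally.
alpha = 2.5
U_scaled = reciprocal_ultrametric([[alpha * a for a in row] for row in A])
scale_ok = U_scaled == [[alpha * u for u in row] for row in U]

# Stability (entrywise simplification): a perturbation of size eps in the
# input moves the output by at most eps.
eps = 0.1
signs = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
A_pert = [[A[i][j] + eps * signs[i][j] for j in range(3)] for i in range(3)]
U_pert = reciprocal_ultrametric(A_pert)
max_shift = max(abs(U_pert[i][j] - U[i][j]) for i in range(3) for j in range(3))

print(U)
print("scale preserved:", scale_ok)
print("max output shift:", max_shift)
```

The triple loop is cubic in the number of nodes, which is fine for a small illustration. It does not reflect the implementation the authors discuss.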


Source journal