Fast nonparametric inference of network backbones for graph sparsification
Alec Kirkley
arXiv:2409.06417 (arXiv - PHYS - Physics and Society), published 2024-09-10
Citations: 0
Abstract
A network backbone provides a useful sparse representation of a weighted
network by keeping only its most important links, permitting a range of
computational speedups and simplifying complex network visualizations. There
are many possible criteria for a link to be considered important, and hence
many methods have been developed for the task of network backboning for graph
sparsification. These methods can be classified as global or local in nature
depending on whether they evaluate the importance of an edge in the context of
the whole network or an individual node neighborhood. A key limitation of
existing network backboning methods is that they either artificially restrict
the topology of the backbone to take a specific form (e.g. a tree) or they
require the specification of a free parameter (e.g. a significance level) that
determines the number of edges to keep in the backbone. Here we develop a
completely nonparametric framework for inferring the backbone of a weighted
network that overcomes these limitations by automatically selecting the optimal
number of edges to retain in the backbone using the Minimum Description Length
(MDL) principle from information theory. We develop two encoding schemes that
serve as objective functions for global and local network backbones, as well as
efficient optimization algorithms to identify the optimal backbones according
to these objectives with runtime complexity log-linear in the number of edges.
We show that the proposed framework is generalizable to any discrete weight
distribution on the edges using a maximum a posteriori (MAP) estimation
procedure with an asymptotically equivalent Bayesian generative model of the
backbone. We compare the proposed method with existing methods in a range of
tasks on real and synthetic networks.
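
To make the idea of choosing the number of backbone edges by description length concrete, the following is a minimal sketch in Python. It is not the encoding scheme developed in the paper: the cost function `toy_description_length`, the helper names, and the restriction to backbones made of the heaviest edges are all assumptions chosen for illustration. The toy code charges for (i) which edges are in the backbone, (ii) how the backbone weight is split among backbone edges, and (iii) how the remaining weight is split among the other edges, assuming positive integer weights. A sort followed by a single scan over prefix sums gives a log-linear runtime in the number of edges, in the spirit of the complexity quoted above.

```python
from math import lgamma, log, inf


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    if k < 0 or k > n:
        return inf
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_compositions(total, parts):
    """Log of the number of ways to write `total` as an ordered sum of
    `parts` positive integers, i.e. log C(total - 1, parts - 1)."""
    if parts == 0:
        return 0.0 if total == 0 else inf
    return log_binom(total - 1, parts - 1)


def toy_description_length(E, W, Eb, Wb):
    """Illustrative two-part code length in nats (an assumption, not the
    paper's objective): send Eb and Wb, then the backbone edge set, then
    the weights inside and outside the backbone."""
    return (
        log(E + 1) + log(W + 1)               # cost of sending Eb and Wb
        + log_binom(E, Eb)                    # which Eb of the E edges are kept
        + log_compositions(Wb, Eb)            # weights of backbone edges
        + log_compositions(W - Wb, E - Eb)    # weights of remaining edges
    )


def toy_global_backbone(edges):
    """Scan candidate backbones made of the k heaviest edges, k = 0..E,
    and return the one with the smallest toy description length.
    `edges` is a list of (u, v, w) tuples with positive integer w."""
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)  # O(E log E)
    E = len(ranked)
    W = sum(w for _, _, w in ranked)
    best_k, best_dl = 0, toy_description_length(E, W, 0, 0)
    Wb = 0
    for k, (_, _, w) in enumerate(ranked, start=1):  # O(E) scan
        Wb += w  # running total weight of the k heaviest edges
        dl = toy_description_length(E, W, k, Wb)
        if dl < best_dl:
            best_k, best_dl = k, dl
    return ranked[:best_k], best_dl


# Hypothetical example: three heavy edges among twenty unit-weight edges.
edges = [("a", "b", 100), ("a", "c", 90), ("b", "c", 80)]
edges += [("a", f"x{i}", 1) for i in range(20)]
backbone, dl = toy_global_backbone(edges)
print(len(backbone), round(dl, 2))
```

In this example the scan retains the three heavy edges: separating them from the unit-weight edges makes both groups cheap to encode, and no threshold or significance level has to be supplied by the user.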