Mutual information and the encoding of contingency tables.

Physical Review E, vol. 110, no. 6-1, 064306; Pub Date: 2024-12-01; DOI: 10.1103/PhysRevE.110.064306
IF 2.4; Region 3 (Physics and Astrophysics); JCR Q2 (Physics, Fluids & Plasmas)
Maximilian Jerdee, Alec Kirkley, M E J Newman

Abstract

Mutual information is commonly used as a measure of similarity between competing labelings of a given set of objects, for example to quantify performance in classification and community detection tasks. As argued recently, however, the mutual information as conventionally defined can return biased results because it neglects the information cost of the so-called contingency table, a crucial component of the similarity calculation. In principle the bias can be rectified by subtracting the appropriate information cost, leading to the modified measure known as the reduced mutual information, but in practice one can only ever compute an upper bound on this information cost, and the value of the reduced mutual information depends crucially on how good a bound is established. In this paper we describe an improved method for encoding contingency tables that gives a substantially better bound in typical use cases and approaches the ideal value in the common case where the labelings are closely similar, as we demonstrate with extensive numerical results.
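
As a point of reference for the abstract above, the following is a minimal Python sketch of the conventionally defined mutual information between two labelings, computed from their contingency table (the table of counts of objects carrying each pair of labels). It illustrates only the standard measure that the abstract describes as potentially biased. It does not implement the reduced mutual information or the improved contingency-table encoding proposed in the paper. The function names `contingency_table` and `mutual_information` are illustrative choices, not taken from the paper.

```python
from collections import Counter
from math import log2


def contingency_table(labels_a, labels_b):
    """Count how many objects carry each (label in A, label in B) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same set of objects")
    return Counter(zip(labels_a, labels_b))


def mutual_information(labels_a, labels_b):
    """Conventional mutual information (in bits per object) of two labelings.

    This is the standard, uncorrected measure: it ignores the information
    cost of transmitting the contingency table itself.
    """
    n = len(labels_a)
    table = contingency_table(labels_a, labels_b)
    if n == 0:
        return 0.0
    row_sums = Counter(labels_a)  # number of objects with each label in A
    col_sums = Counter(labels_b)  # number of objects with each label in B
    # Sum over nonzero cells of (c/n) * log2( n*c / (row_sum * col_sum) ).
    return sum(
        (count / n) * log2(n * count / (row_sums[r] * col_sums[s]))
        for (r, s), count in table.items()
    )
```

For example, `mutual_information([0, 0, 1, 1], [0, 0, 1, 1])` gives 1 bit per object (identical labelings with two equal groups), while `mutual_information([0, 0, 1, 1], [0, 1, 0, 1])` gives 0 (the two labelings are unrelated).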

Source journal
Physical Review E
Categories: Physics, Fluids & Plasmas; Physics, Mathematical
CiteScore: 4.50
Self-citation rate: 16.70%
Articles published: 2110
About the journal: Physical Review E (PRE), broad and interdisciplinary in scope, focuses on collective phenomena of many-body systems, with statistical physics and nonlinear dynamics as the central themes of the journal. Physical Review E publishes recent developments in biological and soft matter physics including granular materials, colloids, complex fluids, liquid crystals, and polymers. The journal covers fluid dynamics and plasma physics and includes sections on computational and interdisciplinary physics, for example, complex networks.