On the mathematical quantification of inequality in probability distributions

Journal of Physics Communications (IF 1.5, Q3, Physics, Multidisciplinary) · Pub Date: 2024-08-14 · DOI: 10.1088/2399-6528/ad6ad1

R Rajaram, N Ritchey, B Castellani



Abstract

A fundamental challenge in the study of probability distributions is quantifying the inequality inherent in them: some parts of a distribution are more probable than others. We quantify this inequality through the lens of mathematical diversity, which is a new approach to studying inequality. We offer a theoretical advance, based on case-based entropy and the slope of diversity, that addresses inequality for arbitrary probability distributions.

Our approach is useful in three important ways: (1) it offers a universal way to measure inequality in arbitrary probability distributions, based purely on the entropic uncertainty inherent in them; (2) it allows us to compare the degree of inequality of arbitrary parts of any distribution (not just tails) as well as of entire distributions; and (3) it can extract empirical rules similar to the 80/20 rule, not just for the power law but for any given distribution or its parts. The techniques in this paper thus provide a more general machinery to quantify inequality, to compare the degree of inequality of parts or the whole of general distributions, and to prove or extract empirical rules for general distributions based on mathematical diversity.

We demonstrate the utility of this machinery by applying it to the power law, exponential, and geometric distributions. The 60-40 rule of restricted diversity states that 60 percent or more of cases following a power law (or, more generally, a right-skewed distribution) reside within 40 percent or less of the lower bound of Shannon equivalent equi-probable (SEE) types, as measured by case-based entropy. In this paper, we prove the 60-40 rule for power law distributions analytically. We also show that in all power law distributions, the second half of the distribution is at least 4 times more uniformly distributed than the first.

Lastly, we show a scale-free way of comparing probability distributions based on the mathematical diversity of parts of a distribution. We use this technique to compare the exponential and power law distributions, and obtain the exponential distribution as an entropic limit of the power law distribution. We also demonstrate that the machinery applies to discrete distributions by proving a general result on the comparison of parts of the geometric distribution.
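As a rough illustration of the quantities the abstract refers to, the sketch below computes the number of Shannon equivalent equi-probable (SEE) types of a discrete distribution as the exponential of its Shannon entropy, and the same quantity for a renormalized part of the distribution. This is a minimal sketch under stated assumptions, not the paper's own implementation: the function names, the choice of renormalizing a part to total mass 1, and the truncated geometric example are all hypothetical choices made here for illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def see_types(probs):
    """Number of Shannon equivalent equi-probable types, exp(H)."""
    return math.exp(shannon_entropy(probs))


def restricted_diversity(probs, start, stop):
    """exp(H) of the part probs[start:stop], renormalized to sum to 1.

    Assumption: a 'part' of the distribution is treated as the
    conditional distribution on that index range.
    """
    part = probs[start:stop]
    mass = sum(part)
    return see_types([p / mass for p in part])


# A uniform distribution over 4 outcomes has approximately 4 SEE types.
d_uniform = see_types([0.25] * 4)

# Hypothetical example: geometric distribution with ratio q,
# truncated to 40 terms and renormalized.
q = 0.5
raw = [(1 - q) * q**k for k in range(40)]
total = sum(raw)
geometric = [p / total for p in raw]

d_geometric = see_types(geometric)
d_first_half = restricted_diversity(geometric, 0, 20)
d_second_half = restricted_diversity(geometric, 20, 40)
```

Comparing `d_first_half` and `d_second_half` is one simple way to ask which part of a distribution is closer to uniform; how the paper defines "halves" and the slope of diversity may differ from the index-based split assumed here.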
