On the mathematical quantification of inequality in probability distributions
R Rajaram, N Ritchey, B Castellani
Journal of Physics Communications, published 2024-08-14
DOI: 10.1088/2399-6528/ad6ad1 (https://doi.org/10.1088/2399-6528/ad6ad1)
Citation count: 0
Abstract
A fundamental challenge in the study of probability distributions is quantifying the inequality inherent in them: some parts of a distribution are more probable than others. We quantify this inequality through the lens of mathematical diversity, a new approach to studying inequality. We offer a theoretical advance, based on case-based entropy and slope of diversity, that addresses inequality for arbitrary probability distributions. Our approach is useful in three ways: (1) it offers a universal way to measure inequality in arbitrary probability distributions based purely on the entropic uncertainty inherent in them; (2) it allows us to compare the degree of inequality of arbitrary parts of any distribution (not just tails) as well as entire distributions; and (3) it can glean empirical rules similar to the 80/20 rule, not just for the power law but for any given distribution or parts thereof. The techniques in this paper provide a more general machinery to quantify inequality, to compare the degree of inequality of parts or the whole of general distributions, and to prove or glean empirical rules for general distributions based on mathematical diversity. We demonstrate the utility of this machinery by applying it to the power law, exponential, and geometric distributions. The 60-40 rule of restricted diversity states that 60 percent or more of cases following a power law (or, more generally, a right-skewed distribution) reside within 40 percent or less of the lower bound of Shannon equivalent equi-probable (SEE) types, as measured by case-based entropy. In this paper, we prove the 60-40 rule for power law distributions analytically. We also show that in all power law distributions, the second half of the distribution is at least 4 times more uniformly distributed than the first. Lastly, we show a scale-free way of comparing probability distributions based on the mathematical diversity of parts of a distribution. We use this comparison technique to compare the exponential and power law distributions, and obtain the exponential distribution as an entropic limit of the power law distribution. We also demonstrate that the machinery applies to discrete distributions by proving a general result on the comparison of parts of the geometric distribution.
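
To make the idea of Shannon equivalent equi-probable (SEE) types concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the number of SEE types of a discrete distribution is the exponential of its Shannon entropy, and that the diversity of a part of a distribution is obtained by renormalizing that part to sum to 1 before computing the same quantity. The function names (`see_types`, `part_diversity`, `truncated_geometric`) and the truncation of the geometric distribution to 20 terms are illustrative choices, not taken from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def see_types(probs):
    """Assumed definition: SEE types = exp(Shannon entropy)."""
    return math.exp(shannon_entropy(probs))


def part_diversity(probs, start, stop):
    """SEE types of probs[start:stop] after renormalizing the part to sum to 1.

    Renormalizing the part is an assumption made for this sketch.
    """
    part = probs[start:stop]
    mass = sum(part)
    return see_types([p / mass for p in part])


def truncated_geometric(q, n):
    """First n terms of a geometric distribution (success probability q), renormalized."""
    raw = [q * (1 - q) ** k for k in range(n)]
    total = sum(raw)
    return [p / total for p in raw]


# Four equally likely outcomes behave like (approximately) four SEE types.
uniform = [0.25] * 4
print(see_types(uniform))

# Compare the first and second halves of a truncated geometric distribution.
geo = truncated_geometric(0.5, 20)
print(see_types(geo))
print(part_diversity(geo, 0, 10), part_diversity(geo, 10, 20))
```

Under these assumptions, the two halves of the truncated geometric distribution report the same diversity, because equal-length blocks of a geometric sequence renormalize to the same distribution. This is an observation about the sketch only; the paper's own definitions of case-based entropy and slope of diversity should be consulted for the precise results.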