Maximal point-polyserial correlation for non-normal random distributions
Alessandro Barbiero
British Journal of Mathematical & Statistical Psychology (Journal Article), published 2024-10-22
DOI: https://doi.org/10.1111/bmsp.12362
Impact factor: 1.5; JCR: Q3 (Mathematics, Interdisciplinary Applications)
Citation count: 0
Abstract
We consider the problem of determining the maximum value of the point-polyserial correlation between a random variable with an assigned continuous distribution and an ordinal random variable with $k$ categories, which are assigned the first $k$ natural values $1,2,\dots,k$ and arbitrary probabilities $p_i$. For different parametric distributions, we derive a closed-form formula for the maximal point-polyserial correlation as a function of the $p_i$ and of the distribution's parameters; we devise an algorithm for obtaining its maximum value numerically for any given $k$. These maximum values and the features of the corresponding $k$-point discrete random variables are discussed with respect to the underlying continuous distribution. Furthermore, we prove that if we do not assign the values of the ordinal random variable a priori but instead include them in the optimization problem, this latter approach is equivalent to the optimal quantization problem. In some circumstances, it leads to a significant increase in the maximum value of the point-polyserial correlation. An application to real data exemplifies the main findings. A comparison between the discretization leading to the maximum point-polyserial correlation and those obtained from optimal quantization and moment matching is sketched.
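As a rough illustration of the quantity discussed in the abstract, the sketch below evaluates the largest attainable point-polyserial correlation for fixed probabilities $p_i$ in the standard normal reference case, not for the non-normal distributions the paper treats. It assumes the maximum is reached by the comonotone coupling, in which the ordered categories correspond to consecutive quantile slices of the continuous variable; with a standard normal variable the covariance then reduces to a sum of normal densities at the cut points. The function name `max_pp_corr_normal` is hypothetical and is not taken from the paper.

```python
import math
from statistics import NormalDist


def max_pp_corr_normal(probs):
    """Largest correlation between X ~ N(0, 1) and an ordinal Y taking the
    values 1..k with P(Y = i) = probs[i - 1].

    Assumption (not from the paper): the bound is attained by the comonotone
    coupling, i.e. Y = i exactly when X lies between the quantiles of the
    cumulative probabilities P_{i-1} and P_i.
    """
    if any(p <= 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be positive and sum to 1")

    nd = NormalDist()  # standard normal: mean 0, standard deviation 1

    # Under the comonotone coupling, Cov(X, Y) equals the sum of the normal
    # density evaluated at the k - 1 interior cut points.
    cumulative = 0.0
    covariance = 0.0
    for p in probs[:-1]:
        cumulative += p
        covariance += nd.pdf(nd.inv_cdf(cumulative))

    # Mean and variance of Y with scores 1, 2, ..., k.
    mean_y = sum(i * p for i, p in enumerate(probs, start=1))
    var_y = sum(i * i * p for i, p in enumerate(probs, start=1)) - mean_y ** 2

    # The standard deviation of X is 1, so only sd(Y) appears below.
    return covariance / math.sqrt(var_y)
```

For two equiprobable categories, `max_pp_corr_normal([0.5, 0.5])` evaluates to sqrt(2/pi), roughly 0.798, so even the most favourable coupling stays well below 1.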
About the journal:
The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology which have a greater mathematical or statistical aspect of their argument than is usually acceptable to other journals including:
• mathematical psychology
• statistics
• psychometrics
• decision making
• psychophysics
• classification
• relevant areas of mathematics, computing and computer software
These include articles that address substantive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models and improved algorithms for estimating the parameters of a model are examples of articles which may be favoured.