Statistical Papers: Latest Articles

A scale-invariant test for linear hypothesis of means in high dimensions
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-02-29 DOI: 10.1007/s00362-024-01530-8

Abstract

In this paper, we propose a new scale-invariant test for linear hypotheses on mean vectors with heteroscedasticity in high-dimensional settings. Most existing tests impose strong conditions on the covariance matrices so that the null distributions of their test statistics are asymptotically normal, which restricts the application of these procedures. In contrast, our proposed test has different null distributions under mild conditions. Additionally, the well-known Welch-Satterthwaite chi-square approximation that we adopt can automatically mimic the shapes of the null distributions of the test statistic. The finite-sample performance of the test is illustrated by simulations and real data, which show that it is robust and more powerful than three competing tests.
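
The Welch-Satterthwaite idea mentioned in this abstract can be sketched in a few lines. The abstract does not give the test statistic, so the sketch below only assumes a generic statistic distributed as a weighted sum of independent chi-square(1) variables with known positive weights; the function name `welch_satterthwaite` and the weights are hypothetical. It matches the first two moments with a scaled chi-square variable.

```python
import numpy as np

def welch_satterthwaite(weights):
    """Two-moment match of sum_i w_i * chi2_1 by beta * chi2_d.

    Assumption: the weights are known and positive. In practice they
    would have to be estimated, which this sketch does not cover.
    """
    w = np.asarray(weights, dtype=float)
    s1 = w.sum()            # mean of the weighted sum
    s2 = (w ** 2).sum()     # half the variance of the weighted sum
    beta = s2 / s1          # scale of the approximating chi-square
    d = s1 ** 2 / s2        # approximate degrees of freedom
    return beta, d

beta, d = welch_satterthwaite([3.0, 1.0, 0.5, 0.5])
print(beta, d)
```

When all weights are equal the approximation is exact: the degrees of freedom equal the number of terms. The more the weights are dominated by a few large values, the smaller `d` becomes and the more skewed the approximating distribution is.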

A unified approach to goodness-of-fit testing for spherical and hyperspherical data
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-02-26 DOI: 10.1007/s00362-024-01529-1
Bruno Ebner, Norbert Henze, Simos Meintanis

We propose a general and relatively simple method to construct goodness-of-fit tests on the sphere and the hypersphere. The method is based on the characterization of probability distributions via their characteristic function, and it leads to test criteria that are convenient regarding applications and consistent against arbitrary deviations from the model under test. We emphasize goodness-of-fit tests for spherical distributions due to their importance in applications and the relative scarcity of available methods.

Is Fisher inference inferior to Neyman inference for policy analysis?
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-02-20 DOI: 10.1007/s00362-024-01528-2
Rauf Ahmad, Per Johansson, Mårten Schultzberg

The increasing computational power has led to an increasing interest in Fisher’s test in social science. As the Fisher and Neyman inference are based on different principles there is also an increasing interest in understanding the differential features of the two procedures. For example, Young (2018) found that the Fisher test has better size properties than the Neyman test in the situation with influential observations. Ding (2017), on the other hand, showed that the asymptotic variance of the mean-difference estimator (MDE) under Fisher inference is larger than that under Neyman inference, and that the asymptotic Fisher test is less powerful than the t-test even for the simplest case of homogeneous effect. Since MDE plays an important role for policy evaluation, these latter results are a concern for using Fisher’s test as argued in Young (2018). With the aim of providing an understanding of the usefulness of the exact Fisher test for inference to the sample and to the population, this paper clarifies the results in Ding (2017). Using a novel Monte Carlo simulation following the same data generating processes as in Ding (2017), we demonstrate that the Fisher test has no worse power properties than the t-test even with heterogeneous effects.
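
For readers unfamiliar with the Fisher test discussed here, a minimal Monte Carlo sketch follows. It is not the simulation design of the paper or of Ding (2017). It only assumes a completely randomized two-group experiment and uses the absolute mean difference as the test statistic; the function name and defaults are hypothetical.

```python
import numpy as np

def fisher_randomization_pvalue(y, treated, n_draws=5000, seed=0):
    """Monte Carlo p-value for the sharp null of no effect for any unit.

    Assumption: treatment was assigned by complete randomization, so
    every permutation of the assignment vector is equally likely.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    observed = abs(y[treated].mean() - y[~treated].mean())
    hits = 0
    for _ in range(n_draws):
        perm = rng.permutation(treated)   # re-draw the assignment
        stat = abs(y[perm].mean() - y[~perm].mean())
        hits += int(stat >= observed)
    # The +1 terms count the observed assignment itself.
    return (hits + 1) / (n_draws + 1)

y = np.concatenate([np.arange(10) + 100.0, np.arange(10.0)])
treated = np.array([True] * 10 + [False] * 10)
print(fisher_randomization_pvalue(y, treated))
```

The Neyman-style alternative would compare the same mean difference with its estimated standard error and refer to a t or normal distribution. The sketch leaves that part out.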

The effect of correlated errors on the performance of local linear estimation of regression function based on random functional design
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-02-14 DOI: 10.1007/s00362-023-01523-z
Karim Benhenni, Ali Hajj Hassan, Yingcai Su

This article considers the problem of nonparametric estimation of the regression function \(r\) in a functional regression model \(Y = r(X) + \varepsilon\) with a scalar response Y, a functional explanatory variable X, and a second order stationary error process \(\varepsilon\). Under some specific criteria, we construct a local linear kernel estimator of \(r\) from functional random design with correlated errors. The exact rates of convergence of mean squared error of the constructed estimator are established for both short and long range dependent error processes. Simulation studies are conducted on the performance of the proposed simple local linear estimator. Examples of time series data are considered.

Strong consistency of tail value-at-risk estimator and corresponding general results under widely orthant dependent samples
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-17 DOI: 10.1007/s00362-023-01525-x
Jinyu Zhou, Jigao Yan, Dongya Cheng

In this paper, strong consistency of tail value-at-risk (TVaR) estimator under widely orthant dependent (WOD) samples is established, and a numerical simulation is performed to verify the validity of the theoretical results. To reveal the essence of the result, theoretical discussion on complete and complete moment convergence corresponding to the Baum–Katz law, as well as the Marcinkiewicz–Zygmund type strong law of large numbers (MZSLLN) for maximal weighted sums and maximal product sums of widely orthant dependent (WOD) random variables are investigated. The results obtained in the context extend the corresponding ones for independent and some dependent random variables.
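
As a point of reference for the quantity being estimated, here is a small sketch of one common plug-in TVaR estimate: the average of the losses at or above the empirical value-at-risk. The abstract does not state the exact form of the estimator studied, so this form, the function name, and the default level are assumptions.

```python
import numpy as np

def empirical_tvar(losses, level=0.95):
    """Plug-in estimates of VaR and TVaR at the given level.

    Assumption: larger values are worse (losses), and TVaR is taken as
    the mean of the observations at or above the empirical quantile.
    """
    x = np.asarray(losses, dtype=float)
    var = np.quantile(x, level)     # empirical value-at-risk
    tail = x[x >= var]              # observations in the upper tail
    return var, tail.mean()

var, tvar = empirical_tvar(np.arange(1, 101), level=0.95)
print(var, tvar)
```

By construction the TVaR estimate is never below the VaR estimate, because it averages only observations that are at least as large as the quantile.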

Subgroup analysis with concave pairwise fusion penalty for ordinal response
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-13 DOI: 10.1007/s00362-023-01526-w
Weirong Li, Wensheng Zhu

The growing popularity of data heterogeneity motivates people to identify homogeneous subgroups with identical parameters. Meanwhile, in many fields of recent data science for some applications, such as personalized education and personalized marketing, the massive data are usually recorded as categorical or ordinal variables, which highlights the importance of performing subgroup analysis on those ordinal outcomes. In this paper, we propose a cumulative link model with subject-specific intercepts to detect and identify homogeneous subgroups through concave pairwise fusion penalty for ordinal response, where heterogeneity arises from some unknown or unobserved latent factors. The concave fusion method can simultaneously determine the number of subgroups, identify the group membership, and estimate the regression coefficients. An alternating direction method of multipliers algorithm with concave penalties for the generalized linear regression model with logit link is developed and its convergence property is studied. We also establish the oracle property of the proposed penalized estimator under some mild conditions. Our simulation studies show that the proposed method could recover the heterogeneous subgroup structure effectively when the response of interest is ordinal. Further, the advantages of our method are illustrated by the analysis on a Mathematics Student Performance Data Set of two public schools from the Alentejo region of Portugal.

Some additional remarks on statistical properties of Cohen’s d in the presence of covariates
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-12 DOI: 10.1007/s00362-023-01527-9
Jürgen Groß, Annette Möller

The size of the effect of the difference in two groups with respect to a variable of interest may be estimated by the classical Cohen’s d. A recently proposed generalized estimator allows conditioning on further independent variables within the framework of a linear regression model. In this note, it is demonstrated how unbiased estimation of the effect size parameter together with a corresponding standard error may be obtained based on the non-central t distribution. The portrayed estimator may be considered as a natural generalization of the unbiased Hedges’ g. In addition, confidence interval estimation for the unknown parameter is demonstrated by applying the so-called inversion confidence interval principle. The regarded properties collapse to already known ones in case of absence of any additional independent variables. The stated remarks are illustrated with a publicly available data set.
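
The classical case without covariates, to which the abstract says the results collapse, can be sketched as follows. This is not the generalized regression-based estimator of the paper. It only shows the textbook Cohen's d with a pooled standard deviation and the exact Hedges correction factor, which follows from the mean of the non-central t distribution; the function names are hypothetical.

```python
import math
import numpy as np

def cohens_d(x, y):
    """Classical Cohen's d with the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / math.sqrt(pooled_var)

def hedges_correction(df):
    """Exact factor J(df) = Gamma(df/2) / (sqrt(df/2) * Gamma((df-1)/2))."""
    log_ratio = math.lgamma(df / 2) - math.lgamma((df - 1) / 2)
    return math.exp(log_ratio) / math.sqrt(df / 2)

def hedges_g(x, y):
    """Unbiased effect size: Cohen's d shrunk by the correction factor."""
    df = len(x) + len(y) - 2
    return hedges_correction(df) * cohens_d(x, y)

print(cohens_d([2, 4, 6], [1, 3, 5]), hedges_g([2, 4, 6], [1, 3, 5]))
```

The correction factor is below 1 and approaches 1 as the degrees of freedom grow, so the adjustment matters mainly in small samples. A widely used approximation is 1 - 3 / (4 df - 1).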

Deficiency bounds for the multivariate inverse hypergeometric distribution
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-09 DOI: 10.1007/s00362-023-01524-y
Frédéric Ouimet

The multivariate inverse hypergeometric (MIH) distribution is an extension of the negative multinomial (NM) model that accounts for sampling without replacement in a finite population. Even though most studies on longitudinal count data with a specific number of ‘failures’ occur in a finite setting, the NM model is typically chosen over the more accurate MIH model. This raises the question: How much information is lost when inferring with the approximate NM model instead of the true MIH model? The loss is quantified by a measure called deficiency in statistics. In this paper, asymptotic bounds for the deficiencies between MIH and NM experiments are derived, as well as between MIH and the corresponding multivariate normal experiments with the same mean-covariance structure. The findings are supported by a local approximation for the log-ratio of the MIH and NM probability mass functions, and by Hellinger distance bounds.

Improved Breitung and Roling estimator for mixed-frequency models with application to forecasting inflation rates
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-04 DOI: 10.1007/s00362-023-01520-2

Abstract

Instead of applying the commonly used parametric Almon or Beta lag distribution of MIDAS, Breitung and Roling (J Forecast 34:588–603, 2015) suggested a nonparametric smoothed least-squares shrinkage estimator (henceforth \({SLS}_{1}\)) for estimating mixed-frequency models. This \({SLS}_{1}\) approach ensures a flexible smooth trending lag distribution. However, even if the biasing parameter in \({SLS}_{1}\) solves the overparameterization problem, the cost is a decreased goodness-of-fit. Therefore, we suggest a modification of this shrinkage regression into a two-parameter smoothed least-squares estimator (\({SLS}_{2}\)). This estimator solves the overparameterization problem, and it has superior properties since it ensures that the orthogonality assumption between residuals and the predicted dependent variable holds, which leads to an increased goodness-of-fit. Our theoretical comparisons, supported by simulations, demonstrate that the increase in goodness-of-fit of the proposed two-parameter estimator also leads to a decrease in the mean square error of \({SLS}_{2}\) compared to that of \({SLS}_{1}\). Empirical results, where the inflation rate is forecasted based on the oil returns, demonstrate that our proposed \({SLS}_{2}\) estimator for mixed-frequency models provides better estimates in terms of decreased MSE and improved \(R^{2}\), which in turn leads to better forecasts.

Optimal dichotomization of bimodal Gaussian mixtures
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-01-02 DOI: 10.1007/s00362-023-01521-1
Yan-ni Jhan, Wan-cen Li, Shin-hui Ruan, Jia-jyun Sie, Iebin Lian

Despite criticism for loss of information and power, dichotomization of variables is still frequently used in social, behavioral, and medical sciences, mainly because it yields more interpretable conclusions for research outcomes and is useful for decision making. However, the artificial choice of cut-points can be controversial and needs proper justification. In this work, we investigate the properties of point-biserial correlation after dichotomization with underlying bimodal Gaussian mixture distributions. We propose a dichotomous grouping procedure that considers the largest standardized difference in group mean while minimizing information loss.
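
To make the point-biserial correlation concrete, the sketch below computes it between a continuous variable and its own dichotomized version, then scans candidate cut-points for the one that maximizes it. The abstract does not specify the selection criterion of the proposed grouping procedure, so the maximization rule and both function names are illustrative assumptions only.

```python
import numpy as np

def point_biserial(x, cut):
    """Correlation between x and the indicator of x > cut.

    Assumption: the cut leaves at least one observation on each side,
    otherwise the indicator is constant and the correlation is undefined.
    """
    x = np.asarray(x, dtype=float)
    group = (x > cut).astype(float)
    return np.corrcoef(x, group)[0, 1]

def best_cut(x):
    """Scan midpoints between distinct sorted values; return the best cut."""
    u = np.unique(np.asarray(x, dtype=float))
    cuts = (u[:-1] + u[1:]) / 2
    scores = np.array([point_biserial(x, c) for c in cuts])
    return cuts[np.argmax(scores)]

x = [0, 1, 2, 10, 11, 12]
print(best_cut(x), point_biserial(x, best_cut(x)))
```

For clearly separated clusters the scan picks a cut in the gap between them, where dichotomizing discards the least information about the ordering of the data.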
