Pub Date: 2024-03-06 DOI: 10.1007/s00362-023-01491-4
Fengying Li, Yuqiang Li, Xianyi Wu
Reinforcement learning policy evaluation problems are often modeled as finite-horizon or discounted/averaged infinite-horizon Markov Decision Processes (MDPs). In this paper, we study undiscounted off-policy evaluation for absorbing MDPs. Given a dataset consisting of i.i.d. episodes under a given truncation level, we propose an algorithm (referred to as MWLA in the text) to directly estimate the expected return via the importance ratio of the state-action occupancy measure. A Mean Square Error (MSE) bound for the MWLA method is provided, and the dependence of the statistical errors on the data size and the truncation level is analyzed. The performance of the algorithm is illustrated by means of computational experiments in an episodic taxi environment.
Title: Minimax weight learning for absorbing MDPs
Journal: Statistical Papers
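The estimation step described in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' MWLA algorithm: it assumes the importance ratio of the state-action occupancy measures is already available as a function (the hypothetical `weight` argument), whereas the paper learns that ratio. The names `weighted_return_estimate`, `weight`, and `truncation` are assumptions made for this sketch.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

# One step of an episode: (state, action, reward).
Step = Tuple[Hashable, Hashable, float]


def weighted_return_estimate(
    episodes: Sequence[List[Step]],
    weight: Callable[[Hashable, Hashable], float],
    truncation: int,
) -> float:
    """Average, over episodes, of the ratio-weighted reward sums.

    `weight(s, a)` is assumed to approximate the ratio of the target
    policy's occupancy measure to the behavior policy's at (s, a).
    Each episode is cut off after `truncation` steps.
    """
    if not episodes:
        raise ValueError("at least one episode is required")
    total = 0.0
    for episode in episodes:
        for state, action, reward in episode[:truncation]:
            total += weight(state, action) * reward
    return total / len(episodes)


# Toy data: two short episodes and a hypothetical ratio table.
ratio_table = {("s0", "a"): 2.0, ("s1", "b"): 0.5}
toy_episodes = [
    [("s0", "a", 1.0), ("s1", "b", 2.0)],
    [("s0", "a", 1.0)],
]
estimate = weighted_return_estimate(
    toy_episodes, lambda s, a: ratio_table[(s, a)], truncation=10
)
```

The truncation argument makes explicit the trade-off the abstract mentions: a smaller truncation level discards late rewards, while a larger one requires longer recorded episodes.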
Pub Date: 2024-03-04 DOI: 10.1007/s00362-024-01532-6
Shuyi Liang, Kai-Tai Fang, Xin-Wei Huang, Yijing Xin, Chang-Xing Ma
In clinical trials that study paired parts of a subject with binary outcomes, measurements are expected to be collected bilaterally. However, there are cases where subjects contribute measurements for only one part. Using the combined data can provide additional information compared with using bilateral or unilateral data alone. With the combined data, this article investigates homogeneity tests of risk differences in the presence of stratification effects and proposes interval estimates of a common risk difference when stratification does not introduce underlying dissimilarities. Under Dallal’s model (Biometrics 44:253–257, 1988), we propose three test statistics and evaluate their performance in terms of type I error control and power. Confidence intervals for a common risk difference with satisfactory coverage probabilities and interval lengths are constructed. Our simulation results show that the score test is the most robust and that the profile likelihood confidence interval outperforms the other proposed methods. Data from a study of acute otitis media are used to illustrate the proposed procedures.
Title: Homogeneity tests and interval estimations of risk differences for stratified bilateral and unilateral correlated data
Journal: Statistical Papers
Pub Date: 2024-03-04 DOI: 10.1007/s00362-024-01531-7
David Curtis
It has previously been pointed out that Student’s t test, which assumes that samples are drawn from populations with equal standard deviations, can have an inflated Type I error rate if this assumption is violated. Hence it has been recommended that Welch’s t test should be preferred. In the context of carrying out gene-wise weighted burden tests for detecting association of rare variants with psoriasis we observe that Welch’s test performs unsatisfactorily. We show that if the assumption of normality is violated and observations follow a Poisson distribution, then with unequal sample sizes Welch’s t test has an inflated Type I error rate, is systematically biased and is prone to produce extremely low p values. We argue that such data can arise in a variety of real world situations and believe that researchers should be aware of this issue. Student’s t test performs much better in this scenario but a likelihood ratio test based on logistic regression models performs better still and we suggest that this might generally be a preferable method to test for a difference in distributions between two samples.
This research has been conducted using the UK Biobank Resource.
Title: Welch’s t test is more sensitive to real world violations of distributional assumptions than student’s t test but logistic regression is more robust than either
Journal: Statistical Papers
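The two statistics compared in this abstract differ only in how the standard error and the degrees of freedom are computed. The sketch below uses the textbook formulas with the Python standard library; the function names and the toy data are assumptions for illustration, and it reproduces neither the paper's simulations nor its logistic-regression likelihood ratio test.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Student's two-sample t statistic (pooled variance) and its df."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def welch_t(x, y):
    """Welch's t statistic and the Welch-Satterthwaite df."""
    n1, n2 = len(x), len(y)
    a = variance(x) / n1  # squared standard error of the first mean
    b = variance(y) / n2  # squared standard error of the second mean
    t = (mean(x) - mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Toy samples with unequal sizes and unequal sample variances.
small_group = [1, 2, 3, 4, 5]
large_group = [2, 4, 6, 8, 10, 12]
t_student, df_student = student_t(small_group, large_group)
t_welch, df_welch = welch_t(small_group, large_group)
```

Because Welch's standard error weights each sample variance by its own sample size, a small group with a noisy variance estimate can dominate the statistic, which is the kind of situation the abstract examines for Poisson-distributed counts.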
Pub Date: 2024-02-29 DOI: 10.1007/s00362-024-01530-8
Abstract
In this paper, we propose a new scale-invariant test for linear hypotheses about mean vectors under heteroscedasticity in high-dimensional settings. Most existing tests impose strong conditions on the covariance matrices so that their null distributions are asymptotically normal, which restricts their applicability. In contrast, our proposed test admits different null distributions under mild conditions. Additionally, the well-known Welch-Satterthwaite chi-square approximation that we adopt can automatically mimic the shapes of the null distributions of the test statistic. Simulations and a real-data analysis illustrate the finite-sample performance of the test and show that it is robust and more powerful than three competitors.
Title: A scale-invariant test for linear hypothesis of means in high dimensions
Journal: Statistical Papers
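As a generic illustration of the Welch-Satterthwaite idea mentioned in this abstract (not the paper's specific derivation), the null distribution of a nonnegative statistic $T$ is approximated by a scaled chi-square variable $\beta\chi^2_d$ whose first two moments match those of $T$. The symbols $T$, $\beta$, and $d$ are notation assumed for this sketch.

```latex
\begin{aligned}
\mathrm{E}\left(\beta\chi^2_d\right) &= \beta d = \mathrm{E}(T), \\
\mathrm{Var}\left(\beta\chi^2_d\right) &= 2\beta^2 d = \mathrm{Var}(T), \\
\Longrightarrow\quad
\beta &= \frac{\mathrm{Var}(T)}{2\,\mathrm{E}(T)}, \qquad
d = \frac{2\,\mathrm{E}(T)^2}{\mathrm{Var}(T)}.
\end{aligned}
```

In practice $\mathrm{E}(T)$ and $\mathrm{Var}(T)$ are replaced by estimates, so the approximating distribution adapts its shape to the data rather than being fixed in advance.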
Pub Date: 2024-02-26 DOI: 10.1007/s00362-024-01529-1
Bruno Ebner, Norbert Henze, Simos Meintanis
We propose a general and relatively simple method to construct goodness-of-fit tests on the sphere and the hypersphere. The method is based on the characterization of probability distributions via their characteristic function, and it leads to test criteria that are convenient regarding applications and consistent against arbitrary deviations from the model under test. We emphasize goodness-of-fit tests for spherical distributions due to their importance in applications and the relative scarcity of available methods.
Title: A unified approach to goodness-of-fit testing for spherical and hyperspherical data
Journal: Statistical Papers
Pub Date: 2024-02-20 DOI: 10.1007/s00362-024-01528-2
Rauf Ahmad, Per Johansson, Mårten Schultzberg
Growing computational power has led to increasing interest in Fisher’s test in the social sciences. Because Fisher and Neyman inference rest on different principles, there is also increasing interest in understanding the differences between the two procedures. For example, Young (2018) found that the Fisher test has better size properties than the Neyman test in the presence of influential observations. Ding (2017), on the other hand, showed that the asymptotic variance of the mean-difference estimator (MDE) under Fisher inference is larger than that under Neyman inference, and that the asymptotic Fisher test is less powerful than the t-test even in the simplest case of a homogeneous effect. Since the MDE plays an important role in policy evaluation, these latter results raise concerns about using Fisher’s test as argued for in Young (2018). With the aim of clarifying the usefulness of the exact Fisher test for inference to the sample and to the population, this paper clarifies the results in Ding (2017). Using a novel Monte Carlo simulation that follows the same data generating processes as in Ding (2017), we demonstrate that the Fisher test has no worse power properties than the t-test even with heterogeneous effects.
Title: Is Fisher inference inferior to Neyman inference for policy analysis?
Journal: Statistical Papers
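For readers unfamiliar with the exact Fisher test discussed in this abstract, the sketch below enumerates every possible treatment assignment of a small completely randomized experiment and computes a two-sided p-value for the sharp null of no effect on any unit, using the absolute mean difference as the test statistic. The function name and the toy data are assumptions for illustration; the paper's Monte Carlo design is not reproduced here.

```python
from itertools import combinations
from statistics import mean


def fisher_randomization_pvalue(outcomes, treated_idx):
    """Exact two-sided randomization p-value for the mean difference.

    Under the sharp null, outcomes are fixed and only the assignment
    is random, so every subset of the same size as `treated_idx`
    is an equally likely treated group.
    """
    n = len(outcomes)
    k = len(treated_idx)
    if not 0 < k < n:
        raise ValueError("need at least one treated and one control unit")

    def mean_difference(treated):
        chosen = set(treated)
        treated_mean = mean(outcomes[i] for i in chosen)
        control_mean = mean(outcomes[i] for i in range(n) if i not in chosen)
        return treated_mean - control_mean

    observed = abs(mean_difference(treated_idx))
    extreme = 0
    total = 0
    for assignment in combinations(range(n), k):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean_difference(assignment)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Toy experiment: six units, the last three treated.
toy_outcomes = [1, 2, 3, 10, 11, 12]
p_value = fisher_randomization_pvalue(toy_outcomes, [3, 4, 5])
```

Full enumeration is only feasible for small samples; with larger samples the same p-value is usually approximated by drawing random assignments, which is where the computational power mentioned in the abstract comes in.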
Pub Date: 2024-02-14 DOI: 10.1007/s00362-023-01523-z
Karim Benhenni, Ali Hajj Hassan, Yingcai Su
This article considers the problem of nonparametric estimation of the regression function \(r\) in a functional regression model \(Y = r(X) + \varepsilon\) with a scalar response \(Y\), a functional explanatory variable \(X\), and a second order stationary error process \(\varepsilon\). Under some specific criteria, we construct a local linear kernel estimator of \(r\) from functional random design with correlated errors. The exact rates of convergence of mean squared error of the constructed estimator are established for both short and long range dependent error processes. Simulation studies are conducted on the performance of the proposed simple local linear estimator. Examples of time series data are considered.
Title: The effect of correlated errors on the performance of local linear estimation of regression function based on random functional design
Journal: Statistical Papers
Pub Date: 2024-01-17 DOI: 10.1007/s00362-023-01525-x
Jinyu Zhou, Jigao Yan, Dongya Cheng
In this paper, strong consistency of the tail value-at-risk (TVaR) estimator under widely orthant dependent (WOD) samples is established, and a numerical simulation is performed to verify the validity of the theoretical results. To reveal the essence of the result, we investigate complete convergence and complete moment convergence corresponding to the Baum–Katz law, as well as the Marcinkiewicz–Zygmund type strong law of large numbers (MZSLLN), for maximal weighted sums and maximal product sums of WOD random variables. The results obtained extend the corresponding ones for independent and some dependent random variables.
Title: Strong consistency of tail value-at-risk estimator and corresponding general results under widely orthant dependent samples
Journal: Statistical Papers
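To make the object of study concrete, the following sketch computes one common plug-in form of the TVaR estimator: the average of the observed losses at or above the empirical value-at-risk. This is an assumption made for illustration; the estimator analyzed in the paper may be defined differently, and the function names are hypothetical.

```python
import math


def empirical_var(losses, alpha):
    """Empirical value-at-risk: the smallest order statistic whose
    empirical distribution function value is at least `alpha`."""
    if not losses or not 0.0 < alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 < alpha < 1")
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)), 1)
    return ordered[k - 1]


def empirical_tvar(losses, alpha):
    """Average of the losses at or above the empirical VaR."""
    threshold = empirical_var(losses, alpha)
    tail = [x for x in losses if x >= threshold]
    return sum(tail) / len(tail)


# Toy loss sample.
toy_losses = [3, 1, 4, 10, 5, 9, 2, 6, 8, 7]
var_half = empirical_var(toy_losses, 0.5)
tvar_half = empirical_tvar(toy_losses, 0.5)
```

Strong consistency, as studied in the abstract, concerns whether such a plug-in quantity converges almost surely to the population TVaR when the observations are dependent rather than independent.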
Pub Date: 2024-01-13 DOI: 10.1007/s00362-023-01526-w
Weirong Li, Wensheng Zhu
The growing prevalence of heterogeneous data motivates the identification of homogeneous subgroups that share identical parameters. Meanwhile, in many data science applications, such as personalized education and personalized marketing, massive data are usually recorded as categorical or ordinal variables, which highlights the importance of subgroup analysis for ordinal outcomes. In this paper, we propose a cumulative link model with subject-specific intercepts to detect and identify homogeneous subgroups through a concave pairwise fusion penalty for ordinal responses, where heterogeneity arises from some unknown or unobserved latent factors. The concave fusion method can simultaneously determine the number of subgroups, identify the group membership, and estimate the regression coefficients. An alternating direction method of multipliers algorithm with concave penalties for the generalized linear regression model with logit link is developed, and its convergence property is studied. We also establish the oracle property of the proposed penalized estimator under some mild conditions. Our simulation studies show that the proposed method can recover the heterogeneous subgroup structure effectively when the response of interest is ordinal. Further, the advantages of our method are illustrated by the analysis of a Mathematics Student Performance Data Set from two public schools in the Alentejo region of Portugal.
Title: Subgroup analysis with concave pairwise fusion penalty for ordinal response
Journal: Statistical Papers
Pub Date: 2024-01-12 DOI: 10.1007/s00362-023-01527-9
Jürgen Groß, Annette Möller
The size of the effect of the difference in two groups with respect to a variable of interest may be estimated by the classical Cohen’s d. A recently proposed generalized estimator allows conditioning on further independent variables within the framework of a linear regression model. In this note, it is demonstrated how unbiased estimation of the effect size parameter together with a corresponding standard error may be obtained based on the non-central t distribution. The portrayed estimator may be considered as a natural generalization of the unbiased Hedges’ g. In addition, confidence interval estimation for the unknown parameter is demonstrated by applying the so-called inversion confidence interval principle. The regarded properties collapse to already known ones in case of absence of any additional independent variables. The stated remarks are illustrated with a publicly available data set.
Title: Some additional remarks on statistical properties of Cohen’s d in the presence of covariates
Journal: Statistical Papers
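The classical special case that this note generalizes, two groups with no additional covariates, can be sketched as follows. The code computes Cohen's d with a pooled standard deviation and the exact small-sample correction factor that turns it into the unbiased Hedges' g. The function names are assumptions for illustration, and the covariate-adjusted estimator from the paper is not reproduced.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def hedges_correction(df):
    """Exact factor J(df) = Gamma(df/2) / (sqrt(df/2) * Gamma((df-1)/2)).

    Log-gamma is used so the ratio stays stable for large df.
    """
    if df < 2:
        raise ValueError("degrees of freedom must be at least 2")
    log_ratio = math.lgamma(df / 2) - math.lgamma((df - 1) / 2)
    return math.exp(log_ratio) / math.sqrt(df / 2)


def hedges_g(x, y):
    """Hedges' g: Cohen's d multiplied by the correction factor."""
    return hedges_correction(len(x) + len(y) - 2) * cohens_d(x, y)


# Toy groups of three observations each.
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d_value = cohens_d(group_a, group_b)
g_value = hedges_g(group_a, group_b)
```

The correction factor is below one and approaches one as the degrees of freedom grow, so the two estimators differ noticeably only in small samples.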