Statistica Neerlandica: Latest Articles


Connections between two classes of estimators for single-index models

Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-10-13 DOI: 10.1111/stan.12329

Weichao Yang, Xu Guo, Niwen Zhou, Changliang Zou

The single-index model is a very popular and powerful semiparametric model. As an improvement of the maximum rank correlation estimator, [bib1] proposed the linearized maximum rank correlation estimator. We show that this estimator has some interesting connections with the distribution-transformed least-squares estimator for single-index models. We also propose a rescaled distribution-transformed least-squares estimator, which is mathematically equivalent to the linearized maximum rank correlation estimator when the distribution of the response is absolutely continuous. Despite these nontrivial connections, the two estimation procedures differ in motivation, interpretation, and application, and we discuss some of these differences.
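
As background for the estimators named in this abstract, the following is a minimal sketch of the textbook maximum rank correlation objective: the fraction of ordered pairs in which the response and the linear index x'β are ranked the same way. It is not the linearized estimator or the distribution-transformed least-squares estimator from the paper; the function name and the toy data are assumptions made for illustration.

```python
from itertools import permutations


def rank_correlation_objective(beta, xs, ys):
    """Share of ordered pairs (i, j) with y_i > y_j and x_i'beta > x_j'beta.

    Illustrative helper (hypothetical name): the maximum rank correlation
    estimator maximizes this quantity over beta.
    """
    # Linear index x_i' beta for every observation.
    index = [sum(b * x for b, x in zip(beta, row)) for row in xs]
    n = len(ys)
    concordant = sum(
        1
        for i, j in permutations(range(n), 2)
        if ys[i] > ys[j] and index[i] > index[j]
    )
    return concordant / (n * (n - 1))


# Toy data: the response increases with the first covariate only.
xs = [[1, 0], [2, 1], [3, 0], [4, 2]]
ys = [1, 2, 3, 4]
aligned = rank_correlation_objective((1, 0), xs, ys)
rescaled = rank_correlation_objective((3, 0), xs, ys)
reversed_sign = rank_correlation_objective((-1, 0), xs, ys)
```

Because the objective depends only on the ordering of the index, multiplying β by a positive constant leaves it unchanged, which is why β in a single-index model is identified only up to scale.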

Citations: 0

Testing Conditional Independence in Casual Inference for Time Series Data

Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-09-27 DOI: 10.1111/stan.12323

Zongwu Cai, Ying Fang, Ming Lin, Shengfang Tang

In this paper, we propose a new procedure for testing the conditional independence assumption in causal inference for time series data. The conditional independence assumption is transformed into a nonparametric conditional moment test with the help of auxiliary variables, which are allowed to affect the policy choice provided that this dependence is fully captured by potential outcomes and observable controls. When the policy choice is binary, a further nonparametric test statistic is developed for testing the conditional independence assumption conditional on the policy propensity score. Under some regularity conditions, we show that the proposed test statistics are asymptotically normal under the null hypotheses for time series data. In addition, the performance of the proposed methods is illustrated through Monte Carlo simulations and a real example considered in Angrist and Kuersteiner (2011).

Citations: 0

An Informative Prior distribution on Functions with Application to Functional Regression

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-09-08 DOI: 10.1111/stan.12322

C. Abraham

We provide a prior distribution for a functional parameter so that its trajectories are smooth and vanish on a given subset. This distribution can be interpreted as the distribution of an initial Gaussian process conditioned to be zero on a given subset. Precisely, we show that the initial Gaussian process is the sum of the conditioned process and an independent process with probability one and that all the processes have the same almost sure regularity. This prior distribution is used to provide an interpretable estimate of the coefficient function in the linear scalar-on-function regression; by interpretable, we mean a smooth function that may be zero on some intervals. We apply our model in simulation and real case studies with two different priors for the null region of the coefficient function. In one case, the null region is known to be a single unknown interval. In the other case, it can be any unknown union of intervals.

Citations: 0

Semiparametric Recovery of Central Dimension Reduction Space with Nonignorable Nonresponse

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-09-06 DOI: 10.1111/stan.12321

Siming Zheng, Alan T.K. Wan, Yong Zhou

Sufficient dimension reduction (SDR) methods are effective tools for handling high-dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. When the data are incomplete due to missing values, SDR has only been considered when the data are missing at random, but not when they are non-ignorably missing, which is arguably more difficult to handle because the missingness depends on the missing values themselves. The purpose of this paper is to fill this void. We propose an intuitive, easy-to-implement SDR estimator based on a semiparametric propensity score function for response data with non-ignorable missing values. We refer to it as the dimension reduction-based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance via an extensive numerical study on real and simulated data. We also compare the performance of our proposed dimension reduction-based imputed estimator with two competing estimators, the fusion refined estimator and the cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non-trivial extension of the existing literature, due to the technical challenges posed by non-ignorable missingness. All the technical proofs of the theorems are given in the Online Supplementary Material.

Citations: 0

The Yates, Conover and Mantel statistics in 2×2 tables revisited (and extended)

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-08-31 DOI: 10.1111/stan.12320

A. Martín Andrés, M. Álvarez Hernández, F. Gayá Moreno

Asymptotic inferences about the difference, ratio or odds-ratio of two independent proportions are very common in diverse fields. This article defines eight conditional inference methods for each parameter. These methods depend on: (1) using a chi-squared type statistic or a z-type statistic; (2) using the classic Yates continuity correction or the less well-known Conover one; and (3) whether the p-value of the test is determined by doubling the one-tailed p-value or by the Mantel method (asymmetrical approach). In all cases, the conclusions are: (i) the methods based on the chi-squared statistic should not be used, as they are too liberal; (ii) for those in favour of using the criterion of doubling the p-value, the best method is using the z statistic with Conover continuity correction; and (iii) for those in favour of the asymmetrical approach, the best method is based on the z statistic with Conover continuity correction and the Mantel p-value.
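
For readers unfamiliar with the continuity correction discussed here, the sketch below computes the classical textbook Yates-corrected chi-squared statistic for a 2×2 table with cells a, b (first row) and c, d (second row). It is background only: the abstract concludes that chi-squared-based variants are too liberal, and the z-type statistics, the Conover correction and the Mantel p-value studied in the article are not reproduced. The function name and the example table are assumptions.

```python
import math


def yates_chi2_test(a, b, c, d):
    """Classical Yates-corrected chi-squared test for a 2x2 table.

    Returns (statistic, p_value). Illustrative helper, not the article's
    recommended procedure.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    # Yates correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical table: 12 of 20 successes in one group, 5 of 20 in the other.
chi2, p_value = yates_chi2_test(12, 8, 5, 15)
null_chi2, null_p = yates_chi2_test(5, 5, 5, 5)
```

The upper-tail probability uses the identity between a chi-squared variable with one degree of freedom and the square of a standard normal, so no statistics library is needed.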

Citations: 0

Improved estimation of average treatment effects under covariate-adaptive randomization methods

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-08-30 DOI: 10.1111/stan.12319

Jun Wang, Yahe Yu

Estimation of the average treatment effect is one of the crucial problems in clinical trials with two or more treatments. Covariate-adaptive randomization methods, such as minimization and the stratified permuted blocks method, are often applied in clinical trials to balance treatment assignments across prognostic factors. We propose a model-free estimator of average treatment effects under covariate-adaptive randomization methods, which is a least squares adjustment for the estimator of outcome models. The proposed estimator is applicable to binary treatments and can also be extended to multiple treatments. It is consistent and asymptotically normally distributed. Simulation studies show that the proposed estimator is comparable to Ye's estimator and performs better than Bugni's estimator when the outcome model is linear. When the outcome model is nonlinear, the proposed estimator has some advantages over the targeted maximum likelihood estimator, Bugni's estimator and Ye's estimator in terms of standard error and root mean squared error. The proposed estimator is stable with respect to the form of the outcome model. Finally, we apply the proposed methodology to a data set on the causal effect of the promotional video mode on the educational attainment of school-age children in Peru.
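
To make the stratified permuted blocks design mentioned in this abstract concrete, here is a minimal sketch for two arms. Within each stratum, assignments are drawn from shuffled blocks containing equal numbers of each arm, so the arms stay balanced inside every stratum. The function name, block size, seed and strata labels are assumptions for illustration; the estimator proposed in the paper is not implemented.

```python
import random
from collections import defaultdict


def stratified_permuted_blocks(strata, block_size=4, seed=0):
    """Assign arm 0 or 1 to each unit, in arrival order, by stratum.

    `strata` lists the stratum label of each arriving unit. Illustrative
    helper (hypothetical name) for two arms with equal allocation.
    """
    if block_size <= 0 or block_size % 2:
        raise ValueError("block_size must be a positive even number")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> unused slots of its current block
    assignments = []
    for label in strata:
        if not pending[label]:
            # Open a fresh block: half arm 0, half arm 1, in random order.
            block = [0] * (block_size // 2) + [1] * (block_size // 2)
            rng.shuffle(block)
            pending[label] = block
        assignments.append(pending[label].pop())
    return assignments


strata = ["A"] * 8 + ["B"] * 4
arms = stratified_permuted_blocks(strata)
treated_a = sum(arm for arm, label in zip(arms, strata) if label == "A")
treated_b = sum(arm for arm, label in zip(arms, strata) if label == "B")
```

Whenever a stratum has filled a whole number of blocks, exactly half of its units are in each arm, whatever the seed.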

Citations: 0

Franklin's Randomized Response Model With Correlated Scrambled Variables

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-08-28 DOI: 10.1111/stan.12318

Christopher Aguirre‐Hamilton, Stephen A. Sedory, Sarjinder Singh

We propose two types of estimators that are analogous to Franklin's model. One estimator is derived by concentrating on the row averages of the responses, and another is obtained by concentrating on the column averages of the observed responses. In the latter case we have two responses per respondent from a bivariate normal distribution. The proposed estimator based on row averages, by making use of negatively correlated random numbers from a multivariate density, is always more efficient than the corresponding Franklin's estimator. In the case of the proposed estimator based on column averages, we found that the use of positively correlated random numbers from a bivariate density can lead to the most efficient estimator. We also discuss results obtained by making use of three responses per respondent. When the three responses are recorded, three independent normal densities are derived from three correlated variables. The findings are supported by analytical, numerical and simulation studies. A simulation study was done to determine the minimum sample size required to produce non-negative estimates of the population proportion of a sensitive characteristic, and to investigate the 95% nominal coverage of the interval estimates. Finally, a single best estimator is suggested. Derivations of the theoretical results and a discussion of the numerical and simulation studies are documented in the online supplementary material.

Citations: 0

An efficient automatic clustering algorithm for probability density functions and its applications in surface material classification

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-08-07 DOI: 10.1111/stan.12315

Thao Nguyen-Trang, Tai Vo-Van, Ha Che-Ngoc

Clustering is a technique used to partition a dataset into groups of similar elements. In addition to traditional clustering methods, clustering for probability density functions (CDF) has been studied to capture data uncertainty. In CDF, automatic clustering is a technique that can determine the number of clusters automatically. However, current automatic clustering algorithms update the new probability density function (pdf) $f_i(t)$ based on the weighted mean of all previous pdfs $f_j(t-1)$, $j = 1, 2, \ldots, N$, resulting in slow convergence. This paper proposes an efficient automatic clustering algorithm for pdfs. In the proposed approach, the update of $f_i(t)$ is based on the weighted mean of $\{f_1(t), f_2(t), \ldots, f_{i-1}(t), f_i(t-1), f_{i+1}(t-1), \ldots, f_N(t-1)\}$, where $N$ is the number of pdfs and $i = 1, 2, \ldots, N$. This technique allows for the incorporation of recently updated pdfs, leading to faster convergence. This paper also pioneers the application of certain CDF algorithms in the field of surface image recognition. The numerical examples demonstrate that the proposed method can converge rapidly within the early iterations. It also outperforms other state-of-the-art automatic clustering methods in terms of the Adjusted Rand Index (ARI) and the Normalized Mutual Information (NMI). Additionally, the proposed algorithm proves to be competitive when clustering material images contaminated by noise. These results highlight the applicability of the proposed method to the problem of surface image recognition.
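
The difference between the two update schemes described in this abstract can be sketched as follows, with each pdf stored as a list of values on a common grid. The synchronous sweep builds every new pdf from the previous iterate only, while the sequential sweep reuses pdfs already updated in the current pass. The abstract does not specify the weights, so the Gaussian kernel of the squared L2 distance used below, the bandwidth and all function names are assumptions, not the authors' algorithm.

```python
import math


def squared_l2(f, g):
    """Squared L2 distance between two pdfs sampled on the same grid."""
    return sum((a - b) ** 2 for a, b in zip(f, g))


def weighted_mean(target, pdfs, bandwidth):
    """Weighted mean of `pdfs`, with weights decaying in distance to `target`.

    The Gaussian-kernel weights are an assumed, illustrative choice.
    """
    weights = [math.exp(-squared_l2(target, g) / bandwidth) for g in pdfs]
    total = sum(weights)
    return [
        sum(w * g[k] for w, g in zip(weights, pdfs)) / total
        for k in range(len(target))
    ]


def synchronous_sweep(pdfs, bandwidth):
    """Every f_i(t) is built only from the previous iterates f_j(t-1)."""
    return [weighted_mean(f, pdfs, bandwidth) for f in pdfs]


def sequential_sweep(pdfs, bandwidth):
    """f_i(t) uses f_1(t), ..., f_{i-1}(t) together with f_i(t-1), ..., f_N(t-1)."""
    current = [list(f) for f in pdfs]
    for i in range(len(current)):
        current[i] = weighted_mean(current[i], current, bandwidth)
    return current


# Three discrete pdfs on a three-point grid: two similar, one different.
pdfs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
sync_result = synchronous_sweep(pdfs, bandwidth=0.1)
seq_result = sequential_sweep(pdfs, bandwidth=0.1)
```

The first pdf is updated identically by both sweeps, because nothing has been refreshed yet when it is processed; the two schemes differ only from the second pdf onward.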

Citations: 0

Poisson average maximum likelihood-centred penalized estimator: A new estimator to better address multicollinearity in Poisson regression

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-06-22 DOI: 10.1111/stan.12313

Sheng Li, Wen Wang, Menghan Yao, Junyu Wang, Qianqian Du, Xuelin Li, Xinyue Tian, Jing Zeng, Ying Deng, Zhang Tao, F. Yin, Yue Ma

The Poisson ridge estimator (PRE) is a commonly used parameter estimation method to address multicollinearity in Poisson regression (PR). However, PRE shrinks the parameters toward zero, which can contradict the real association. In such cases, PRE tends to be an insufficient solution for multicollinearity. In this work, we propose a new estimator called the Poisson average maximum likelihood-centered penalized estimator (PAMLPE), which shrinks the parameters toward the weighted average of the maximum likelihood estimators. We conducted a simulation study and a case study to compare PAMLPE with existing estimators in terms of mean squared error (MSE) and predictive mean squared error (PMSE). These results suggest that PAMLPE can obtain smaller MSE and PMSE (i.e., more accurate estimates) than the Poisson ridge estimator, Poisson Liu estimator, and Poisson K-L estimator when the true $\beta$s have the same sign and small variation. Therefore, we recommend using PAMLPE to address multicollinearity in PR when the signs of the true $\beta$s are known in advance to be identical.

Citations: 0

A case study of Gulf Securities Market in the last 20 years: A Long Short-Term Memory approach

IF 1.5; Zone 3, Mathematics; Q2 STATISTICS & PROBABILITY

Statistica Neerlandica

Pub Date : 2023-06-18 DOI: 10.1111/stan.12309

Abhibasu Sen, Karabi Dutta Choudhury

Much research has been conducted on forecasting stock prices. Researchers have used tools ranging from statistical techniques to quantitative methods to forecast the market. So far, however, very little research has been done on forecasting the stock markets of the Gulf countries such as Saudi Arabia, the United Arab Emirates, Oman, Kuwait, Bahrain, and Qatar. Our approach is to predict the market indices of the Gulf countries using Long Short-Term Memory (LSTM) techniques. We then optimized the hyperparameters of the LSTM model using optimization methods such as Grid Search and Bayesian Optimization with Gaussian Process, and identified the best-suited hyperparameters for the model. We applied the LSTM method to predict the indices using data from the last twenty years.

Citations: 0
