Journal of Applied Statistics: Latest Articles

Hierarchical Bayesian models for small area estimation with GB2 distribution.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-03-10; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2475349
Binod Manandhar, Balgobin Nandram

We present predictive hierarchical Bayesian models to fit continuous, positively skewed size data from small areas with the generalized beta of the second kind (GB2) distribution. We discuss three different GB2 mixture models, each of which implements the technique of small area estimation. The posterior distributions of these models are complex, so we use Taylor series approximations, grid sampling and Metropolis samplers to fit the models. We apply the models to the per-capita consumption size data from the second Nepal Living Standards Survey and choose the best-fitting of the three GB2 mixture models. With the best-fitting model, we provide small area estimates of poverty indicators by linking the survey data with the census data. A simulation study is provided.
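
For readers unfamiliar with the GB2 distribution named in the abstract, the following is a minimal sketch of its standard four-parameter density, f(y) = a y^(ap-1) / (b^(ap) B(p, q) (1 + (y/b)^a)^(p+q)) for y > 0. The function names `gb2_logpdf` and `gb2_pdf` are illustrative assumptions; this is not the authors' code and does not reproduce their mixture models or samplers.

```python
import math


def gb2_logpdf(y, a, b, p, q):
    """Log-density of the GB2(a, b, p, q) distribution at y > 0.

    a: shape, b: scale, p and q: shape parameters (all must be positive).
    Hypothetical helper written for illustration only.
    """
    if y <= 0 or min(a, b, p, q) <= 0:
        raise ValueError("y and all parameters must be positive")
    # log of the beta function B(p, q) via log-gamma for numerical stability
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    # z = log((y / b) ** a)
    z = a * (math.log(y) - math.log(b))
    # numerically stable log(1 + exp(z))
    if z > 0:
        log1p_exp = z + math.log1p(math.exp(-z))
    else:
        log1p_exp = math.log1p(math.exp(z))
    return (
        math.log(a)
        + (a * p - 1.0) * math.log(y)
        - a * p * math.log(b)
        - log_beta
        - (p + q) * log1p_exp
    )


def gb2_pdf(y, a, b, p, q):
    """Density of GB2(a, b, p, q) at y > 0."""
    return math.exp(gb2_logpdf(y, a, b, p, q))
```

With a = b = p = q = 1 the density reduces to 1 / (1 + y)^2, which gives a quick sanity check. Working on the log scale is a design choice that suits Metropolis-type samplers, where acceptance ratios are usually computed from differences of log-densities.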

Journal of Applied Statistics 52(13): 2448-2477. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490410/pdf/
Citations: 0
Spatial similarity index for scouting in football.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-03-10; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2473542
V Gómez-Rubio, J Lagos, F Palmí-Perales

Finding players with similar profiles is an important problem in sports such as football (also known as soccer in some countries). Scouting for new players requires a wealth of information about the available players so that similar profiles to that of a target player can be identified. However, information about the position of the players in the field is seldom employed. For this reason, a novel approach based on spatial data analysis is introduced to produce a spatial similarity index that can help to identify similar players. The use of this new spatial similarity index is illustrated with an example from the Spanish competition 'La Liga', season 2019-2020, in which hundreds of players are clustered according to their position in the field.

Journal of Applied Statistics 52(14): 2745-2758. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581730/pdf/
Citations: 0
Bayes factors for two-group comparisons in Cox regression with an application for reverse-engineering raw data from summary statistics.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-03-01; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2472150
Maximilian Linde, Jorge N Tendeiro, Don van Ravenzwaaij

The use of Cox proportional hazards regression to analyze time-to-event data is ubiquitous in biomedical research. Typically, the frequentist framework is used to draw conclusions about whether hazards are different between patients in an experimental and a control condition. We offer a procedure to compute Bayes factors for simple Cox models, both for the scenario where the full data are available and for the scenario where only summary statistics are available. The procedure is implemented in our 'baymedr' R package. The usage of Bayes factors remedies some shortcomings of frequentist inference and has the potential to save scarce resources.

Journal of Applied Statistics 52(13): 2413-2437. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490364/pdf/
Citations: 0
An integrated change point detection and online monitoring approach for the ratio of two variables using clustering-based control charts.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-14; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2455625
Adel Ahmadi Nadi, Ali Yeganeh, Sandile Charles Shongwe, Alireza Shadman

Online monitoring of the ratio of two random characteristics, rather than monitoring their individual behaviors, has many applications. To this end, various control charts, known in the literature as RZ charts (e.g. Shewhart, memory-type and adaptive monitoring schemes), have been designed to detect the ratio's abnormal patterns as soon as possible. Most of the existing RZ charts rely on two assumptions about the process: (i) both individual characteristics are normally distributed, and (ii) the direction (upward or downward) of the RZ's deviation from its in-control (IC) state to an out-of-control (OC) condition is known. However, these assumptions can be violated in many practical situations. In recent years, applying machine learning (ML) models in the Statistical Process Monitoring (SPM) area has made several contributions relative to traditional statistical methods. However, ML-based control charts have not yet been discussed in the RZ monitoring literature. This study therefore introduces a novel clustering-based control chart for monitoring RZ in Phase II. The method makes no assumption about the direction of RZ's deviation and does not need to assume a specific distribution for the two random characteristics. Furthermore, it can estimate the Change Point (CP) in the process.

Journal of Applied Statistics 52(11): 2060-2093. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12404067/pdf/
Citations: 0
BSTPP: a python package for Bayesian spatiotemporal point processes.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-11; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2462969
Isaac Manring, Honglang Wang, George Mohler, Xenia Miscouridou

Spatiotemporal point process models have a rich history of effectively modeling event data in space and time. However, they are sometimes neglected because they are difficult to implement. There is a lack of packages that can perform inference for these models, particularly in Python. We therefore present BSTPP, a Python package for Bayesian inference on spatiotemporal point processes. It offers three kinds of models: space-time separable Log Gaussian Cox, Hawkes, and Cox Hawkes. Users may employ the predefined trigger parameterizations for the Hawkes models, or they may implement their own trigger functions with the extendable Trigger module. For the Cox models, posterior inference on the Gaussian processes is sped up with a pre-trained Variational Auto Encoder (VAE). The package includes a new flexible pre-trained VAE. We validate the model through simulation studies and then explore it by applying it to shooting data in Chicago.
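
To make the idea of a Hawkes trigger concrete, here is a minimal, purely temporal sketch of a conditional intensity with an exponential trigger kernel, lambda(t) = mu + sum over past events of alpha * beta * exp(-beta * (t - t_i)). It deliberately does not use or imitate the BSTPP API: the function name `hawkes_intensity` and the parameters `mu`, `alpha` and `beta` are assumptions chosen for illustration, and the spatial component described in the abstract is omitted.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity of a temporal Hawkes process at time t.

    history: iterable of past event times
    mu:      constant background rate (> 0)
    alpha:   expected number of offspring triggered per event
    beta:    decay rate of the exponential trigger kernel (> 0)
    Hypothetical helper; not part of any published package.
    """
    # each event strictly before t adds an exponentially decaying contribution
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in history
        if t_i < t
    )
    return mu + excitation


if __name__ == "__main__":
    events = [0.0, 0.4, 0.5]
    for t in (0.6, 1.0, 3.0):
        print(t, hawkes_intensity(t, events, mu=0.5, alpha=0.5, beta=1.0))
```

The intensity jumps after each event and relaxes back toward the background rate `mu`, which is the self-exciting behavior that Hawkes models are meant to capture.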

Journal of Applied Statistics 52(13): 2524-2543. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490397/pdf/
Citations: 0
On the improved estimation of the normal mixture components for longitudinal data.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-07; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2459293
Tapio Nummi, Jyrki Möttönen, Pasi Väkeväinen, Janne Salonen, Timothy E O'Brien

When analyzing real data sets, statisticians often face the problem that the data are heterogeneous and that it may not be possible to model this heterogeneity directly. One natural option in this case is to use methods based on finite mixtures. The key question in these techniques is often the best number of mixture components or, depending on the focus of the analysis, the best number of sub-populations when the model is otherwise fixed. Moreover, when the distribution of the response variable deviates from the model assumptions, it is common to apply an appropriate transformation to align the distribution with the model's requirements. To solve this problem in the mixture regression context, we propose a technique based on the scaled Box-Cox transformation for normal mixtures. The specific focus here is on mixture regression for longitudinal data, the so-called trajectory analysis. We present practical results as well as simulation experiments to demonstrate that our method yields reasonable results. Associated R programs are also provided.
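
As an illustration of the transformation mentioned in the abstract, the sketch below implements the classical geometric-mean-scaled Box-Cox transformation, (y^lambda - 1) / (lambda * g^(lambda - 1)) for lambda != 0 and g * log(y) for lambda = 0, where g is the geometric mean of the data. The abstract does not spell out the exact variant the authors use for mixtures, so treating it as this classical form is an assumption, and the function name `scaled_box_cox` is hypothetical.

```python
import math


def scaled_box_cox(ys, lam):
    """Geometric-mean-scaled Box-Cox transform of positive data.

    ys:  sequence of strictly positive observations
    lam: transformation parameter lambda
    Hypothetical helper for illustration; not the authors' R code.
    """
    if len(ys) == 0 or any(y <= 0 for y in ys):
        raise ValueError("ys must be non-empty and strictly positive")
    # geometric mean computed on the log scale to avoid overflow
    gm = math.exp(sum(math.log(y) for y in ys) / len(ys))
    if abs(lam) < 1e-12:
        # limiting case lambda -> 0
        return [gm * math.log(y) for y in ys]
    scale = lam * gm ** (lam - 1.0)
    return [(y ** lam - 1.0) / scale for y in ys]
```

Scaling by the geometric mean keeps the transformed values on a comparable scale across different values of lambda, so likelihoods or residual sums of squares can be compared directly when choosing lambda.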

Journal of Applied Statistics 52(12): 2271-2290. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416014/pdf/
Citations: 0
On use of adaptive cluster sampling for variance estimation.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-05; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2460072
Shameem Alam, Javid Shabbir, Malaika Nadeem

Adaptive cluster sampling is particularly helpful whenever the target population is unique, dispersed unevenly, concealed or difficult to find. In the current investigation, under an adaptive cluster sampling approach, we propose a ratio-product-logarithmic type estimator employing a single auxiliary variable for the estimation of finite population variance. The bias and mean square error of the proposed estimator are developed by using simulation as well as real data sets. The study results show that for estimating the finite population variance, the proposed estimator outperforms the competing estimators.

Journal of Applied Statistics 52(12): 2291-2305. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416028/pdf/
Citations: 0
Influence diagnostics in the Heckman selection models based on EM algorithms.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-05; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2461715
Marcos S Oliveira, Marcos O Prates, Christian E Galarza, Victor H Lachos

This study presents diagnostic techniques for Heckman selection models estimated using the EM algorithm. The focus is on the selection t and normal models, based on the bivariate Student's-t and bivariate normal distributions, respectively. The Heckman selection model is a key econometric tool for estimating relationships while addressing selection bias. Relying on the EM-type algorithm, we develop global and local influence analyses based on the conditional expectation of the complete-data log-likelihood function, exploring four perturbation schemes for local influence analysis. To assess the effectiveness of the proposed diagnostic measures in identifying influential observations, we conducted a simulation study, complemented by two real-data applications that demonstrate how these techniques can effectively identify influential points. The proposed algorithms and methodologies are incorporated into the R package HeckmanEM.

Journal of Applied Statistics 52(13): 2384-2412. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490367/pdf/
Citations: 0
Objective Bayesian trend filtering via adaptive piecewise polynomial regression.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-04; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2461186
Sang Gil Kang, Yongku Kim

Several methods have been developed for nonparametric regression problems, including classical approaches such as kernels, local polynomials, smoothing splines, sieves, and wavelets, as well as relatively new methods such as lasso, generalized lasso, and trend filtering. This study proposes an objective Bayesian trend filtering method based on model selection. The procedure estimates the functions using adaptive piecewise polynomial regression models with two components. First, we determine the intervals with varying trends using Bayesian binary segmentation, and then we evaluate the most reasonable trend on these intervals via Bayesian model selection. The model selection uses intrinsic priors, which eliminates any subjective input. Additionally, we prove that the proposed method using these intrinsic priors is consistent for large sample sizes. The behavior of the proposed Bayesian trend filtering procedure is compared with that of trend filtering in a simulation study and on real examples. Finally, we apply the proposed method to detect variance change points under mean changes; existing methods yield inaccurate estimates of the variance change points when the mean varies smoothly, because the sudden-change assumption is violated in such cases.

Journal of Applied Statistics 52(13): 2357-2383. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490381/pdf/
Citations: 0
Parametric estimation of quantile versions of Zenga and D inequality curves: methodology and application to Weibull distribution.
IF 1.1; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2025-02-03; eCollection Date: 2025-01-01; DOI: 10.1080/02664763.2025.2458126
Sylwester Piątek

Inequality (concentration) curves such as the Lorenz, Bonferroni and Zenga curves, as well as a new inequality curve, the D curve, are broadly used to analyse inequalities in wealth and income distribution in certain populations. Quantile versions of these inequality curves are more robust to outliers. We discuss several parametric estimators of quantile versions of the Zenga and D curves. A minimum distance (MD) estimator is proposed for these two curves and the indices related to them. The consistency and asymptotic normality of the MD estimator are proved. The MD estimator can also be used to estimate the inequality measures corresponding to the quantile versions of the inequality curves. The estimation methods considered are illustrated for the Weibull model, which has many applications in the life sciences, for example, fitting precipitation data. In econometrics it is also used to fit incomes, especially when a significant share of the population has low incomes, for example, in less developed countries or among low-paid jobs.

Journal of Applied Statistics, vol. 52, no. 12, pp. 2226-2246. Published 2025-02-03. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416017/pdf/
Citations: 0