
Latest articles from Journal of Applied Statistics

Bootstrap-based inference for multiple variance changepoint models.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-25 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481454
Yang Li, Qijing Yan, Mixia Wu, Aiyi Liu

Variance changepoints arise frequently, and matter, in economics, finance, biomedicine, oceanography, and other fields. To better detect them, we propose a new technique for constructing confidence intervals for the variances of a noisy sequence with multiple changepoints, combining bootstrapping with the weighted sequential binary segmentation (WSBS) algorithm and the Bayesian information criterion (BIC). An intensity score obtained from the bootstrap replications is introduced to reflect the possibility that each location is, or is close to, one of the changepoints. On this basis, a new changepoint estimator is proposed and its asymptotic properties are derived. Simulation results show that the proposed method outperforms state-of-the-art segmentation methods. Finally, the method is applied to weekly stock prices, oceanographic data, DNA copy number data and traffic flow data.
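The core ideas (likelihood-based segmentation for a variance change, plus a bootstrap intensity score over locations) can be sketched as follows. This is a minimal single-changepoint illustration with hypothetical function names, not the paper's WSBS/BIC procedure:

```python
import numpy as np

def variance_changepoint(x):
    """Locate the split maximizing the two-segment Gaussian log-likelihood
    gain for a variance change (mean assumed zero)."""
    n = len(x)
    s2 = np.cumsum(x ** 2)
    v0 = s2[-1] / n                      # one-segment variance
    best, best_gain = None, -np.inf
    for k in range(2, n - 1):
        v1 = s2[k - 1] / k               # left-segment variance
        v2 = (s2[-1] - s2[k - 1]) / (n - k)
        gain = n * np.log(v0) - k * np.log(v1) - (n - k) * np.log(v2)
        if gain > best_gain:
            best, best_gain = k, gain
    return best

def intensity_score(x, B=200, seed=0):
    """Fraction of bootstrap resamples placing the changepoint at each index."""
    rng = np.random.default_rng(seed)
    n = len(x)
    hits = np.zeros(n)
    k0 = variance_changepoint(x)
    for _ in range(B):
        # resample within each estimated segment, preserving the variance change
        xb = np.concatenate([rng.choice(x[:k0], k0, replace=True),
                             rng.choice(x[k0:], n - k0, replace=True)])
        hits[variance_changepoint(xb)] += 1
    return hits / B
```

Locations with high intensity are those repeatedly selected across resamples, mirroring the abstract's "possibility that each location is, or is close to, one of the changepoints".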

Journal of Applied Statistics 52(14), pp. 2636-2671. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581773/pdf/
Citations: 0
An R tool for computing and evaluating Fuzzy poverty indices: The package FuzzyPovertyR.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-24 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481461
F Crescenzi, L Mori, G Betti, F Gagliardi, A D'Agostino, L Neri

Fuzzy set theory has become increasingly popular for deriving uni- and multi-dimensional poverty estimates. In recent years, various authors have proposed different approaches to defining membership functions, resulting in the development of various fuzzy poverty indices. This paper introduces a new R package called FuzzyPovertyR, designed for estimating fuzzy poverty indices. The package is demonstrated by using it to estimate three fuzzy poverty indices (one multi- and two uni-dimensional) at the regional level (NUTS 2) in Italy. The package allows users to select from a range of membership functions and includes tools for estimating the variance of these indices via an ad hoc jackknife repeated-replication procedure or via naive and calibrated non-parametric bootstrap methods.
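To make the fuzzy idea concrete, here is a deliberately simplified rank-based membership function and the resulting fuzzy headcount-type index. This is an illustration of the general approach only, not the membership functions implemented in FuzzyPovertyR:

```python
import numpy as np

def fuzzy_membership(income, alpha=2.0):
    """Simplified fuzzy-monetary membership: (1 - F(x))**alpha, where F is
    the empirical CDF of income. Membership 1 = certainly poor (poorest unit),
    0 = certainly not poor (richest unit)."""
    income = np.asarray(income, dtype=float)
    n = len(income)
    # ranks rescaled to [0, 1] act as an empirical CDF
    F = np.argsort(np.argsort(income)) / (n - 1)
    return (1.0 - F) ** alpha

def fuzzy_poverty_index(income, alpha=2.0):
    """Mean membership: a fuzzy analogue of the headcount ratio."""
    return float(np.mean(fuzzy_membership(income, alpha)))
```

Larger `alpha` concentrates membership on the poorest units, so the index decreases as `alpha` grows; this mimics how the choice of membership function drives the fuzzy indices the package estimates.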

Journal of Applied Statistics 52(15), pp. 2958-2971. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12671421/pdf/
Citations: 0
New insights into multicollinearity in the Cox proportional hazard models: the Kibria-Lukman estimator and its application.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-21 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481456
Solmaz Seifollahi, Zakariya Yahya Algamal, Mohammad Arashi

This paper examines the Cox proportional hazards model (CPHM) in the presence of multicollinearity. Typically, the maximum partial likelihood estimator (MPLE) is employed to estimate the model coefficients, which works well when the covariates are uncorrelated. However, in various scenarios, covariates are correlated, leading to unstable coefficient estimates with the MPLE. To address this challenge, Liu and ridge estimators have been introduced in the CPHM. In this paper, we present the Kibria-Lukman estimator as an advancement over existing alternatives and explore its properties. We evaluate the performance of the proposed estimator through Monte Carlo simulations, utilizing mean squared error and mean absolute error as criteria for comparison. Additionally, we demonstrate the advantages of our proposal by analyzing a medical dataset.
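In the ordinary linear-model setting, the Kibria-Lukman estimator has the closed form beta_KL = (X'X + kI)^(-1)(X'X - kI) beta_OLS; the paper transfers this shrinkage form to the Cox partial likelihood. A sketch of the linear-model version (for intuition about the shrinkage, not the Cox adaptation itself):

```python
import numpy as np

def kl_estimator(X, y, k=0.1):
    """Kibria-Lukman estimator for the linear model:
    beta_KL = (X'X + k I)^{-1} (X'X - k I) beta_OLS.
    Each eigen-direction of X'X is scaled by (lam - k)/(lam + k), so
    ill-conditioned (small-lam) directions are shrunk hardest."""
    XtX = X.T @ X
    p = XtX.shape[0]
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + k * np.eye(p),
                           (XtX - k * np.eye(p)) @ beta_ols)
```

With k = 0 the estimator reduces to OLS; for k > 0 the coefficient norm strictly shrinks, which is what stabilizes estimates under multicollinearity.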

Journal of Applied Statistics 52(14), pp. 2672-2685. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581747/pdf/
Citations: 0
Zero-inflated Poisson mixed model for longitudinal count data with informative dropouts.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-20 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481458
Sanjoy K Sinha

Zero-inflated Poisson (ZIP) models are typically used for analyzing count data with excess zeros. If the data are collected longitudinally, then repeated observations from a given subject are correlated by nature. The ZIP mixed model may be used to deal with excess zeros and correlations among the repeated observations. Also, it is often the case that some follow-up measurements in a longitudinal study are missing. If the missing data are informative or nonignorable, it is necessary to incorporate a missingness mechanism into the observed likelihood function for a valid inference. In this paper, we propose and explore an efficient method for analyzing count data by addressing the complex issues of excess zeros, correlations among repeated observations, and missing responses due to dropouts. The empirical properties of the proposed estimators are studied based on Monte Carlo simulations. An application is provided using some real data obtained from a health study.
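The ZIP building block mixes a point mass at zero with a Poisson component: P(Y=0) = pi + (1-pi)e^(-lam) and P(Y=k) = (1-pi)Poisson(k; lam) for k > 0. A minimal log-likelihood sketch (the cross-sectional core only, without the paper's random effects or dropout model):

```python
import numpy as np
from math import lgamma

def zip_loglik(y, lam, pi):
    """Log-likelihood of the zero-inflated Poisson:
    P(Y=0) = pi + (1-pi)*exp(-lam);  P(Y=k) = (1-pi)*Poisson(k; lam), k > 0."""
    y = np.asarray(y)
    ll = 0.0
    for k in y:
        if k == 0:
            ll += np.log(pi + (1.0 - pi) * np.exp(-lam))
        else:
            ll += np.log(1.0 - pi) - lam + k * np.log(lam) - lgamma(k + 1)
    return ll
```

With pi = 0 this reduces to the plain Poisson log-likelihood; for data with excess zeros, a positive pi yields a visibly higher likelihood, which is the motivation for the ZIP component in the mixed model.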

Journal of Applied Statistics 52(14), pp. 2686-2706. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581761/pdf/
Citations: 0
Quantile regression model for interval-censored data with competing risks.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-15 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2474627
Amirah Afiqah Binti Che Ramli, Yang-Jin Kim

Our interest is to provide methodology for estimating a quantile regression model for interval-censored competing risks data. Lee and Kim [Analysis of interval censored competing risk data via nonparametric multiple imputation. Stat. Biopharm. Res. 13 (2020), pp. 367-374.] applied the censoring-complete-data concept suggested by Ruan and Gray [Analyses of cumulative incidence function via non-parametric multiple imputation. Stat. Med. 27 (2008), pp. 5709-5724.] to recover missing information related to competing events. In this paper, we apply it to a quantile regression model. The simulated censoring times of the competing events are generated with a multiple imputation technique and the survival function of the right-censoring times. The performance of the suggested method is evaluated by comparison with a simple imputation method under several distributions and sample sizes. As a real data analysis, the AIDS dataset is analyzed to estimate the effect of several covariates on the quantiles of the cause-specific CIF.
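The multiple-imputation mechanics can be seen in a toy form: impute each interval-censored time, compute the quantile on the completed data, and average across imputations. This sketch draws uniformly within each interval purely for illustration; the paper imputes from an estimated distribution and fits a full quantile regression, not a marginal quantile:

```python
import numpy as np

def mi_quantile(left, right, tau=0.5, M=50, seed=0):
    """Toy multiple-imputation estimate of the tau-quantile of an
    interval-censored time: draw each event time uniformly within its
    observed interval [left_i, right_i], estimate the quantile on the
    completed sample, and average over M imputations."""
    rng = np.random.default_rng(seed)
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    est = [np.quantile(rng.uniform(left, right), tau) for _ in range(M)]
    return float(np.mean(est))
```

When left == right (exactly observed times) the procedure collapses to the ordinary sample quantile, which is a useful sanity check on any imputation scheme.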

Journal of Applied Statistics 52(13), pp. 2438-2447. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490390/pdf/
Citations: 0
Robust parameter estimation and variable selection in regression models for asymmetric heteroscedastic data.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-13 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2477726
Y Güney, O Arslan

In many real-world scenarios, not only the location but also the scale and even the skewness of the response variable may be influenced by explanatory variables. To achieve accurate predictions in such cases, it is essential to model location, scale, and skewness simultaneously. The joint location, scale, and skewness model of the skew-normal distribution is particularly useful for such data, as it relaxes the normality assumption, allowing for skewness. However, the estimation methods commonly used in these models tend to rely on classical approaches that are sensitive to outliers. Another challenge is selecting relevant variables. This study addresses these issues by first employing the maximum Lq-likelihood estimation method, which provides robust parameter estimation across the model. We then introduce the penalized Lq-likelihood method to select significant variables in the three sub-models. To obtain parameter estimates efficiently, we use the expectation-maximization algorithm. Through simulation studies and applications to real datasets, we demonstrate that the proposed methods outperform classical approaches, especially in the presence of outliers.
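The robustness of maximum Lq-likelihood estimation can be illustrated in the simplest case, a normal location with known scale: the Lq score equation yields a weighted mean with weights proportional to f(x_i; mu)^(1-q), so outliers receive near-zero weight. A minimal fixed-point sketch (my simplification; the paper works with the full location-scale-skewness model and an EM algorithm):

```python
import numpy as np

def mlqe_normal_mean(x, q=0.9, tol=1e-8, max_iter=200):
    """Maximum Lq-likelihood estimate of a normal location (scale fixed at 1).
    Iterates mu <- sum(w * x) / sum(w) with w_i proportional to
    f(x_i; mu)^(1-q), which downweights points far from the bulk."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                                # robust starting value
    for _ in range(max_iter):
        w = np.exp(-(1.0 - q) * 0.5 * (x - mu) ** 2)  # ∝ N(x; mu, 1)^(1-q)
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu
```

At q = 1 all weights equal 1 and the estimator reduces to the sample mean (ordinary maximum likelihood); q < 1 trades a little efficiency for resistance to outliers.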

Journal of Applied Statistics 52(14), pp. 2559-2596. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581768/pdf/
Citations: 0
Diagnostic analytics for the mixed Poisson INGARCH model with applications.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-12 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2476658
Wenjie Dang, Fukang Zhu, Nuo Xu, Shuangzhe Liu

In statistical diagnosis and sensitivity analysis, the local influence method plays a crucial role and is sometimes more advantageous than other methods. The mixed Poisson integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) model is built on a flexible family of mixed Poisson distributions. It not only encompasses the negative binomial INGARCH model but also allows for the introduction of the Poisson-inverse Gaussian INGARCH model and the Poisson generalized hyperbolic secant INGARCH model. This paper applies the local influence analysis method to count time series data within the framework of the mixed Poisson INGARCH model. For parameter estimation, the Expectation-Maximization algorithm is utilized. In the context of local influence analysis, two global influence methods (generalized Cook distance and Q-distance) and four perturbation schemes (case-weights perturbation, data perturbation, additive perturbation, and scale perturbation) are considered to identify influential points. Finally, the feasibility and effectiveness of the proposed methods are demonstrated through simulations and analysis of a real data set.
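For readers unfamiliar with INGARCH models, the conditional mean of the basic (1,1) specification follows the recursion lambda_t = omega + alpha*y_{t-1} + beta*lambda_{t-1}, with Y_t | past ~ Poisson(lambda_t) (or a mixed Poisson in the paper's generalization). A minimal sketch of the recursion:

```python
import numpy as np

def ingarch_intensity(y, omega, alpha, beta):
    """Conditional-mean recursion of a Poisson INGARCH(1,1) model:
    lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1}.
    Requires omega > 0, alpha, beta >= 0, alpha + beta < 1 for stationarity;
    the stationary mean omega / (1 - alpha - beta) is used as start-up."""
    lam = np.empty(len(y))
    lam[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(y)):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
    return lam
```

Local influence diagnostics then perturb the likelihood built from these lambda_t values (e.g. via case weights) and examine the curvature of the resulting displacement.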

Journal of Applied Statistics 52(13), pp. 2495-2523. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490395/pdf/
Citations: 0
Learning causal effect of physical activity distribution: an application of functional treatment effect estimation with unmeasured confounding.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-12 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2474611
Zhuoxin Long, Xiaoke Zhang

The National Health and Nutrition Examination Survey (NHANES) collects minute-level physical activity data by accelerometers as an important component of the survey to assess the health and nutritional status of adults and children in the US. In this paper, we analyze the NHANES accelerometry data to study the causal effect of physical activity distribution on body fat percentage, where the treatment is a function/distribution. In the presence of unmeasured confounding, we propose to integrate cross-fitting with two methods under the proximal causal inference framework to estimate the functional treatment effect. The two methods are shown practically appealing via both simulation and an NHANES accelerometry data analysis. In the analysis of the NHANES accelerometry data, the two methods also lead to a more intuitive and interpretable causal relationship between physical activity distribution and body fat percentage.
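Cross-fitting is the generic device the authors integrate with their two estimators: nuisance models are fit on one part of the sample and the target functional is evaluated on the held-out part, so no observation both trains and evaluates the nuisance fit. A generic skeleton with hypothetical callbacks (`fit`, `estimate` are placeholders, not functions from the paper):

```python
import numpy as np

def cross_fit(data, fit, estimate, K=2, seed=0):
    """Generic K-fold cross-fitting: for each fold k, fit the nuisance
    model on the other K-1 folds and evaluate the target functional on
    fold k; return the average of the fold-level estimates."""
    rng = np.random.default_rng(seed)
    n = len(data)
    folds = rng.permutation(n) % K          # random, (near-)equal-size folds
    parts = []
    for k in range(K):
        nuisance = fit(data[folds != k])     # train on the complement
        parts.append(estimate(nuisance, data[folds == k]))
    return float(np.mean(parts))
```

In the proximal setting, `fit` would estimate the confounding bridge functions and `estimate` would plug them into the functional-treatment-effect formula on the held-out fold.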

Journal of Applied Statistics 52(14), pp. 2759-2776. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581750/pdf/
Citations: 0
Improving the within-node estimation of survival trees while retaining interpretability.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-11 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2473535
Haolin Li, Yiyang Fan, Jianwen Cai

In statistical learning for survival data, survival trees are favored for their capacity to detect complex relationships beyond parametric and semiparametric models. Despite this, their prediction accuracy is often suboptimal. In this paper, we propose a new method based on super learning to improve the within-node estimation and overall survival prediction accuracy, while preserving the interpretability of the survival tree. Simulation studies reveal the proposed method's superior finite sample performance compared to conventional approaches for within-node estimation in survival trees. Furthermore, we apply this method to analyze the North Central Cancer Treatment Group Lung Cancer Data, cardiovascular medical records from the Faisalabad Institute of Cardiology, and the integrated genomic data of ovarian carcinoma with The Cancer Genome Atlas project.
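Within each terminal node, the conventional estimator that such super-learning approaches aim to improve on is the Kaplan-Meier curve computed from the node's observations. A minimal self-contained version (one row per observation, assuming distinct event times; tied times would need grouping):

```python
import numpy as np

def kaplan_meier(time, event):
    """Kaplan-Meier survival estimate: at each ordered time, multiply the
    running survival probability by (1 - d_i / n_i), where n_i is the
    number still at risk and d_i is 1 for an event, 0 for censoring."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    order = np.argsort(time)
    time, event = time[order], event[order]
    n = len(time)
    surv, s = [], 1.0
    for i in range(n):
        at_risk = n - i
        if event[i]:
            s *= 1.0 - 1.0 / at_risk
        surv.append(s)
    return time, np.array(surv)
```

The paper's contribution is to replace this node-level estimator with a super learner while keeping the tree's splits (and hence its interpretability) intact.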

Journal of Applied Statistics 52(13), pp. 2544-2558. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490394/pdf/
Citations: 0
Estimating an executive summary of a time series: the tendency.
IF 1.1 | CAS Q4 (Mathematics) | JCR Q2 (Statistics & Probability) | Pub Date: 2025-03-10 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2475351
Caio Alves, Juan M Restrepo, Jorge M Ramirez

In this paper, we revisit the problem of decomposing a signal into a tendency and a residual. The tendency describes an executive summary of a signal that encapsulates its notable characteristics while disregarding seemingly random, less interesting aspects. Building upon the Intrinsic Time Decomposition (ITD) and information-theoretical analysis, we introduce two alternative procedures for selecting the tendency from the ITD baselines. The first is based on the maximum extrema prominence, namely the maximum difference between extrema within each baseline. Specifically, this method selects the tendency as the baseline from which an ITD step would produce the largest decline of the maximum prominence. The second method uses the rotations from the ITD and selects the tendency as the last baseline for which the associated rotation is statistically stationary. We delve into a comparative analysis of the information content and interpretability of the tendencies obtained by our proposed methods and those obtained through conventional low-pass filtering schemes, particularly the Hodrick-Prescott (HP) filter. Our findings underscore a fundamental distinction in the nature and interpretability of these tendencies, highlighting their context-dependent utility with emphasis on multi-scale signals. Through a series of real-world applications, we demonstrate the computational robustness and practical utility of our proposed tendencies, emphasizing their adaptability and relevance in diverse time series contexts.
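Reading "maximum difference between extrema" as the largest spread among a baseline's local extrema, the prominence criterion can be sketched as below. This is one plausible interpretation for illustration; the paper's exact definition, applied per ITD baseline, may differ:

```python
import numpy as np

def max_extrema_prominence(x):
    """Maximum difference between the extrema of a signal, computed here as
    (max - min) over all interior local extrema plus the two endpoints.
    Interior extrema are detected as sign changes of the first difference."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)
    # index i is an interior local extremum when the slope changes sign there
    idx = np.where(d[:-1] * d[1:] < 0)[0] + 1
    ext = np.concatenate([[x[0]], x[idx], [x[-1]]])
    return float(ext.max() - ext.min())
```

In the selection rule described above, one would compute this quantity for successive ITD baselines and pick as the tendency the baseline whose next ITD step causes the largest drop in it.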

在本文中,我们重新讨论了将信号分解为趋势和残差的问题。这种趋势描述了一个信号的执行摘要,它包含了它的显著特征,而忽略了看似随机的、不那么有趣的方面。在固有时间分解(ITD)和信息理论分析的基础上,我们介绍了从ITD基线中选择趋势的两种替代方法。第一种方法是基于最大极值日珥,即每个基线内极值之间的最大差值。具体来说,该方法选择过渡段阶跃产生最大日珥下降幅度最大的趋势作为基线。第二种方法使用过渡段的旋转,并选择趋势作为相关旋转在统计上平稳的最后基线。我们深入研究了通过我们提出的方法和通过传统低通滤波方案,特别是Hodrik-Prescott (HP)滤波器获得的趋势的信息内容和可解释性的比较分析。我们的研究结果强调了这些趋势的本质和可解释性的根本区别,强调了它们在多尺度信号中的上下文依赖效用。通过一系列现实世界的应用,我们展示了我们提出的趋势的计算鲁棒性和实用价值,强调了它们在不同时间序列背景下的适应性和相关性。
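The first selection rule in the abstract — pick the baseline from which one more decomposition step would cause the largest drop in maximum extrema prominence — can be sketched in a few lines. The moving-average cascade below is only a stand-in for the ITD baselines, so treat this as an illustration of the criterion, not the authors' algorithm.

```python
import numpy as np

def max_extrema_prominence(x):
    """Largest difference between the local extrema of x."""
    interior = x[1:-1]
    is_max = (interior > x[:-2]) & (interior > x[2:])
    is_min = (interior < x[:-2]) & (interior < x[2:])
    extrema = interior[is_max | is_min]
    if extrema.size == 0:
        return 0.0
    return float(extrema.max() - extrema.min())

def select_tendency(baselines):
    """Index of the baseline from which the next decomposition step
    produces the largest decline in maximum extrema prominence."""
    proms = [max_extrema_prominence(b) for b in baselines]
    drops = [proms[k] - proms[k + 1] for k in range(len(proms) - 1)]
    return int(np.argmax(drops))

# Toy signal: slow trend + fast oscillation + noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 400)
y = np.sin(2 * np.pi * t) + 0.3 * np.sin(40 * np.pi * t) \
    + 0.05 * rng.standard_normal(t.size)

def smooth(x, w):
    # Successively wider moving averages stand in for ITD baselines.
    return np.convolve(x, np.ones(w) / w, mode="same")

baselines = [y] + [smooth(y, w) for w in (5, 11, 21, 41)]
k = select_tendency(baselines)  # index of the selected tendency
```

The selected index marks the point where smoothing stops removing large-scale structure and starts eroding the summary itself — the intuition behind taking the largest prominence decline as the cut-off.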
{"title":"Estimating an executive summary of a time series: the tendency.","authors":"Caio Alves, Juan M Restrepo, Jorge M Ramirez","doi":"10.1080/02664763.2025.2475351","DOIUrl":"10.1080/02664763.2025.2475351","url":null,"abstract":"<p><p>In this paper, we revisit the problem of decomposing a signal into a tendency and a residual. The tendency describes an executive summary of a signal that encapsulates its notable characteristics while disregarding seemingly random, less interesting aspects. Building upon the Intrinsic Time Decomposition (ITD) and information-theoretical analysis, we introduce two alternative procedures for selecting the tendency from the ITD baselines. The first is based on the maximum extrema prominence, namely the maximum difference between extrema within each baseline. Specifically, this method selects the tendency as the baseline from which an ITD step would produce the largest decline of the maximum prominence. The second method uses the rotations from the ITD and selects the tendency as the last baseline for which the associated rotation is statistically stationary. We delve into a comparative analysis of the information content and interpretability of the tendencies obtained by our proposed methods and those obtained through conventional low-pass filtering schemes, particularly the Hodrick-Prescott (HP) filter. Our findings underscore a fundamental distinction in the nature and interpretability of these tendencies, highlighting their context-dependent utility with emphasis on multi-scale signals. Through a series of real-world applications, we demonstrate the computational robustness and practical utility of our proposed tendencies, emphasizing their adaptability and relevance in diverse time series contexts.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2478-2494"},"PeriodicalIF":1.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490379/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}