Bootstrap-based inference for multiple variance changepoint models
Pub Date: 2025-03-25 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481454
Yang Li, Qijing Yan, Mixia Wu, Aiyi Liu
Variance changepoints arise frequently in economics, finance, biomedicine, oceanography, and other fields, and detecting them is of practical importance. To better detect such changepoints, we propose a new technique for constructing confidence intervals for the variances of a noisy sequence with multiple changepoints, combining bootstrapping with the weighted sequential binary segmentation (WSBS) algorithm and the Bayesian information criterion (BIC). An intensity score obtained from the bootstrap replications is introduced to reflect the possibility that each location is, or is close to, one of the changepoints. On this basis, a new changepoint estimator is proposed and its asymptotic properties are derived. Simulation results show that the proposed method outperforms state-of-the-art segmentation methods. Finally, the method is applied to weekly stock prices, oceanographic data, DNA copy number data, and traffic flow data.
{"title":"Bootstrap-based inference for multiple variance changepoint models.","authors":"Yang Li, Qijing Yan, Mixia Wu, Aiyi Liu","doi":"10.1080/02664763.2025.2481454","DOIUrl":"https://doi.org/10.1080/02664763.2025.2481454","url":null,"abstract":"<p><p>Variance changepoints in economics, finance, biomedicine, oceanography, etc. are frequent and significant. To better detect these changepoints, we propose a new technique for constructing confidence intervals for the variances of a noisy sequence with multiple changepoints by combining bootstrapping with the weighted sequential binary segmentation (WSBS) algorithm and the Bayesian information criterion (BIC). The intensity score obtained from the bootstrap replications is introduced to reflect the possibility that each location is, or is close to, one of the changepoints. On this basis, a new changepoint estimation is proposed, and its asymptotic properties are derived. The simulated results show that the proposed method has superior performance in comparison with the state-of-the-art segmentation methods. Finally, the method is applied to weekly stock prices, oceanographic data, DNA copy number data and traffic flow data.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 14","pages":"2636-2671"},"PeriodicalIF":1.1,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581773/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An R tool for computing and evaluating Fuzzy poverty indices: The package FuzzyPovertyR
Pub Date: 2025-03-24 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481461
F Crescenzi, L Mori, G Betti, F Gagliardi, A D'Agostino, L Neri
Fuzzy set theory has become increasingly popular for deriving uni- and multi-dimensional poverty estimates. In recent years, authors have proposed different approaches to defining membership functions, resulting in a variety of fuzzy poverty indices. This paper introduces a new R package, FuzzyPovertyR, designed for estimating fuzzy poverty indices. The package is demonstrated by using it to estimate three fuzzy poverty indices (one multi-dimensional and two uni-dimensional) at the regional level (NUTS 2) in Italy. The package allows users to select from a range of membership functions and includes tools for estimating the variance of these indices by an ad hoc jackknife repeated replication procedure or by naive and calibrated non-parametric bootstrap methods.
{"title":"An R tool for computing and evaluating Fuzzy poverty indices: The package FuzzyPovertyR.","authors":"F Crescenzi, L Mori, G Betti, F Gagliardi, A D'Agostino, L Neri","doi":"10.1080/02664763.2025.2481461","DOIUrl":"https://doi.org/10.1080/02664763.2025.2481461","url":null,"abstract":"<p><p>Fuzzy set theory has become increasingly popular for deriving uni- and multi-dimensional poverty estimates. In recent years, various authors have proposed different approaches to defining membership functions, resulting in the development of various fuzzy poverty indices. This paper introduces a new R package called FuzzyPovertyR, designed for estimating fuzzy poverty indices. The package is demonstrated by using it to estimate three fuzzy poverty indices - one multi- and two uni-dimensional - at the regional level (NUTS 2) in Italy. The package allows users to select from a range of membership functions and includes tools for estimating the variance of these indices by the ad-hoc Jack-Knife repeated replication procedure or by naive and calibrated non-parametric bootstrap methods.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 15","pages":"2958-2971"},"PeriodicalIF":1.1,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12671421/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145668606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New insights into multicollinearity in the Cox proportional hazard models: the Kibria-Lukman estimator and its application
Pub Date: 2025-03-21 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481456
Solmaz Seifollahi, Zakariya Yahya Algamal, Mohammad Arashi
This paper examines the Cox proportional hazards model (CPHM) in the presence of multicollinearity. Typically, the maximum partial likelihood estimator (MPLE) is employed to estimate the model coefficients, and it works well when the covariates are uncorrelated. In many scenarios, however, covariates are correlated, leading to unstable coefficient estimates from the MPLE. To address this challenge, Liu and ridge estimators have been introduced into the CPHM. In this paper, we present the Kibria-Lukman estimator as an advancement over these existing alternatives and explore its properties. We evaluate the performance of the proposed estimator through Monte Carlo simulations, using mean squared error and mean absolute error as comparison criteria. Additionally, we demonstrate the advantages of our proposal by analyzing a medical dataset.
{"title":"New insights into multicollinearity in the Cox proportional hazard models: the Kibria-Lukman estimator and its application.","authors":"Solmaz Seifollahi, Zakariya Yahya Algamal, Mohammad Arashi","doi":"10.1080/02664763.2025.2481456","DOIUrl":"https://doi.org/10.1080/02664763.2025.2481456","url":null,"abstract":"<p><p>This paper examines the Cox proportional hazards model (CPHM) in the presence of multicollinearity. Typically, the maximum partial likelihood estimator (MPLE) is employed to estimate the model coefficients, which works well when the covariates are uncorrelated. However, in various scenarios, covariates are correlated, leading to unstable coefficient estimates with the MPLE. To address this challenge, Liu and ridge estimators have been introduced in the CPHMs. In this paper, we present the Kibria-Lukman estimator as an advancement over existing alternatives and explore its properties. We evaluate the performance of the proposed estimator through Monte Carlo simulations, utilizing mean squared error and mean absolute error as criteria for comparison. Additionally, we demonstrate our proposal advantages through analyzing a medical dataset.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 14","pages":"2672-2685"},"PeriodicalIF":1.1,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581747/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zero-inflated Poisson mixed model for longitudinal count data with informative dropouts
Pub Date: 2025-03-20 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2481458
Sanjoy K Sinha
Zero-inflated Poisson (ZIP) models are typically used for analyzing count data with excess zeros. If the data are collected longitudinally, repeated observations from a given subject are naturally correlated. The ZIP mixed model may be used to deal with both the excess zeros and the correlations among repeated observations. Moreover, it is often the case that some follow-up measurements in a longitudinal study are missing. If the missing data are informative or nonignorable, it is necessary to incorporate the missingness mechanism into the observed likelihood function for valid inference. In this paper, we propose and explore an efficient method for analyzing count data that addresses the complex issues of excess zeros, correlations among repeated observations, and missing responses due to dropouts. The empirical properties of the proposed estimators are studied in Monte Carlo simulations. An application is provided using real data obtained from a health study.
{"title":"Zero-inflated Poisson mixed model for longitudinal count data with informative dropouts.","authors":"Sanjoy K Sinha","doi":"10.1080/02664763.2025.2481458","DOIUrl":"https://doi.org/10.1080/02664763.2025.2481458","url":null,"abstract":"<p><p>Zero-inflated Poisson (ZIP) models are typically used for analyzing count data with excess zeros. If the data are collected longitudinally, then repeated observations from a given subject are correlated by nature. The ZIP mixed model may be used to deal with excess zeros and correlations among the repeated observations. Also, it is often the case that some follow-up measurements in a longitudinal study are missing. If the missing data are informative or nonignorable, it is necessary to incorporate a missingness mechanism into the observed likelihood function for a valid inference. In this paper, we propose and explore an efficient method for analyzing count data by addressing the complex issues of excess zeros, correlations among repeated observations, and missing responses due to dropouts. The empirical properties of the proposed estimators are studied based on Monte Carlo simulations. An application is provided using some real data obtained from a health study.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 14","pages":"2686-2706"},"PeriodicalIF":1.1,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581761/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantile regression model for interval-censored data with competing risks
Pub Date: 2025-03-15 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2474627
Amirah Afiqah Binti Che Ramli, Yang-Jin Kim
Our interest is to provide methodology for estimating a quantile regression model for interval-censored competing risks data. Lee and Kim [Analysis of interval censored competing risk data via nonparametric multiple imputation. Stat. Biopharm. Res. 13 (2020), pp. 367-374] applied the censoring-complete-data concept suggested by Ruan and Gray [Analyses of cumulative incidence function via non-parametric multiple imputation. Stat. Med. 27 (2008), pp. 5709-5724] to recover missing information related to the competing events. In this paper, we apply it to a quantile regression model. The censoring times of the competing events are generated with a multiple imputation technique and the survival function of the right-censoring times. The performance of the suggested method is evaluated by comparison with a simple imputation method under several distributions and sample sizes. As a real data analysis, the AIDS dataset is analyzed to estimate the effects of several covariates on the quantiles of the cause-specific cumulative incidence function (CIF).
{"title":"Quantile regression model for interval-censored data with competing risks.","authors":"Amirah Afiqah Binti Che Ramli, Yang-Jin Kim","doi":"10.1080/02664763.2025.2474627","DOIUrl":"https://doi.org/10.1080/02664763.2025.2474627","url":null,"abstract":"<p><p>Our interest is to provide the methodology for estimating quantile regression model for interval-censored competing risk data. Lee and Kim [<i>Analysis of interval censored competing risk data via nonparametric multiple imputation</i>. Stat. Biopharm. Res. 13 (2020), pp. 367-374.] applied a censoring complete data concept suggested by Ruan and Gray [<i>Analyses of cumulative incidence function via non-parametric multiple imputation</i>. Sta. Med. 27 (2008), pp. 5709-5724.] to recover a missing information related with competing events. In this paper, we also applied it to a quantile regression model. The simulated censoring times of the competing events are generated with a multiple imputation technique and the survival function of right censoring times. The performance of suggested methods is evaluated by comparing with the result of a simple imputation method under several distributions and sample sizes. The AIDS dataset is analyzed to estimate the effect of several covariates on the quantiles of cause-specific CIF as a real data analysis.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2438-2447"},"PeriodicalIF":1.1,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490390/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust parameter estimation and variable selection in regression models for asymmetric heteroscedastic data
Pub Date: 2025-03-13 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2477726
Y Güney, O Arslan
In many real-world scenarios, not only the location but also the scale and even the skewness of the response variable may be influenced by explanatory variables. To achieve accurate predictions in such cases, it is essential to model location, scale, and skewness simultaneously. The joint location, scale, and skewness model of the skew-normal distribution is particularly useful for such data, as it relaxes the normality assumption, allowing for skewness. However, the estimation methods commonly used in these models tend to rely on classical approaches that are sensitive to outliers. Another challenge is selecting relevant variables. This study addresses these issues by first employing the maximum Lq-likelihood estimation method, which provides robust parameter estimation across the model. We then introduce the penalized Lq-likelihood method to select significant variables in the three sub-models. To obtain parameter estimates efficiently, we use the expectation-maximization algorithm. Through simulation studies and applications to real datasets, we demonstrate that the proposed methods outperform classical approaches, especially in the presence of outliers.
{"title":"Robust parameter estimation and variable selection in regression models for asymmetric heteroscedastic data.","authors":"Y Güney, O Arslan","doi":"10.1080/02664763.2025.2477726","DOIUrl":"https://doi.org/10.1080/02664763.2025.2477726","url":null,"abstract":"<p><p>In many real-world scenarios, not only the location but also the scale and even the skewness of the response variable may be influenced by explanatory variables. To achieve accurate predictions in such cases, it is essential to model location, scale, and skewness simultaneously. The joint location, scale, and skewness model of the skew-normal distribution is particularly useful for such data, as it relaxes the normality assumption, allowing for skewness. However, the estimation methods commonly used in these models tend to rely on classical approaches that are sensitive to outliers. Another challenge is selecting relevant variables. This study addresses these issues by first employing the maximum Lq-likelihood estimation method, which provides robust parameter estimation across the model. We then introduce the penalized Lq-likelihood method to select significant variables in the three sub-models. To obtain parameter estimates efficiently, we use the expectation-maximization algorithm. Through simulation studies and applications to real datasets, we demonstrate that the proposed methods outperform classical approaches, especially in the presence of outliers.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 14","pages":"2559-2596"},"PeriodicalIF":1.1,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581768/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Diagnostic analytics for the mixed Poisson INGARCH model with applications
Pub Date: 2025-03-12 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2476658
Wenjie Dang, Fukang Zhu, Nuo Xu, Shuangzhe Liu
In statistical diagnosis and sensitivity analysis, the local influence method plays a crucial role and is sometimes more advantageous than other methods. The mixed Poisson integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) model is built on a flexible family of mixed Poisson distributions. It not only encompasses the negative binomial INGARCH model but also allows for the introduction of the Poisson-inverse Gaussian INGARCH model and the Poisson generalized hyperbolic secant INGARCH model. This paper applies the local influence analysis method to count time series data within the framework of the mixed Poisson INGARCH model. For parameter estimation, the Expectation-Maximization algorithm is utilized. In the context of local influence analysis, two global influence methods (generalized Cook distance and Q-distance) and four perturbation schemes (case-weights perturbation, data perturbation, additive perturbation, and scale perturbation) are considered to identify influential points. Finally, the feasibility and effectiveness of the proposed methods are demonstrated through simulations and analysis of a real data set.
{"title":"Diagnostic analytics for the mixed Poisson INGARCH model with applications.","authors":"Wenjie Dang, Fukang Zhu, Nuo Xu, Shuangzhe Liu","doi":"10.1080/02664763.2025.2476658","DOIUrl":"https://doi.org/10.1080/02664763.2025.2476658","url":null,"abstract":"<p><p>In statistical diagnosis and sensitivity analysis, the local influence method plays a crucial role and is sometimes more advantageous than other methods. The mixed Poisson integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) model is built on a flexible family of mixed Poisson distributions. It not only encompasses the negative binomial INGARCH model but also allows for the introduction of the Poisson-inverse Gaussian INGARCH model and the Poisson generalized hyperbolic secant INGARCH model. This paper applies the local influence analysis method to count time series data within the framework of the mixed Poisson INGARCH model. For parameter estimation, the Expectation-Maximization algorithm is utilized. In the context of local influence analysis, two global influence methods (generalized Cook distance and Q-distance) and four perturbations-case weights perturbation, data perturbation, additive perturbation, and scale perturbation-are considered to identify influential points. Finally, the feasibility and effectiveness of the proposed methods are demonstrated through simulations and analysis of a real data set.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2495-2523"},"PeriodicalIF":1.1,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490395/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning causal effect of physical activity distribution: an application of functional treatment effect estimation with unmeasured confounding
Pub Date: 2025-03-12 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2474611
Zhuoxin Long, Xiaoke Zhang
The National Health and Nutrition Examination Survey (NHANES) collects minute-level physical activity data via accelerometers as an important component of the survey to assess the health and nutritional status of adults and children in the US. In this paper, we analyze the NHANES accelerometry data to study the causal effect of the physical activity distribution on body fat percentage, where the treatment is a function/distribution. To handle unmeasured confounding, we propose to integrate cross-fitting with two methods under the proximal causal inference framework to estimate the functional treatment effect. The two methods are shown to be practically appealing in both simulations and an analysis of the NHANES accelerometry data. In that analysis, the two methods also lead to a more intuitive and interpretable causal relationship between the physical activity distribution and body fat percentage.
{"title":"Learning causal effect of physical activity distribution: an application of functional treatment effect estimation with unmeasured confounding.","authors":"Zhuoxin Long, Xiaoke Zhang","doi":"10.1080/02664763.2025.2474611","DOIUrl":"https://doi.org/10.1080/02664763.2025.2474611","url":null,"abstract":"<p><p>The National Health and Nutrition Examination Survey (NHANES) collects minute-level physical activity data by accelerometers as an important component of the survey to assess the health and nutritional status of adults and children in the US. In this paper, we analyze the NHANES accelerometry data to study the causal effect of physical activity distribution on body fat percentage, where the treatment is a function/distribution. In the presence of unmeasured confounding, we propose to integrate cross-fitting with two methods under the proximal causal inference framework to estimate the functional treatment effect. The two methods are shown practically appealing via both simulation and an NHANES accelerometry data analysis. In the analysis of the NHANES accelerometry data, the two methods also lead to a more intuitive and interpretable causal relationship between physical activity distribution and body fat percentage.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 14","pages":"2759-2776"},"PeriodicalIF":1.1,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12581750/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving the within-node estimation of survival trees while retaining interpretability
Pub Date: 2025-03-11 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2473535
Haolin Li, Yiyang Fan, Jianwen Cai
In statistical learning for survival data, survival trees are favored for their capacity to detect complex relationships beyond parametric and semiparametric models. Despite this, their prediction accuracy is often suboptimal. In this paper, we propose a new method based on super learning to improve the within-node estimation and overall survival prediction accuracy, while preserving the interpretability of the survival tree. Simulation studies reveal the proposed method's superior finite-sample performance compared to conventional approaches for within-node estimation in survival trees. Furthermore, we apply this method to analyze the North Central Cancer Treatment Group Lung Cancer Data, cardiovascular medical records from the Faisalabad Institute of Cardiology, and integrated genomic data on ovarian carcinoma from The Cancer Genome Atlas project.
{"title":"Improving the within-node estimation of survival trees while retaining interpretability.","authors":"Haolin Li, Yiyang Fan, Jianwen Cai","doi":"10.1080/02664763.2025.2473535","DOIUrl":"https://doi.org/10.1080/02664763.2025.2473535","url":null,"abstract":"<p><p>In statistical learning for survival data, survival trees are favored for their capacity to detect complex relationships beyond parametric and semiparametric models. Despite this, their prediction accuracy is often suboptimal. In this paper, we propose a new method based on super learning to improve the within-node estimation and overall survival prediction accuracy, while preserving the interpretability of the survival tree. Simulation studies reveal the proposed method's superior finite sample performance compared to conventional approaches for within-node estimation in survival trees. Furthermore, we apply this method to analyze the North Central Cancer Treatment Group Lung Cancer Data, cardiovascular medical records from the Faisalabad Institute of Cardiology, and the integrated genomic data of ovarian carcinoma with The Cancer Genome Atlas project.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2544-2558"},"PeriodicalIF":1.1,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490394/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating an executive summary of a time series: the tendency
Pub Date: 2025-03-10 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2475351
Caio Alves, Juan M Restrepo, Jorge M Ramirez
In this paper, we revisit the problem of decomposing a signal into a tendency and a residual. The tendency is an executive summary of a signal that encapsulates its notable characteristics while disregarding seemingly random, less interesting aspects. Building upon the Intrinsic Time Decomposition (ITD) and information-theoretical analysis, we introduce two alternative procedures for selecting the tendency from the ITD baselines. The first is based on the maximum extrema prominence, namely the maximum difference between extrema within each baseline. Specifically, this method selects the tendency as the baseline from which an ITD step would produce the largest decline of the maximum prominence. The second method uses the rotations from the ITD and selects the tendency as the last baseline for which the associated rotation is statistically stationary. We delve into a comparative analysis of the information content and interpretability of the tendencies obtained by our proposed methods and those obtained through conventional low-pass filtering schemes, particularly the Hodrick-Prescott (HP) filter. Our findings underscore a fundamental distinction in the nature and interpretability of these tendencies, highlighting their context-dependent utility with an emphasis on multi-scale signals. Through a series of real-world applications, we demonstrate the computational robustness and practical utility of our proposed tendencies, emphasizing their adaptability and relevance in diverse time series contexts.
{"title":"Estimating an executive summary of a time series: the tendency.","authors":"Caio Alves, Juan M Restrepo, Jorge M Ramirez","doi":"10.1080/02664763.2025.2475351","DOIUrl":"10.1080/02664763.2025.2475351","url":null,"abstract":"<p><p>In this paper, we revisit the problem of decomposing a signal into a tendency and a residual. The tendency describes an executive summary of a signal that encapsulates its notable characteristics while disregarding seemingly random, less interesting aspects. Building upon the Intrinsic Time Decomposition (ITD) and information-theoretical analysis, we introduce two alternative procedures for selecting the tendency from the ITD baselines. The first is based on the maximum extrema prominence, namely the maximum difference between extrema within each baseline. Specifically this method selects the tendency as the baseline from which an ITD step would produce the largest decline of the maximum prominence. The second method uses the rotations from the ITD and selects the tendency as the last baseline for which the associated rotation is statistically stationary. We delve into a comparative analysis of the information content and interpretability of the tendencies obtained by our proposed methods and those obtained through conventional low-pass filtering schemes, particularly the Hodrik-Prescott (HP) filter. Our findings underscore a fundamental distinction in the nature and interpretability of these tendencies, highlighting their context-dependent utility with emphasis in multi-scale signals. Through a series of real-world applications, we demonstrate the computational robustness and practical utility of our proposed tendencies, emphasizing their adaptability and relevance in diverse time series contexts.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2478-2494"},"PeriodicalIF":1.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490379/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}