Pub Date : 2025-12-19 DOI: 10.1007/s10182-025-00552-3
Philipp Otto, Osman Doğan, Süleyman Taşpınar
This short paper explores the estimation of a dynamic spatiotemporal autoregressive conditional heteroscedasticity (ARCH) model. The log-volatility term in this model can depend on (i) the spatial lag of the log-squared outcome variable, (ii) the time-lag of the log-squared outcome variable, (iii) the spatiotemporal lag of the log-squared outcome variable, (iv) exogenous variables, and (v) the unobserved heterogeneity across regions and time, i.e., the regional and time fixed effects. We examine the small- and large-sample properties of two quasi-maximum likelihood estimators and a generalised method of moments estimator for this model. We first summarize the theoretical properties of these estimators and then compare their finite sample properties through Monte Carlo simulations.
Title: A note on dynamic spatiotemporal ARCH models: small- and large-sample results. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 4, pp. 811-828. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00552-3.pdf
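The five components listed in the abstract above can be collected into a single log-volatility equation. The display below is only a sketch in assumed notation: the symbols $h_{it}$, $w_{ij}$, $\rho$, $\gamma$, $\delta$, $\beta$, $\mu_i$ and $\alpha_t$ do not appear in the abstract, and the authors' exact specification may differ.

```latex
\begin{aligned}
y_{it} &= h_{it}^{1/2}\,\varepsilon_{it}, \\
\log h_{it} &= \underbrace{\rho \sum_{j=1}^{n} w_{ij}\,\log y_{jt}^{2}}_{\text{(i) spatial lag}}
  + \underbrace{\gamma\,\log y_{i,t-1}^{2}}_{\text{(ii) time lag}}
  + \underbrace{\delta \sum_{j=1}^{n} w_{ij}\,\log y_{j,t-1}^{2}}_{\text{(iii) spatiotemporal lag}}
  + \underbrace{x_{it}^{\top}\beta}_{\text{(iv) exogenous}}
  + \underbrace{\mu_{i} + \alpha_{t}}_{\text{(v) fixed effects}} .
\end{aligned}
```

Here $w_{ij}$ is assumed to be an entry of a known spatial weights matrix with zero diagonal, $\mu_i$ a regional effect and $\alpha_t$ a time effect.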
Pub Date : 2025-12-12 DOI: 10.1007/s10182-025-00551-4
Philipp Otto, Janine Illian
Title: Advances in spatial econometrics and geostatistics: methods, theory, and applications. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 4, pp. 633-636.
Pub Date : 2025-11-21 DOI: 10.1007/s10182-025-00547-0
Pierpaolo D’Urso, Livia De Giovanni, Lorenzo Federico, Vincenzina Vitale
Clustering categorical data presents unique challenges that traditional techniques do not adequately address. This paper proposes an extension of the fuzzy C-modes algorithm. By incorporating a noise cluster and integrating spatial contiguity relationships among units, the algorithm’s robustness is significantly enhanced. Performance evaluations using synthetic data demonstrate the efficacy of the proposed algorithm in handling both global and local outliers. Furthermore, the paper discusses the application of the algorithm to real-world data on sustainable urban mobility in the Italian provincial capitals during 2021, highlighting its practical relevance and potential impact in real-world scenarios.
Title: Fuzzy C-modes clustering with spatial regularization and noise cluster. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 4, pp. 771-809. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00547-0.pdf
Pub Date : 2025-10-06 DOI: 10.1007/s10182-025-00544-3
Wisdom Aselisewine, Suvra Pal
We introduce a novel two-component framework for analyzing mixed case interval censored (MCIC) data featuring a cured subgroup. In such data, the time-to-event is known only within certain intervals determined by multiple random examination time points. Moreover, a portion of the subjects will never experience the event. The first component of our model focuses on estimating the likelihood of being cured (incidence), departing from the conventional generalized linear model to adopt a more adaptable support vector machine (SVM) approach capable of accommodating complex or non-linear covariate effects. The second component addresses the survival distribution of the uncured individuals (latency) and employs a Cox proportional hazards structure to maintain the straightforward interpretation of covariate effects. We develop an expectation maximization algorithm, incorporating the Platt scaling method, to estimate the probability of being cured. Our simulation study demonstrates that our model outperforms both logit-based and spline-based models in capturing complex classification boundaries, leading to more accurate estimates of cured/uncured probabilities and enhanced predictive accuracy for cure. We emphasize that improving the estimation accuracy for incidence in turn improves the estimation results for latency. Finally, we illustrate the efficacy of our methodology by applying it to NASA's Hypobaric Decompression Sickness Data.
Title: Machine Learning Approach for Analyzing Mixed Case Interval Censored Data with a Cured Subgroup. Journal: Asta-Advances in Statistical Analysis. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12514071/pdf/
Pub Date : 2025-08-28 DOI: 10.1007/s10182-025-00535-4
Paola Vicard, Paola Maria Vittoria Rancoita, Federica Cugnata, Alberto Briganti, Fulvia Mecatti, Clelia Di Serio, Pier Luigi Conti
This paper proposes a new statistical approach for assessing treatment effect using Bayesian Networks (BNs). The goal is to draw causal inferences from observational data with a binary outcome and discrete covariates. Here, BNs are used to estimate the propensity score, which enables flexible modeling and ensures maximum likelihood properties. When the propensity score is estimated by BNs, two point estimators based on inverse probability weighting are considered, Hájek and Horvitz–Thompson, and their main distributional properties are derived for constructing confidence intervals and testing hypotheses about the absence of the treatment effect. Empirical evidence is presented to show the good behavior of the proposed methodology through a simulation study mimicking the characteristics of a real dataset of prostate cancer patients from Milan San Raffaele Hospital.
Title: Testing for causal effect for binary data when propensity scores are estimated through Bayesian Networks. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 3, pp. 483-508. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00535-4.pdf
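The two inverse-probability-weighted point estimators named in the abstract above can be illustrated with a short sketch. It assumes the propensity scores are already available (in the paper they come from a Bayesian Network, which is not reproduced here); the function name `ipw_treatment_effect` and its arguments are hypothetical.

```python
from typing import Sequence, Tuple


def ipw_treatment_effect(
    y: Sequence[float], t: Sequence[int], e: Sequence[float]
) -> Tuple[float, float]:
    """Return (Horvitz-Thompson, Hajek) estimates of E[Y(1)] - E[Y(0)].

    y: observed outcomes (binary in the paper's setting)
    t: treatment indicators (1 = treated, 0 = control)
    e: estimated propensity scores P(T = 1 | X), assumed given
    """
    n = len(y)
    if n == 0 or len(t) != n or len(e) != n:
        raise ValueError("y, t and e must be non-empty and of equal length")
    if any(not 0.0 < p < 1.0 for p in e):
        raise ValueError("propensity scores must lie strictly in (0, 1)")

    # Inverse probability weights for the treated and control arms.
    w1 = [ti / pi for ti, pi in zip(t, e)]
    w0 = [(1 - ti) / (1 - pi) for ti, pi in zip(t, e)]
    if sum(w1) == 0 or sum(w0) == 0:
        raise ValueError("both treatment arms must be represented")

    s1 = sum(w * yi for w, yi in zip(w1, y))
    s0 = sum(w * yi for w, yi in zip(w0, y))

    # Horvitz-Thompson normalises by the sample size n.
    horvitz_thompson = s1 / n - s0 / n
    # Hajek normalises by the sum of the weights in each arm.
    hajek = s1 / sum(w1) - s0 / sum(w0)
    return horvitz_thompson, hajek


ht, hajek = ipw_treatment_effect(
    y=[1, 0, 1, 0], t=[1, 0, 1, 0], e=[0.5, 0.5, 0.25, 0.25]
)
```

In this toy input the Horvitz–Thompson estimate is 1.5 while the Hájek estimate is 1.0: normalising by the realised weights keeps each arm's mean inside the range of the outcome, which dividing by n does not guarantee.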
Pub Date : 2025-08-28 DOI: 10.1007/s10182-025-00533-6
Ambra Macis
Performance measurement is of paramount importance in sports analytics. A great variety of data analysis methods has been used for this purpose, yet these proposals almost never resort to survival analysis techniques, although time-to-event data are well suited to the problem. This work aims to identify the main achievements of a National Basketball Association player that affect the time it takes him to exceed a given threshold of points. In order to identify nonlinear effects and possible interactions among the predictors, the analysis is carried out with machine learning methods, specifically survival trees and random survival forests.
Title: Basketball players performance measurement with algorithmic survival data analysis. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 3, pp. 529-555. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00533-6.pdf
Pub Date : 2025-08-13 DOI: 10.1007/s10182-025-00538-1
Luisa Bisaglia, Massimiliano Caporin, Matteo Grigoletto
We discuss the estimation and forecasting of long-memory models for count data time series. We first demonstrate by Monte Carlo simulations that the Whittle estimator is the most appropriate for recovering the memory degree of a count data time series. We then introduce the possibility of forecasting count data by exploiting the infinite autoregressive representation of the model. We complete our analysis with an empirical example in which we verify the predictability of the number of price jumps.
Title: Forecasting time series by long-memory models for count data with an application to price jumps. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 3, pp. 417-441. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00538-1.pdf
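Forecasting through an infinite autoregressive representation can be illustrated on the simplest long-memory case. The sketch below assumes a generic fractionally integrated process with memory parameter `d`, truncates the AR(∞) expansion of (1 - B)^d at the sample length, and takes `d` as given (for example from a Whittle estimate). It is not the count-data model of the paper, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def frac_diff_weights(d: float, n_lags: int) -> List[float]:
    """Coefficients pi_0..pi_n of (1 - B)^d, using pi_j = pi_{j-1} * (j - 1 - d) / j."""
    weights = [1.0]
    for j in range(1, n_lags + 1):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def one_step_forecast(
    series: Sequence[float], d: float, mean: Optional[float] = None
) -> float:
    """Truncated AR(infinity) one-step-ahead forecast.

    X_hat_{T+1} = mu - sum_{j=1}^{T} pi_j * (X_{T+1-j} - mu)
    """
    if not series:
        raise ValueError("series must be non-empty")
    mu = sum(series) / len(series) if mean is None else mean
    pi = frac_diff_weights(d, len(series))
    # series[-j] is X_{T+1-j}: the j-th most recent observation.
    return mu - sum(pi[j] * (series[-j] - mu) for j in range(1, len(series) + 1))


forecast = one_step_forecast([3, 5], d=0.4, mean=4.0)
```

With `d = 0` every lag weight vanishes and the forecast collapses to the mean. The forecast is real-valued, so with count data it reads as a conditional-mean prediction rather than a predicted count.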
Pub Date : 2025-06-18 DOI: 10.1007/s10182-025-00531-8
Nihan Acar-Denizli, Pedro Delicado
Wearable devices and sensors have recently become a popular way to collect data, especially in the health sciences. The use of sensors allows patients to be monitored over a period of time with a high observation frequency. Due to the continuous-time structure of the data, novel statistical methods are recommended for the analysis of sensor data. One of the popular approaches in the analysis of wearable sensor data is functional data analysis. The main objective of this paper is to review functional data analysis methods applied to wearable device data according to the type of sensor. In addition, we introduce several freely available software packages and open databases of wearable device data to facilitate access to sensor data in different fields.
Title: Functional data analysis for wearable sensor data: a systematic review. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 3, pp. 591-631. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00531-8.pdf
Pub Date : 2025-06-10 DOI: 10.1007/s10182-025-00529-2
Christoph Muehlmann, Claudia Cappello, Sandra De Iaco, Klaus Nordhausen
This paper aims to introduce a novel approach to spatial blind source separation (SBSS) that addresses the limitations of existing methods. Current SBSS techniques rely on the joint diagonalization of multiple local covariance functions, all of which assume isotropy. To overcome this constraint, anisotropic local covariance matrices that relax the isotropy assumption are proposed. A simulation study and an application on real-world data demonstrate the performance improvement obtained by incorporating these anisotropic covariance matrices into the SBSS framework and highlight the potential of this new approach for more accurate and flexible source separation in spatial data analysis.
Title: Anisotropic local covariance matrices for spatial blind source separation. Journal: Asta-Advances in Statistical Analysis, vol. 109, no. 4, pp. 753-770. PDF: https://link.springer.com/content/pdf/10.1007/s10182-025-00529-2.pdf
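To make the isotropy point concrete, the sketch below computes a local covariance matrix of the form M = (1/n) Σ_i Σ_j f(s_i − s_j) z(s_i) z(s_j)ᵀ with a ring kernel, and optionally restricts the kernel to pairs whose separation vector lies near a chosen axis. The angular-sector restriction is one plausible way to relax isotropy and is an assumption here; the abstract does not give the paper's kernel. The function name, parameters and default tolerance are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def local_covariance(
    coords: Sequence[Tuple[float, float]],
    values: Sequence[Sequence[float]],
    r_in: float,
    r_out: float,
    angle: Optional[float] = None,
    tol: float = math.pi / 8,
) -> List[List[float]]:
    """Local covariance matrix with a ring kernel r_in < ||s_i - s_j|| <= r_out.

    values[i] is the (assumed centred/whitened) p-variate observation at coords[i].
    If `angle` is given, only pairs whose separation direction is within `tol`
    radians of that axis are kept (directions are compared modulo pi).
    """
    n, p = len(coords), len(values[0])
    m = [[0.0] * p for _ in range(p)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[j][0] - coords[i][0]
            dy = coords[j][1] - coords[i][1]
            if not r_in < math.hypot(dx, dy) <= r_out:
                continue
            if angle is not None:
                # Axial direction: h and -h count as the same direction.
                diff = abs(math.atan2(dy, dx) % math.pi - angle % math.pi)
                if min(diff, math.pi - diff) > tol:
                    continue
            for a in range(p):
                for b in range(p):
                    m[a][b] += values[i][a] * values[j][b]
    return [[v / n for v in row] for row in m]


pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
z = [[1.0], [2.0], [3.0]]
iso = local_covariance(pts, z, r_in=0.0, r_out=1.0)
east_west = local_covariance(pts, z, r_in=0.0, r_out=1.0, angle=0.0, tol=0.1)
```

Because each unordered pair enters in both orders, the matrix stays symmetric under the directional restriction. In the toy example the isotropic kernel uses both neighbours of the origin, while the east-west kernel keeps only the horizontal pair.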