We analyze the convergence rates for a family of auto‐regressive Markov chains on Euclidean space depending on a parameter n, where at each step a randomly chosen coordinate is replaced by a noisy damped weighted average of the others. Interest in the model comes from its connection with a Bayesian scheme introduced by de Finetti for the analysis of partially exchangeable data. Our main result shows that, as n grows large (corresponding to vanishing noise), a cutoff phenomenon occurs.
{"title":"Cutoff for a class of auto‐regressive models with vanishing additive noise","authors":"Balázs Gerencsér, Andrea Ottolini","doi":"10.1111/sjos.12748","DOIUrl":"https://doi.org/10.1111/sjos.12748","url":null,"abstract":"We analyze the convergence rates for a family of auto‐regressive Markov chains on Euclidean space depending on a parameter , where at each step a randomly chosen coordinate is replaced by a noisy damped weighted average of the others. The interest in the model comes from the connection with a certain Bayesian scheme introduced by de Finetti in the analysis of partially exchangeable data. Our main result shows that, when n gets large (corresponding to a vanishing noise), a cutoff phenomenon occurs.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"10 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142186968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
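As a rough illustration of the dynamics described in the abstract above, the following Python sketch simulates a chain in which a uniformly chosen coordinate is overwritten by a damped average of the other coordinates plus Gaussian noise. The equal weights, the `damping` and `noise_sd` parameters, and the function names `step` and `simulate` are assumptions made for illustration only; the paper's exact parametrization is not given in this listing.

```python
import random


def step(x, damping, noise_sd, rng):
    """One update: replace a random coordinate by a noisy damped average of the others."""
    d = len(x)
    i = rng.randrange(d)  # coordinate to refresh, chosen uniformly
    others = [x[j] for j in range(d) if j != i]
    avg = sum(others) / len(others)  # equal weights (an assumption)
    y = list(x)
    y[i] = damping * avg + rng.gauss(0.0, noise_sd)
    return y


def simulate(x0, n_steps, damping=0.9, noise_sd=0.1, seed=0):
    """Run the chain for n_steps updates from the initial state x0."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_steps):
        x = step(x, damping, noise_sd, rng)
    return x


if __name__ == "__main__":
    # Start far from equilibrium and watch the state contract toward the origin.
    print(simulate([10.0, -5.0, 3.0, 8.0], n_steps=200))
```

Running many independent copies of `simulate` for increasing `n_steps` and comparing their empirical distribution with that of a long run is one informal way to see how quickly such a chain forgets its starting point.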
Tomasz Ca̧kała, Błażej Miasojedow, Wojciech Rejchel, Maryia Shpak
Continuous time Bayesian networks (CTBNs) are a class of stochastic processes that can be used to model complex phenomena; for instance, they can describe interactions in living processes, social science models, or medicine. The literature on this topic usually focuses on the case where the dependence structure of the system is known and the task is to determine the conditional transition intensities (the parameters of the network). In this paper, we study the structure learning problem, which is more challenging and on which existing research is limited. The approach we propose is based on a penalized likelihood method. We prove that, under mild regularity conditions, our algorithm identifies the dependence structure of the graph with high probability. We also investigate the properties of the procedure in numerical studies.
{"title":"Structure learning for continuous time Bayesian networks via penalized likelihood","authors":"Tomasz Ca̧kała, Błażej Miasojedow, Wojciech Rejchel, Maryia Shpak","doi":"10.1111/sjos.12747","DOIUrl":"https://doi.org/10.1111/sjos.12747","url":null,"abstract":"Continuous time Bayesian networks (CTBNs) represent a class of stochastic processes, which can be used to model complex phenomena, for instance, they can describe interactions occurring in living processes, social science models or medicine. The literature on this topic is usually focused on a case when a dependence structure of a system is known and we are to determine conditional transition intensities (parameters of a network). In the paper, we study a structure learning problem, which is a more challenging task and the existing research on this topic is limited. The approach, which we propose, is based on a penalized likelihood method. We prove that our algorithm, under mild regularity conditions, recognizes a dependence structure of a graph with high probability. We also investigate properties of the procedure in numerical studies.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"50 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142186967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network or matrix reconstruction is a general problem that arises when the row and column sums of a matrix are given and the matrix entries need to be predicted conditional on this aggregated information. In this paper, we show that the predictions obtained from the iterative proportional fitting procedure (IPFP) or, equivalently, maximum entropy (ME) can be obtained by restricted maximum likelihood estimation relying on augmented Lagrangian optimization. Based on this equivalence, we extend the framework of network reconstruction, conditional on row and column sums, toward regression, which allows the inclusion of exogenous covariates and bootstrap‐based uncertainty quantification. More specifically, the mean of the regression model leads to the observed row and column margins. To exemplify the approach, we provide a simulation study and investigate interbank lending data provided by the Bank for International Settlements. This dataset provides full knowledge of the real network and is, therefore, suitable for evaluating the predictions of our approach. It is shown that the inclusion of exogenous information leads to superior predictions in terms of and errors.
{"title":"Regression‐based network‐flow and inner‐matrix reconstruction","authors":"Michael Lebacher, Göran Kauermann","doi":"10.1111/sjos.12742","DOIUrl":"https://doi.org/10.1111/sjos.12742","url":null,"abstract":"Network or matrix reconstruction is a general problem that occurs if the row‐ and column sums of a matrix are given, and the matrix entries need to be predicted conditional on the aggregated information. In this paper, we show that the predictions obtained from the iterative proportional fitting procedure (IPFP) or equivalently maximum entropy (ME) can be obtained by restricted maximum likelihood estimation relying on augmented Lagrangian optimization. Based on this equivalence, we extend the framework of network reconstruction, conditional on row and column sums, toward regression, which allows the inclusion of exogenous covariates and bootstrap‐based uncertainty quantification. More specifically, the mean of the regression model leads to the observed row and column margins. To exemplify the approach, we provide a simulation study and investigate interbank lending data, provided by the Bank for International Settlement. This dataset provides full knowledge of the real network and is, therefore, suitable to evaluate the predictions of our approach. It is shown that the inclusion of exogenous information leads to superior predictions in terms of and errors.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"46 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141780377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
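The iterative proportional fitting procedure mentioned in the abstract above can be sketched in a few lines of Python. This is the textbook form of IPFP (alternately rescaling rows and columns of a seed matrix until the margins match), not the regression-based estimator proposed in the paper; the function name `ipfp`, the seed matrix, and the stopping rule are assumptions chosen for illustration.

```python
def ipfp(seed, row_sums, col_sums, n_iter=100, tol=1e-10):
    """Rescale a nonnegative seed matrix so that its margins match the targets."""
    m = [row[:] for row in seed]  # work on a copy
    for _ in range(n_iter):
        # Row step: scale each row to its target sum.
        for i, row in enumerate(m):
            s = sum(row)
            if s > 0:
                factor = row_sums[i] / s
                m[i] = [v * factor for v in row]
        # Column step: scale each column to its target sum.
        for j in range(len(col_sums)):
            s = sum(row[j] for row in m)
            if s > 0:
                factor = col_sums[j] / s
                for row in m:
                    row[j] *= factor
        # Columns are exact after the column step, so check the rows.
        err = max(abs(sum(row) - r) for row, r in zip(m, row_sums))
        if err < tol:
            break
    return m


if __name__ == "__main__":
    # Zeros in the seed stay zero, e.g. a zero diagonal excludes self-loops.
    seed = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
    for row in ipfp(seed, [5.0, 3.0, 4.0], [4.0, 4.0, 4.0]):
        print([round(v, 4) for v in row])
```

The row and column targets must have the same total for the margins to be matched simultaneously. With a uniform seed the procedure returns the independence table, whose entries are the product of the row and column targets divided by the grand total.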
Louise Alamichel, Daria Bystrova, Julyan Arbel, Guillaume Kon Kam King
Bayesian nonparametric mixture models are common for modeling complex data. While these models are well‐suited for density estimation, recent results proved posterior inconsistency of the number of clusters when the true number of components is finite, for the Dirichlet process and Pitman–Yor process mixture models. We extend these results to additional Bayesian nonparametric priors such as Gibbs‐type processes and finite‐dimensional representations thereof. The latter include the Dirichlet multinomial process, the recently proposed Pitman–Yor, and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a postprocessing algorithm introduced for the Dirichlet process can be extended to more general models and provides a consistent method to estimate the number of components.
{"title":"Bayesian mixture models (in)consistency for the number of clusters","authors":"Louise Alamichel, Daria Bystrova, Julyan Arbel, Guillaume Kon Kam King","doi":"10.1111/sjos.12739","DOIUrl":"https://doi.org/10.1111/sjos.12739","url":null,"abstract":"Bayesian nonparametric mixture models are common for modeling complex data. While these models are well‐suited for density estimation, recent results proved posterior inconsistency of the number of clusters when the true number of components is finite, for the Dirichlet process and Pitman–Yor process mixture models. We extend these results to additional Bayesian nonparametric priors such as Gibbs‐type processes and finite‐dimensional representations thereof. The latter include the Dirichlet multinomial process, the recently proposed Pitman–Yor, and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a postprocessing algorithm introduced for the Dirichlet process can be extended to more general models and provides a consistent method to estimate the number of components.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"7 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141780375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A special section honoring Nils Lid Hjort","authors":"Ørnulf Borgan, Ingrid K. Glad","doi":"10.1111/sjos.12745","DOIUrl":"https://doi.org/10.1111/sjos.12745","url":null,"abstract":"","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"126 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141780376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Georgios Aristotelous, Theodore Kypraios, Philip D. O'Neill
This paper addresses the problem of assessing the homogeneity of the disease transmission process in stochastic epidemic models in populations that are partitioned into social groups. We develop a classical hypothesis test for completed epidemics which assesses whether or not there is significant within‐group transmission during an outbreak. The test is based on time‐ordered group labels of individuals. The null hypothesis is that of homogeneity of disease transmission among individuals, a hypothesis under which the discrete random vector of group labels has a known sampling distribution that is independent of any model parameters. The test exhibits excellent performance when applied to various scenarios of simulated data and is also illustrated using two real‐life epidemic data sets. We develop some asymptotic theory, including a central limit theorem. The test is very appealing in practice, being computationally cheap and straightforward to implement, as well as being applicable to a wide range of real‐life outbreak settings and to related problems in other fields.
{"title":"A classical hypothesis test for assessing the homogeneity of disease transmission in stochastic epidemic models","authors":"Georgios Aristotelous, Theodore Kypraios, Philip D. O'Neill","doi":"10.1111/sjos.12743","DOIUrl":"https://doi.org/10.1111/sjos.12743","url":null,"abstract":"This paper addresses the problem of assessing the homogeneity of the disease transmission process in stochastic epidemic models in populations that are partitioned into social groups. We develop a classical hypothesis test for completed epidemics which assesses whether or not there is significant within‐group transmission during an outbreak. The test is based on time‐ordered group labels of individuals. The null hypothesis is that of homogeneity of disease transmission among individuals, a hypothesis under which the discrete random vector of groups labels has a known sampling distribution that is independent of any model parameters. The test exhibits excellent performance when applied to various scenarios of simulated data and is also illustrated using two real‐life epidemic data sets. We develop some asymptotic theory including a central limit theorem. The test is practically very appealing, being computationally cheap and straightforward to implement, as well as being applicable to a wide range of real‐life outbreak settings and to related problems in other fields.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"61 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the problem of the nonparametric estimation for the density of the stationary distribution of a ‐dimensional stochastic differential equation . From the continuous observation of the sampling path on , we study the estimation of as goes to infinity. For , we characterize the minimax rate for the ‐risk in pointwise estimation over a class of anisotropic Hölder functions with regularity . For , our finding is that, having ordered the smoothness such that , the minimax rate depends on whether or . In the first case, this rate is , and in the second case, it is , where is an explicit exponent dependent on the dimension and , the harmonic mean of smoothness over the directions after excluding and , the smallest ones. We also demonstrate that kernel‐based estimators achieve the optimal minimax rate. Furthermore, we propose an adaptive procedure for both integrated and pointwise risk. In the two‐dimensional case, we show that kernel density estimators achieve the rate , which is optimal in the minimax sense. Finally, we illustrate the validity of our theoretical findings with numerical results.
{"title":"Minimax rate of estimation for invariant densities associated to continuous stochastic differential equations over anisotropic Hölder classes","authors":"Chiara Amorino, Arnaud Gloter","doi":"10.1111/sjos.12735","DOIUrl":"https://doi.org/10.1111/sjos.12735","url":null,"abstract":"We study the problem of the nonparametric estimation for the density of the stationary distribution of a ‐dimensional stochastic differential equation . From the continuous observation of the sampling path on , we study the estimation of as goes to infinity. For , we characterize the minimax rate for the ‐risk in pointwise estimation over a class of anisotropic Hölder functions with regularity . For , our finding is that, having ordered the smoothness such that , the minimax rate depends on whether or . In the first case, this rate is , and in the second case, it is , where is an explicit exponent dependent on the dimension and , the harmonic mean of smoothness over the directions after excluding and , the smallest ones. We also demonstrate that kernel‐based estimators achieve the optimal minimax rate. Furthermore, we propose an adaptive procedure for both integrated and pointwise risk. In the two‐dimensional case, we show that kernel density estimators achieve the rate , which is optimal in the minimax sense. Finally we illustrate the validity of our theoretical findings by proposing numerical results.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"32 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141614551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past decade, the win ratio has gained popularity for analyzing prioritized multiple event data in clinical cohort studies, in particular within cardiovascular research. The literature on estimation of the win ratio using censored event data is, however, sparse. The methods that have been suggested either adjust insufficiently for the censoring or assume that the win and loss probabilities are proportional over time. In practice, the assumption of proportional win and loss probabilities will often not be satisfied. In this paper, we present estimates for the win ratio, and for the win and loss probabilities, under independent right‐censoring and derive the asymptotic distribution of the estimates. The proposed win ratio estimate does not require the assumption of proportional win and loss probabilities. The small sample properties of the proposed method are studied in a simulation study showing that the variance formula is accurate even for small samples. The method is applied to two data sets.
{"title":"Estimation of win, loss probabilities, and win ratio based on right‐censored event data","authors":"Erik T. Parner, Morten Overgaard","doi":"10.1111/sjos.12734","DOIUrl":"https://doi.org/10.1111/sjos.12734","url":null,"abstract":"The win ratio has in the recent decade gained popularity for analyzing prioritized multiple event data in clinical cohort studies, in particular within cardiovascular research. The literature on estimation of the win ratio using censored event data is however sparse. The methods that have been suggested have either an insufficient adjustment of the censoring or by assuming the the win and loss probabilities are proportional over time. The assumption of proportional win and loss probabilities will often in practice not be satisfied. In this paper, we present estimates for the win ratio, and win and loss probabilities, under independent right‐censoring and derive the asymptotic distribution of the estimates. The proposed win ratio estimate does not require the assumption of proportional win and loss probabilities. The small sample properties of the proposed method are studied in a simulation study showing that the variance formula is accurate even for small samples. The method is applied on two data sets.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"19 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141548567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
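To make the quantities named in the abstract above concrete, here is a minimal Python sketch of the win probability, loss probability, and win ratio for fully observed (uncensored) data. It does not implement the censoring-adjusted estimator of the paper; the data layout (tuples of outcomes in priority order, larger values better) and the function names `compare` and `win_ratio` are assumptions made for illustration.

```python
def compare(a, b):
    """Compare two subjects on prioritized outcomes (larger is better).

    Returns 1 if a wins, -1 if a loses, and 0 if all outcomes are tied.
    """
    for u, v in zip(a, b):
        if u > v:
            return 1
        if u < v:
            return -1
    return 0  # tied on every outcome


def win_ratio(treated, control):
    """Return (win probability, loss probability, win ratio) over all pairs."""
    wins = losses = 0
    for a in treated:
        for b in control:
            result = compare(a, b)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
    n_pairs = len(treated) * len(control)
    ratio = wins / losses if losses else float("inf")
    return wins / n_pairs, losses / n_pairs, ratio


if __name__ == "__main__":
    # Each tuple: (time to first-priority event, time to second-priority event).
    treated = [(5, 2), (3, 1)]
    control = [(4, 0), (3, 0)]
    print(win_ratio(treated, control))
```

Each treated subject is compared with each control subject on the highest-priority outcome first, and lower-priority outcomes are consulted only to break ties. With right-censored follow-up some pairs cannot be decided this way, which is the situation the paper's estimator is designed to handle.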
{"title":"Commentary on “Pitfalls of amateur regression: The Dutch New Herring controversies”","authors":"Jan C. Van Ours, Ben Vollaard","doi":"10.1111/sjos.12741","DOIUrl":"https://doi.org/10.1111/sjos.12741","url":null,"abstract":"","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"37 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141548566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspherical kernel density estimators (KDE), which use a parametric distribution as a guide, are studied in this paper. The main benefit is that these estimators improve the bias of nonguided kernel density estimators when the parametric guiding distribution is not too far from the true density, while preserving the variance. When using a von Mises‐Fisher density as guide, the proposal performs as well as the classical KDE, even when the guiding model is incorrect and far from the true distribution. This benefit is specific to the hyperspherical setting, given its compact support, and is in contrast to similar methods for real-valued data. Moreover, we deal with the important issue of data‐driven selection of the smoothing parameter. Simulations and real data examples illustrate the finite‐sample performance of the proposed method, also in comparison with other recently proposed estimation methods.
{"title":"Nonparametric estimation of densities on the hypersphere using a parametric guide","authors":"María Alonso‐Pena, Gerda Claeskens, Irène Gijbels","doi":"10.1111/sjos.12737","DOIUrl":"https://doi.org/10.1111/sjos.12737","url":null,"abstract":"Hyperspherical kernel density estimators (KDE), which use a parametric distribution as a guide, are studied in this paper. The main benefit is that these estimators improve the bias of nonguided kernel density estimators when the parametric guiding distribution is not too far from the true density, while preserving the variance. When using a von Mises‐Fisher density as guide, the proposal performs as well as the classical KDE, even when the guiding model is incorrect, and far from the true distribution. This benefit is particular for the hyperspherical setting given its compact support, and is in contrast to similar methods for real valued data. Moreover, we deal with the important issue of data‐driven selection of the smoothing parameter. Simulations and real data examples illustrate the finite‐sample performance of the proposed method, also in comparison with other recently proposed estimation methods.","PeriodicalId":49567,"journal":{"name":"Scandinavian Journal of Statistics","volume":"24 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141548569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}