{"title":"Correction The probabilities of large deviations for a certain class of statistics associated with multinomial distributions","authors":"S. Mirakhmedov","doi":"10.1051/ps/2021002","DOIUrl":"https://doi.org/10.1051/ps/2021002","url":null,"abstract":"","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"35 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78960411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We construct unbiased estimators for the distribution of the number of points inside random stopping sets based on a Poisson point process. Our approach is based on moment identities for stopping sets, showing that the random count of points inside the complement S̅ of a stopping set S has a Poisson distribution conditionally on S. The proofs do not require the use of set-indexed martingales, and our estimators have a lower variance than standard sampling. Numerical simulations are presented for examples such as the convex hull and the Voronoi flower of a Poisson point process, and their complements.
{"title":"Cardinality estimation for random stopping sets based on Poisson point processes","authors":"Nicolas Privault","doi":"10.1051/PS/2021004","DOIUrl":"https://doi.org/10.1051/PS/2021004","url":null,"abstract":"We construct unbiased estimators for the distribution of the number of points inside random stopping sets based on a Poisson point process. Our approach is based on moment identities for stopping sets, showing that the random count of points inside the complement S̅ of a stopping set S has a Poisson distribution conditionally to S. The proofs do not require the use of set-indexed martingales, and our estimators have a lower variance when compared to standard sampling. Numerical simulations are presented for examples such as the convex hull and the Voronoi flower of a Poisson point process, and their complements.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"52 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91271802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Expectation-Maximization (EM) algorithm is a widely used method for computing maximum likelihood estimates in models involving latent variables. When the Expectation step cannot be computed easily, one can use stochastic versions of EM such as the Stochastic Approximation EM. This algorithm, however, has the drawback of requiring the joint likelihood to belong to the curved exponential family. To overcome this problem, [16] introduced a rewriting of the model which “exponentializes” it by considering the parameter as an additional latent variable following a Normal distribution centered on the newly defined parameters and with fixed variance. The likelihood of this new exponentialized model now belongs to the curved exponential family. Although this approach is often used, there is no guarantee that the estimated mean is close to the maximum likelihood estimate of the initial model. In this paper, we quantify the error made in this estimation when considering the exponentialized model instead of the initial one. By verifying those results on an example, we see that a trade-off must be made between the speed of convergence and the tolerated error. Finally, we propose a new algorithm that reduces the bias and gives a better estimate of the parameter in a reasonable computation time.
{"title":"On the curved exponential family in the Stochastic Approximation Expectation Maximization Algorithm","authors":"Vianney Debavelaere, S. Allassonnière","doi":"10.1051/ps/2021015","DOIUrl":"https://doi.org/10.1051/ps/2021015","url":null,"abstract":"The Expectation-Maximization Algorithm (EM) is a widely used method allowing to estimate the maximum likelihood of models involving latent variables. When the Expectation step cannot be computed easily, one can use stochastic versions of the EM such as the Stochastic Approximation EM. This algorithm, however, has the drawback to require the joint likelihood to belong to the curved exponential family. To overcome this problem, [16] introduced a rewriting of the model which “exponentializes” it by considering the parameter as an additional latent variable following a Normal distribution centered on the newly defined parameters and with fixed variance. The likelihood of this new exponentialized model now belongs to the curved exponential family. Although often used, there is no guarantee that the estimated mean is close to the maximum likelihood estimate of the initial model. In this paper, we quantify the error done in this estimation while considering the exponentialized model instead of the initial one. By verifying those results on an example, we see that a trade-off must be made between the speed of convergence and the tolerated error. 
Finally, we propose a new algorithm allowing a better estimation of the parameter in a reasonable computation time to reduce the bias.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"41 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85400171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
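The "exponentialization" described in the abstract above can be written out as a short worked equation. This is only a sketch in assumed notation, not the notation of the paper or of [16]: y denotes the observations, z the latent variables, θ the original parameter, θ̄ the newly defined parameter, and σ² the fixed variance.

```latex
% Assumed notation: q(y, z ; \theta) is the complete likelihood of the initial model.
% Exponentialized model: \theta becomes latent, with \theta \sim \mathcal{N}(\bar\theta, \sigma^2 I).
\begin{align*}
  \log q_\sigma(y, z, \theta ; \bar\theta)
    &= \log q(y, z \mid \theta)
       - \frac{\lVert \theta - \bar\theta \rVert^2}{2\sigma^2} + C \\
    &= \underbrace{\log q(y, z \mid \theta)
       - \frac{\lVert \theta \rVert^2}{2\sigma^2} + C}_{\text{does not depend on } \bar\theta}
       + \left\langle \frac{\bar\theta}{\sigma^2},\, \theta \right\rangle
       - \frac{\lVert \bar\theta \rVert^2}{2\sigma^2}
\end{align*}
```

Seen as a function of θ̄, the second line has exponential-family form with sufficient statistic θ, whatever the form of the initial likelihood. The price is that the algorithm now estimates θ̄ rather than θ, which is the discrepancy the abstract sets out to quantify.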
In their fundamental paper on cubic variance functions (VFs), Letac and Mora (The Annals of Statistics, 1990) presented a systematic, rigorous and comprehensive study of natural exponential families (NEFs) on the real line, their characterization through their VFs and mean value parameterization. They presented a section that for some reason has gone unnoticed. This section deals with the construction of VFs associated with NEFs of counting distributions on the set of nonnegative integers and allows one to find the corresponding generating measures. As exponential dispersion models (EDMs) are based on NEFs, we introduce in this paper two new classes of EDMs based on their results. For these classes, which are associated with simple VFs, we derive their mean value parameterization and their associated generating measures. We also prove that they have some desirable properties. Both classes are shown to be overdispersed and zero inflated in ascending order, making them competitive with the statistical models currently in use in both statistical and actuarial modeling. To the best of our knowledge, the classes of counting distributions we present in this paper have not been introduced or discussed before in the literature. To show that our classes can serve as competitive statistical models for those in use (e.g., Poisson, negative binomial), we include a numerical example with real data. In this example, we compare the performance of our classes with relevant competing models.
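In their fundamental paper on cubic variance functions (VFs), Letac and Mora (The Annals of Statistics, 1990) gave a systematic, rigorous and comprehensive study of natural exponential families (NEFs) on the real line and characterized them through their VFs and mean value parameterization. They presented a section that, for some reason, has gone unnoticed. That section deals with the construction of VFs associated with NEFs of counting distributions on the set of nonnegative integers and allows one to find the corresponding generating measures. Since EDMs are based on NEFs, this paper introduces two new classes of EDMs based on their results. For these classes, which are associated with simple VFs, we derive their mean value parameterization and their associated generating measures. We also prove that they have some desirable properties. Both classes are shown to be overdispersed and zero inflated in ascending order, making them competitors to the statistical models used in statistical and actuarial modeling. To the best of our knowledge, the classes of counting distributions presented in this paper have not been introduced or discussed in the literature before. To show that our classes can serve as competitive statistical models for those in use (e.g., Poisson, negative binomial), we include a numerical example with real data. In this example, we compare the performance of our classes with relevant competing models.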
{"title":"New exponential dispersion models for count data: the ABM and LM classes","authors":"S. Bar-Lev, Ad Ridder","doi":"10.1051/PS/2021001","DOIUrl":"https://doi.org/10.1051/PS/2021001","url":null,"abstract":"In their fundamental paper on cubic variance functions (VFs), Letac and Mora (The Annals of Statistics, 1990) presented a systematic, rigorous and comprehensive study of natural exponential families (NEFs) on the real line, their characterization through their VFs and mean value parameterization. They presented a section that for some reason has been left unnoticed. This section deals with the construction of VFs associated with NEFs of counting distributions on the set of nonnegative integers and allows to find the corresponding generating measures. As EDMs are based on NEFs, we introduce in this paper two new classes of EDMs based on their results. For these classes, which are associated with simple VFs, we derive their mean value parameterization and their associated generating measures. We also prove that they have some desirable properties. Both classes are shown to be overdispersed and zero inflated in ascending order, making them as competitive statistical models for those in use in both, statistical and actuarial modeling. To our best knowledge, the classes of counting distributions we present in this paper, have not been introduced or discussed before in the literature. To show that our classes can serve as competitive statistical models for those in use (e.g., Poisson, Negative binomial), we include a numerical example of real data. 
In this example, we compare the performance of our classes with relevant competitive models.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"47 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83775653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We prove a Law of Large Numbers (LLN) for the domination number of class cover catch digraphs (CCCDs) generated by random points in two (or higher) dimensions. DeVinney and Wierman (2002) proved the Strong Law of Large Numbers (SLLN) for the uniform distribution in one dimension, and Wierman and Xiang (2008) extended the SLLN to the case of general distributions in one dimension. In this article, using subadditive processes, we prove an SLLN result for the domination number generated by Poisson points in ℝ². From this we first obtain a Weak Law of Large Numbers (WLLN) for the domination number generated by random points in [0, 1]² drawn from the uniform distribution, and then extend this result to the case of bounded continuous distributions. We also extend the results to higher dimensions. The domination number of CCCDs and related digraphs has applications in statistical pattern classification and spatial data analysis.
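We prove a Law of Large Numbers (LLN) for the domination number of class cover catch digraphs (CCCDs) generated by random points in two (or higher) dimensions. DeVinney and Wierman (2002) proved the Strong Law of Large Numbers (SLLN) for the uniform distribution in one dimension, and Wierman and Xiang (2008) extended the SLLN to general distributions in one dimension. In this article, using subadditive processes, we prove an SLLN result for the domination number generated by Poisson points. From this we first obtain a Weak Law of Large Numbers for the domination number generated by random points in [0, 1]² under the uniform distribution, and then extend this result to the case of bounded continuous distributions. We also extend the results to higher dimensions. The domination number of CCCDs and related digraphs has wide applications in statistical pattern classification and spatial data analysis.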
{"title":"Law of large numbers for a two-dimensional class cover problem","authors":"E. Ceyhan, J. Wierman, Pengfei Xiang","doi":"10.1051/ps/2021013","DOIUrl":"https://doi.org/10.1051/ps/2021013","url":null,"abstract":"We prove a Law of Large Numbers (LLN) for the domination number of class cover catch digraphs (CCCD) generated by random points in two (or higher) dimensions. DeVinney and Wierman (2002) proved the Strong Law of Large Numbers (SLLN) for the uniform distribution in one dimension, and Wierman and Xiang (2008) extended the SLLN to the case of general distributions in one dimension. In this article, using subadditive processes, we prove a SLLN result for the domination number generated by Poisson points in ℝ2. From this we obtain a Weak Law of Large Numbers (WLLN) for the domination number generated by random points in [0, 1]2 from uniform distribution first, and then extend these result to the case of bounded continuous distributions. We also extend the results to higher dimensions. The domination number of CCCDs and related digraphs have applications in statistical pattern classification and spatial data analysis.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"17 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86152847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The original problem of group testing consists in identifying the defective items in a collection by applying tests to groups of items, where each test detects the presence of at least one defective element in the group. The aim is then to identify all defective items of the collection with as few tests as possible. This problem is relevant in several fields, among them biology and computer science. In the present article we consider that a test applied to a group of items returns a load, measuring how defective the most defective item of the group is. In this setting, we propose a simple non-adaptive algorithm allowing the detection of all defective items of the collection. Items are put on an n×n grid and pools are organised as lines, columns and diagonals of this grid. This method improves on classical group testing algorithms that use only the binary response of the test.
{"title":"A tractable non-adaptative group testing method for non-binary measurements","authors":"Émilien Joly, Bastien Mallein","doi":"10.1051/ps/2022007","DOIUrl":"https://doi.org/10.1051/ps/2022007","url":null,"abstract":"The original problem of group testing consists in the identification of defective items in a collection, by applying tests on groups of items that detect the presence of at least one defective element in the group. The aim is then to identify all defective items of the collection with as few tests as possible. This problem is relevant in several fields, among which biology and computer sciences. In the present article we consider that the tests applied to groups of items returns a load , measuring how defective the most defective item of the group is. In this setting, we propose a simple non-adaptative algorithm allowing the detection of all defective items of the collection. Items are put on an n×n grid and pools are organised as lines, columns and diagonals of this grid. This method improves on classical group testing algorithms using only the binary response of the test.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"11 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2020-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77252661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
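The grid pooling described in the group testing abstract above can be illustrated with a small sketch. It is not the authors' algorithm: the wrapped (modular) diagonals, the function names, and the decoding rule (bound each item's load by the smallest result among the pools that contain it) are assumptions chosen for illustration. The only thing taken from the abstract is that items sit on an n×n grid, that pools are lines, columns and diagonals, and that a pooled test returns the load of the most defective item.

```python
def build_pools(n):
    """Rows, columns and wrapped diagonals of an n x n grid (3n pools in total)."""
    pools = []
    for k in range(n):
        pools.append([(k, j) for j in range(n)])            # row k
        pools.append([(i, k) for i in range(n)])            # column k
        pools.append([(i, (i + k) % n) for i in range(n)])  # diagonal k, wrapped (assumption)
    return pools


def run_tests(loads, pools):
    """A pooled test returns the largest load found in the pool."""
    return [max(loads[i][j] for i, j in pool) for pool in pools]


def decode(n, pools, results):
    """Upper-bound each item's load by the smallest result among its pools."""
    bound = [[float("inf")] * n for _ in range(n)]
    for pool, result in zip(pools, results):
        for i, j in pool:
            bound[i][j] = min(bound[i][j], result)
    return bound


# Toy data (hypothetical): 25 items, two defective ones with loads 7 and 2.
n = 5
loads = [[0] * n for _ in range(n)]
loads[1][3] = 7
loads[4][0] = 2

pools = build_pools(n)
results = run_tests(loads, pools)
estimate = decode(n, pools, results)
```

In this toy case the 15 pooled tests recover the loads of all 25 items exactly. With more defective items the bound can overestimate some loads, so the sketch should be read as an upper-bound decoder rather than a guaranteed identification procedure.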
Cédric Rommel, J. Frédéric Bonnans, Baptiste Gregorutti, P. Martinon
In this paper, we tackle the problem of quantifying the closeness of a newly observed curve to a given sample of random functions, supposed to have been sampled from the same distribution. We define a probabilistic criterion for such a purpose, based on the marginal density functions of an underlying random process. For practical applications, a class of estimators based on the aggregation of multivariate density estimators is introduced and proved to be consistent. We illustrate the effectiveness of our estimators, as well as the practical usefulness of the proposed criterion, by applying our method to a dataset of real aircraft trajectories.
{"title":"Quantifying the closeness to a set of random curves via the mean marginal likelihood","authors":"Cédric Rommel, J. Frédéric Bonnans, Baptiste Gregorutti, P. Martinon","doi":"10.1051/PS/2020028","DOIUrl":"https://doi.org/10.1051/PS/2020028","url":null,"abstract":"In this paper, we tackle the problem of quantifying the closeness of a newly observed curve to a given sample of random functions, supposed to have been sampled from the same distribution. We define a probabilistic criterion for such a purpose, based on the marginal density functions of an underlying random process. For practical applications, a class of estimators based on the aggregation of multivariate density estimators is introduced and proved to be consistent. We illustrate the effectiveness of our estimators, as well as the practical usefulness of the proposed criterion, by applying our method to a dataset of real aircraft trajectories.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"10 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75190469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we establish a probabilistic representation as well as some integration by parts formulae for the marginal law, at a given time maturity, of some stochastic volatility models with unbounded drift. Relying on a perturbation technique for Markov semigroups, our formulae are based on a simple Markov chain evolving on a random time grid, for which we develop a tailor-made Malliavin calculus. Among other applications, an unbiased Monte Carlo path simulation method stems from our formulae. It can be used to compute numerically, with optimal complexity, option prices as well as their sensitivities with respect to the initial values (the Greeks in finance, namely the Delta and the Vega) for a large class of non-smooth European payoffs. Numerical results are presented to illustrate the efficiency of the method.
{"title":"Probabilistic representation of integration by parts formulae for some stochastic volatility models with unbounded drift","authors":"Junchao Chen, N. Frikha, Houzhi Li","doi":"10.1051/ps/2022008","DOIUrl":"https://doi.org/10.1051/ps/2022008","url":null,"abstract":"Abstract. In this paper, we establish a probabilistic representation as well as some integration by parts formulae for the marginal law at a given time maturity of some stochastic volatility model with unbounded drift. Relying on a perturbation technique for Markov semigroups, our formulae are based on a simple Markov chain evolving on a random time grid for which we develop a tailor-made Malliavin calculus. Among other applications, an unbiased Monte Carlo path simulation method stems from our formulas so that it can be used in order to numerically compute with optimal complexity option prices as well as their sensitivities with respect to the initial values or Greeks in finance, namely the Delta and Vega , for a large class of non-smooth European payoff. Numerical results are proposed to illustrate the efficiency of the method.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"50 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2020-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85949562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We establish the convergences (with respect to the simulation time $t$; the number of particles $N$; the timestep $\gamma$) of a Moran/Fleming-Viot type particle scheme toward the quasi-stationary distribution of a diffusion on the $d$-dimensional torus, killed at a smooth rate. Under these conditions, quantitative bounds are obtained that, for each parameter ($t\rightarrow\infty$, $N\rightarrow\infty$ or $\gamma\rightarrow 0$), are independent of the two others.
{"title":"Convergence of a particle approximation for the quasi-stationary distribution of a diffusion process: uniform estimates in a compact soft case.","authors":"Lucas Journel, Pierre Monmarché","doi":"10.1051/ps/2021017","DOIUrl":"https://doi.org/10.1051/ps/2021017","url":null,"abstract":"We establish the convergences (with respect to the simulation time $t$; the number of particles $N$; the timestep $\\gamma$) of a Moran/Fleming-Viot type particle scheme toward the quasi-stationary distribution of a diffusion on the $d$-dimensional torus, killed at a smooth rate. In these conditions, quantitative bounds are obtained that, for each parameter ($t\\rightarrow \\infty$, $N\\rightarrow \\infty$ or $\\gamma\\rightarrow 0$) are independent from the two others.","PeriodicalId":51249,"journal":{"name":"Esaim-Probability and Statistics","volume":"7 1","pages":""},"PeriodicalIF":0.4,"publicationDate":"2020-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87481242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
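To make the three parameters of the preceding abstract concrete (simulation time t, number of particles N, timestep γ), here is a minimal sketch of a Fleming-Viot type particle scheme on the one-dimensional torus [0, 1). Everything specific in it is an assumption made for illustration and is not taken from the paper: the driftless diffusion dX = √2 dB, the killing rate 1 + cos(2πx), the Euler discretization, and the function names.

```python
import math
import random


def killing_rate(x):
    """Hypothetical smooth killing rate on the torus [0, 1)."""
    return 1.0 + math.cos(2.0 * math.pi * x)


def wrap(x):
    """Map a real number onto the torus [0, 1), guarding against rounding up to 1.0."""
    y = x % 1.0
    return 0.0 if y >= 1.0 else y


def fleming_viot(n_particles=200, gamma=0.01, t_final=5.0, seed=0):
    """Simulate N particles up to time t_final with timestep gamma."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_particles)]
    n_steps = int(round(t_final / gamma))
    step_std = math.sqrt(2.0 * gamma)
    for _ in range(n_steps):
        # Euler step of dX = sqrt(2) dB, wrapped onto the torus.
        xs = [wrap(x + step_std * rng.gauss(0.0, 1.0)) for x in xs]
        for i in range(n_particles):
            # Particle i is killed during the step with probability 1 - exp(-gamma * rate).
            if rng.random() < 1.0 - math.exp(-gamma * killing_rate(xs[i])):
                # A killed particle is resurrected at the position of another particle
                # chosen uniformly at random.
                j = rng.randrange(n_particles - 1)
                if j >= i:
                    j += 1
                xs[i] = xs[j]
    return xs


particles = fleming_viot(n_particles=50, gamma=0.01, t_final=1.0, seed=1)
```

The empirical distribution of `particles` is the scheme's approximation of the quasi-stationary distribution; the abstract's result concerns how the error of such an approximation behaves as t and N grow and γ shrinks.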