
Statistical Science: Latest Articles

Robustness by Reweighting for Kernel Estimators: An Overview
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-11-01, DOI: 10.1214/20-sts816
K. De Brabanter, Joseph De Brabanter
Using least squares techniques, there is an awareness of the dangers posed by the occurrence of outliers present in the data. In general, outliers may totally spoil an ordinary least squares analysis. To cope with this problem, statistical techniques have been developed that are not so easily affected by outliers. These methods are called robust or resistant. In this overview paper we illustrate that robust solutions can be acquired by solving a reweighted least squares problem even though the initial solution is not robust. This overview paper relates classical results from robustness to the most recent advances of robustness in least squares kernel based regression, with an emphasis on theoretical results as well as practical examples. Software for iterative reweighting is also made freely available to the user.
Citations: 0
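The reweighting idea in the abstract above (a robust fit obtained from repeated weighted least squares, starting from a non-robust ordinary least squares fit) can be sketched on a toy problem. This is a minimal illustration, not the authors' software: it uses a straight-line model instead of a kernel estimator, and the Huber weight function, the tuning constant `c = 1.345`, the scale estimate, and all function names are assumptions chosen for the example.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def irls_huber(x, y, c=1.345, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares with Huber weights (hypothetical helper)."""
    # Start from the ordinary (non-robust) least squares solution.
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(n_iter):
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Simplified robust scale: median absolute residual, rescaled.
        s = median(abs(ri) for ri in r) / 0.6745 or 1e-12
        # Huber weights: 1 for small residuals, c*s/|r| for large ones.
        w = [1.0 if abs(ri) <= c * s else c * s / abs(ri) for ri in r]
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Toy data: y = 2 + 3x on x = 0..19, with one gross outlier at the last point.
x = [float(i) for i in range(20)]
y = [2.0 + 3.0 * xi for xi in x]
y[-1] += 60.0

ols = weighted_line_fit(x, y, [1.0] * len(x))
robust = irls_huber(x, y)
print("OLS  (intercept, slope):", ols)
print("IRLS (intercept, slope):", robust)
```

On this toy data the single outlier pulls the ordinary least squares slope away from 3, while the reweighted fit progressively discounts that point.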
A Conversation with Don Dawson
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-11-01, DOI: 10.1214/21-sts821
Bouchra R. Nasri, B. Rémillard, B. Szyszkowicz, Jean Vaillancourt
Donald Andrew Dawson (Don Dawson) was born in 1937. He received a bachelor’s degree in 1958 and a master’s degree in 1959 from McGill University and a Ph.D. in 1963 from M.I.T. under the supervision of Henry P. McKean, Jr. Following an appointment at McGill University as professor for 7 years, he joined Carleton University in 1970 where he remained for the rest of his career. Among his many contributions to the theory of stochastic processes, his work leading to the creation of the Dawson–Watanabe superprocess and the analysis of its remarkable properties in describing the evolution in space and time of populations, stand out as milestones of modern probability theory. His numerous papers span the whole gamut of contemporary hot areas, notably the study of stochastic evolution equations, measure-valued processes, McKean–Vlasov limits, hierarchical structures, super-Brownian motion, as well as branching, catalytic and historical processes. He has over 200 refereed publications and 8 monographs, with an impressive number of citations, more than 7000. He is elected Fellow of the Royal Society and of the Royal Society of Canada, as well as Gold medalist of the Statistical Society of Canada and elected Fellow of the Institute of Mathematical Statistics. We realized this interview to celebrate the outstanding contribution of Don Dawson to 50 years of Stochastics at Carleton University.
Citations: 1
Symmetrical and Non-symmetrical Variants of Three-Way Correspondence Analysis for Ordered Variables
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-11-01, DOI: 10.1214/20-sts814
Rosaria Lombardo, Eric J. Beh, P. Kroonenberg
In the framework of multi-way data analysis, this paper presents symmetrical and non-symmetrical variants of three-way correspondence analysis that are suitable when a three-way contingency table is constructed from ordinal variables. In particular, such variables may be modelled using general recurrence formulae to generate orthogonal polynomial vectors instead of singular vectors coming from one of the possible three-way extensions of the singular value decomposition. As we shall see, these polynomials, that until now have been used to decompose two-way contingency tables with ordered variables, also constitute an alternative orthogonal basis for modelling symmetrical, non-symmetrical associations and predictabilities in three-way contingency tables. Consequences with respect to modelling and graphing will be highlighted.
Citations: 2
Variational Inference for Cutting Feedback in Misspecified Models
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-08-25, DOI: 10.1214/23-sts886
Xue Yu, D. Nott, M. Smith
Bayesian analyses combine information represented by different terms in a joint Bayesian model. When one or more of the terms is misspecified, it can be helpful to restrict the use of information from suspect model components to modify posterior inference. This is called "cutting feedback", and both the specification and computation of the posterior for such "cut models" is challenging. In this paper, we define cut posterior distributions as solutions to constrained optimization problems, and propose optimization-based variational methods for their computation. These methods are faster than existing Markov chain Monte Carlo (MCMC) approaches for computing cut posterior distributions by an order of magnitude. It is also shown that variational methods allow for the evaluation of computationally intensive conflict checks that can be used to decide whether or not feedback should be cut. Our methods are illustrated in a number of simulated and real examples, including an application where recent methodological advances that combine variational inference and MCMC within the variational optimization are used.
Citations: 10
Seven Principles for Rapid-Response Data Science: Lessons Learned from Covid-19 Forecasting
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-08-19, DOI: 10.1214/22-sts855
Bin Yu, Chandan Singh
In this article, we take a step back to distill seven principles out of our experience in the spring of 2020, when our 12-person rapid-response team used skills of data science and beyond to help distribute Covid PPE. This process included tapping into domain knowledge of epidemiology and medical logistics chains, curating a relevant data repository, developing models for short-term county-level death forecasting in the US, and building a website for sharing visualization (an automated AI machine). The principles are described in the context of working with Response4Life, a then-new nonprofit organization, to illustrate their necessity. Many of these principles overlap with those in standard data-science teams, but an emphasis is put on dealing with problems that require rapid response, often resembling agile software development.
Citations: 3
Random Matrix Theory and Its Applications
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-08-01, DOI: 10.1142/9789814273121
A. Izenman
This article reviews the important ideas behind random matrix theory (RMT), which has become a major tool in a variety of disciplines, including mathematical physics, number theory, combinatorics and multivariate statistical analysis. Much of the theory involves ensembles of random matrices that are governed by some probability distribution. Examples include Gaussian ensembles and Wishart–Laguerre ensembles. Interest has centered on studying the spectrum of random matrices, especially the extreme eigenvalues, suitably normalized, for a single Wishart matrix and for two Wishart matrices, for finite and infinite sample sizes in the real and complex cases. The Tracy–Widom Laws for the probability distribution of a normalized largest eigenvalue of a random matrix have become very prominent in RMT. Limiting probability distributions of eigenvalues of a certain random matrix lead to Wigner’s Semicircle Law and Marčenko–Pastur’s Quarter-Circle Law. Several applications of these results in RMT are described in this article.
Citations: 2
Khinchin’s 1929 Paper on Von Mises’ Frequency Theory of Probability
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-08-01, DOI: 10.1214/20-sts798
L. Verburgt
In 1929, a few years prior to his colleague Kolmogorov’s Grundbegriffe, the leading Russian probabilist Khinchin published a paper in which he commented on the foundational ambitions of von Mises’ frequency theory of probability. This brief introduction provides background and context for the English translation of Khinchin’s historically revealing paper, published as an online supplement.
Citations: 0
Statistical Modeling for Practical Pooled Testing During the COVID-19 Pandemic
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-07-12, DOI: 10.1214/22-sts857
S. Comess, H. Wang, S. Holmes, Claire Donnat
Pooled testing offers an efficient solution to the unprecedented testing demands of the COVID-19 pandemic, although with potentially lower sensitivity and increased costs to implementation in some settings. Assessments of this trade-off typically assume pooled specimens are independent and identically distributed. Yet, in the context of COVID-19, these assumptions are often violated: testing done on networks (housemates, spouses, co-workers) captures correlated individuals, while infection risk varies substantially across time, place and individuals. Neglecting dependencies and heterogeneity may bias established optimality grids and induce a sub-optimal implementation of the procedure. As a lesson learned from this pandemic, this paper highlights the necessity of integrating field sampling information with statistical modeling to efficiently optimize pooled testing. Using real data, we show that (a) greater gains can be achieved at low logistical cost by exploiting natural correlations (non-independence) between samples -- allowing improvements in sensitivity and efficiency of up to 30% and 90% respectively; and (b) these gains are robust despite substantial heterogeneity across pools (non-identical). Our modeling results complement and extend the observations of Barak et al. (2021) who report an empirical sensitivity well beyond expectations. Finally, we provide an interactive tool for selecting an optimal pool size using contextual information.
Citations: 8
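The independence baseline that the pooled-testing abstract above argues against can be made concrete with a small calculation. The sketch below is not the paper's model or its interactive tool: it assumes classical two-stage pooling (test the pool once, then retest every member of a positive pool), a perfectly accurate test, and independent specimens with a common prevalence `p`. Under those assumptions the expected number of tests per person for a pool of size `k > 1` is `1/k + 1 - (1 - p)**k`. The function names and the `max_k` cap are illustrative.

```python
def expected_tests_per_person(p, k):
    """Expected tests per person for two-stage pooling with pool size k.

    Assumes independent specimens with prevalence p and an error-free test.
    """
    if k == 1:
        return 1.0  # no pooling: one test per person
    # One pooled test shared by k people, plus k retests if the pool is positive.
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_pool_size(p, max_k=50):
    """Pool size in 1..max_k minimising expected tests per person."""
    return min(range(1, max_k + 1), key=lambda k: expected_tests_per_person(p, k))


for prevalence in (0.01, 0.05, 0.5):
    k = best_pool_size(prevalence)
    print(prevalence, k, round(expected_tests_per_person(prevalence, k), 4))
```

Correlation within pools and heterogeneous risk, the paper's focus, change the probability that a pool tests positive, so an optimum computed this way should be read only as the i.i.d. reference point.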
Randomization-Based Test for Censored Outcomes: A New Look at the Logrank Test
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-07-06, DOI: 10.1214/22-sts851
Xinran Li, Dylan S. Small
Two-sample tests with censored outcomes are a classical topic in statistics with wide use even in cutting edge applications. There are at least two modes of inference used to justify two-sample tests. One is usual superpopulation inference assuming that units are independent and identically distributed (i.i.d.) samples from some superpopulation; the other is finite population inference that relies on the random assignments of units into different groups. When randomization is actually implemented, the latter has the advantage of avoiding distributional assumptions on the outcomes. In this paper, we focus on finite population inference for censored outcomes, which has been less explored in the literature. Moreover, we allow the censoring time to depend on treatment assignment, under which exact permutation inference is unachievable. We find that, surprisingly, the usual logrank test can also be justified by randomization. Specifically, under a Bernoulli randomized experiment with non-informative i.i.d. censoring, the logrank test is asymptotically valid for testing Fisher’s null hypothesis of no treatment effect on any unit. The asymptotic validity of the logrank test does not require any distributional assumption on the potential event times. We further extend the theory to the stratified logrank test, which is useful for randomized block designs and when censoring mechanisms vary across strata. In sum, the developed theory for the logrank test from finite population inference supplements its classical theory from usual superpopulation inference, and helps provide a broader justification for the logrank test.
Citations: 1
Stein’s Method Meets Computational Statistics: A Review of Some Recent Developments
IF 5.7, Zone 1 (Mathematics), Q1 STATISTICS & PROBABILITY, Pub Date: 2021-05-07, DOI: 10.1214/22-sts863
Andreas Anastasiou, A. Barp, F. Briol, B. Ebner, Robert E. Gaunt, Fatemeh Ghaderinezhad, Jackson Gorham, A. Gretton, Christophe Ley, Qiang Liu, Lester W. Mackey, C. Oates, G. Reinert, Yvik Swan
Stein's method compares probability distributions through the study of a class of linear operators called Stein operators. While mainly studied in probability and used to underpin theoretical statistics, Stein's method has led to significant advances in computational statistics in recent years. The goal of this survey is to bring together some of these recent developments and, in doing so, to stimulate further research into the successful field of Stein's method and statistics. The topics we discuss include tools to benchmark and compare sampling methods such as approximate Markov chain Monte Carlo, deterministic alternatives to sampling methods, control variate techniques, parameter estimation and goodness-of-fit testing.
Citations: 17