
Statistical Science: Latest Publications

Methods to Compute Prediction Intervals: A Review and New Results
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-05 · DOI: 10.1214/21-sts842
Qinglong Tian, D. Nordman, W. Meeker
This paper reviews two main types of prediction interval methods under a parametric framework. First, we describe methods based on an (approximate) pivotal quantity. Examples include the plug-in, pivotal, and calibration methods. Then we describe methods based on a predictive distribution (sometimes derived based on the likelihood). Examples include Bayesian, fiducial, and direct-bootstrap methods. Several examples involving continuous distributions along with simulation studies to evaluate coverage probability properties are provided. We provide specific connections among different prediction interval methods for the (log-)location-scale family of distributions. This paper also discusses general prediction interval methods for discrete data, using the binomial and Poisson distributions as examples. We also overview methods for dependent data, with application to time series, spatial data, and Markov random fields, for example.
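The distinction between the plug-in and pivotal methods is easy to see in a small simulation. The sketch below is an illustration of those general ideas rather than code from the paper: assuming a normal sample, it compares the naive plug-in interval with the exact interval based on the pivotal quantity (Y − X̄)/(S·sqrt(1 + 1/n)), and estimates their coverage probabilities by Monte Carlo, the kind of coverage study the review describes.

```python
# A minimal sketch (not from the paper): plug-in vs. pivotal prediction
# intervals for one future observation from a normal sample, with a small
# Monte Carlo study of their coverage probabilities.
import numpy as np
from scipy import stats

def plug_in_pi(x, alpha=0.05):
    """Plug-in interval: treats the estimated mean and sd as the true values."""
    z = stats.norm.ppf(1 - alpha / 2)
    m, s = x.mean(), x.std(ddof=1)
    return m - z * s, m + z * s

def pivotal_pi(x, alpha=0.05):
    """Exact interval from the pivotal quantity (Y - Xbar) / (S * sqrt(1 + 1/n))."""
    n = len(x)
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)
    m, s = x.mean(), x.std(ddof=1)
    half = t * s * np.sqrt(1 + 1 / n)
    return m - half, m + half

rng = np.random.default_rng(0)
n, reps = 10, 20000
cover = {"plug-in": 0, "pivotal": 0}
for _ in range(reps):
    x = rng.normal(size=n)   # observed sample
    y = rng.normal()         # future observation to be predicted
    for name, method in (("plug-in", plug_in_pi), ("pivotal", pivotal_pi)):
        lo, hi = method(x)
        cover[name] += (lo <= y <= hi)

for name, c in cover.items():
    print(f"{name}: estimated coverage {c / reps:.3f} (nominal 0.95)")
```

For small n the plug-in interval under-covers because it ignores parameter-estimation uncertainty, while the pivotal interval attains the nominal level; calibration methods adjust the plug-in level to close exactly this gap.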
{"title":"Methods to Compute Prediction Intervals: A Review and New Results","authors":"Qinglong Tian, D. Nordman, W. Meeker","doi":"10.1214/21-sts842","DOIUrl":"https://doi.org/10.1214/21-sts842","url":null,"abstract":"This paper reviews two main types of prediction interval methods under a parametric framework. First, we describe methods based on an (approximate) pivotal quantity. Examples include the plug-in, pivotal, and calibration methods. Then we describe methods based on a predictive distribution (sometimes derived based on the likelihood). Examples include Bayesian, fiducial, and direct-bootstrap methods. Several examples involving continuous distributions along with simulation studies to evaluate coverage probability properties are provided. We provide specific connections among different prediction interval methods for the (log-)location-scale family of distributions. This paper also discusses general prediction interval methods for discrete data, using the binomial and Poisson distributions as examples. We also overview methods for dependent data, with application to time series, spatial data, and Markov random fields, for example.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46757725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
A Look at Robustness and Stability of $\ell_{1}$- versus $\ell_{0}$-Regularization: Discussion of Papers by Bertsimas et al. and Hastie et al.
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/20-sts809
Yuansi Chen, Armeen Taeb, P. Bühlmann
{"title":"A Look at Robustness and Stability of $ell_{1}$-versus $ell_{0}$-Regularization: Discussion of Papers by Bertsimas et al. and Hastie et al.","authors":"Yuansi Chen, Armeen Taeb, P. Bühlmann","doi":"10.1214/20-sts809","DOIUrl":"https://doi.org/10.1214/20-sts809","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"614-622"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45899156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Rejoinder: Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/20-sts733rej
T. Hastie, R. Tibshirani, R. Tibshirani
{"title":"Rejoinder: Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons","authors":"T. Hastie, R. Tibshirani, R. Tibshirani","doi":"10.1214/20-sts733rej","DOIUrl":"https://doi.org/10.1214/20-sts733rej","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"579-592"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44684797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Modern Variable Selection in Action: Comment on the Papers by HTT and BPV
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/20-sts808
E. George
Let me begin by congratulating the authors of these two papers, hereafter HTT and BPV, for their superb contributions to the comparisons of methods for variable selection problems in high dimensional regression. The methods considered are truly some of today’s leading contenders for coping with the size and complexity of big data problems of so much current importance. Not surprisingly, there is no clear winner here because the terrain of comparisons is so vast and complex, and no single method can dominate across all situations. The considered setups vary greatly in terms of the number of observations n, the number of predictors p, the number and relative sizes of the underlying nonzero regression coefficients, predictor correlation structures and signal-to-noise ratios (SNRs). And even these only scratch the surface of the infinite possibilities. Further, there is the additional issue as to which performance measure is most important. Is the goal of an analysis exact variable selection or prediction or both? And what about computational speed and scalability? All these considerations would naturally depend on the practical application at hand. The methods compared by HTT and BPV have been unleashed by extraordinary developments in computational speed, and so it is tempting to distinguish them primarily by their novel implementation algorithms. In particular, the recent integer optimization related algorithms for variable selection differ in fundamental ways from the now widely adopted coordinate ascent algorithms for the lasso related methods. Undoubtedly, the impressive improvements in computational speed unleashed by these algorithms are critical for the feasibility of practical applications. However, the more fundamental story behind the performance differences has to do with the differences between the criteria that their algorithms are seeking to optimize. In an important sense, they are being guided by different solutions to the general variable selection problem. Focusing first on the paper of HTT, its main thrust appears to have been kindled by the computational breakthrough of Bertsimas, King and Mazumder (2016) (hereafter BKM), which had proposed a mixed integer optimization (MIO) approach to best subset selection.
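One instance of the comparisons described above can be set up in a few lines. The sketch below is my own illustration, not an excerpt from HTT or BPV: it simulates a sparse linear model at a chosen signal-to-noise ratio (all sizes are arbitrary choices) and compares lasso and forward stepwise selection on support recovery using scikit-learn.

```python
# Illustrative simulation (not from HTT or BPV): sparse linear model with a
# target SNR; compare the supports selected by lasso and forward stepwise.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(1)
n, p, k, snr = 100, 30, 5, 2.0      # illustrative sizes, not taken from the papers
beta = np.zeros(p)
beta[:k] = 1.0                      # k nonzero coefficients
X = rng.normal(size=(n, p))         # independent N(0, 1) predictors
signal_var = beta @ beta            # Var(X @ beta) under this design
sigma = np.sqrt(signal_var / snr)   # noise sd chosen to hit the target SNR
y = X @ beta + rng.normal(scale=sigma, size=n)

lasso = LassoCV(cv=5).fit(X, y)
lasso_support = np.flatnonzero(lasso.coef_ != 0)

stepwise = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=k, direction="forward"
).fit(X, y)
stepwise_support = np.flatnonzero(stepwise.get_support())

print("true support:    ", np.arange(k))
print("lasso support:   ", lasso_support)
print("stepwise support:", stepwise_support)
```

Varying n, p, the SNR, the coefficient pattern and the predictor correlation in this template reproduces, in miniature, the terrain over which no single method dominates.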
{"title":"Modern Variable Selection in Action: Comment on the Papers by HTT and BPV","authors":"E. George","doi":"10.1214/20-sts808","DOIUrl":"https://doi.org/10.1214/20-sts808","url":null,"abstract":"Let me begin by congratulating the authors of these two papers, hereafter HTT and BPV, for their superb contributions to the comparisons of methods for variable selection problems in high dimensional regression. The methods considered are truly some of today’s leading contenders for coping with the size and complexity of big data problems of so much current importance. Not surprisingly, there is no clear winner here because the terrain of comparisons is so vast and complex, and no single method can dominate across all situations. The considered setups vary greatly in terms of the number of observations n, the number of predictors p, the number and relative sizes of the underlying nonzero regression coefficients, predictor correlation structures and signal-to-noise ratios (SNRs). And even these only scratch the surface of the infinite possibilities. Further, there is the additional issue as to which performance measure is most important. Is the goal of an analysis exact variable selection or prediction or both? And what about computational speed and scalability? All these considerations would naturally depend on the practical application at hand. The methods compared by HTT and BPV have been unleashed by extraordinary developments in computational speed, and so it is tempting to distinguish them primarily by their novel implementation algorithms. In particular, the recent integer optimization related algorithms for variable selection differ in fundamental ways from the now widely adopted coordinate ascent algorithms for the lasso related methods. Undoubtedly, the impressive improvements in computational speed unleashed by these algorithms are critical for the feasibility of practical applications. However, the more fundamental story behind the performance differences has to do with the differences between the criteria that their algorithms are seeking to optimize. In an important sense, they are being guided by different solutions to the general variable selection problem. Focusing first on the paper of HTT, its main thrust appears to have been kindled by the computational breakthrough of Bertsimas, King and Mazumder (2016) (hereafter BKM), which had proposed a mixed integer opti-","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"609-613"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45250262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Discussion of “Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons”
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/20-sts807
R. Mazumder
I warmly congratulate the authors Hastie, Tibshirani and Tibshirani (HTT); and Bertsimas, Pauphilet and Van Parys (BPV) for their excellent contributions and important perspectives on sparse regression. Due to space constraints, and my greater familiarity with the content and context of HTT (I have had numerous fruitful discussions with the authors regarding their work), I will focus my discussion on the HTT paper. HTT nicely articulate the relative merits of three canonical estimators in sparse regression: L0, L1 and (forward) stepwise selection. I am humbled that a premise of their work is an article I wrote with Bertsimas and King [4] (BKM). BKM showed that current Mixed Integer Optimization (MIO) algorithms allow us to compute best subsets solutions for problem instances (p ≈ 1000 features) much larger than a previous benchmark (software for best subsets in the R package leaps) that could only handle instances with p ≈ 30. HTT, by extending and refining the experiments performed by BKM, have helped clarify and deepen our understanding of L0, L1 and stepwise regression. They raise several intriguing questions that perhaps deserve further attention from the wider statistics and optimization communities. In this commentary, I will focus on some of the key points discussed in HTT, with a bias toward some of the recent work I have been involved in. There is a large and rich body of work in high-dimensional statistics and related optimization techniques that I will not be able to discuss within the limited scope of my commentary.
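For readers who want to see what the L0-constrained estimator is optimizing, the toy stand-in below enumerates every size-k subset and keeps the one with the smallest residual sum of squares. It is my own illustration, not BKM's MIO solver; exhaustive search is feasible only for very small p, which is exactly why an MIO formulation that scales to p ≈ 1000 was a breakthrough.

```python
# Toy best subset selection by exhaustive enumeration (illustrative only;
# not the MIO approach of BKM, which scales far beyond what enumeration can).
from itertools import combinations
import numpy as np

def best_subset(X, y, k):
    """Return the size-k set of columns minimizing the residual sum of squares."""
    best_rss, best_S = np.inf, None
    for S in combinations(range(X.shape[1]), k):
        Xs = X[:, list(S)]
        coef, resid, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = resid[0] if resid.size else np.sum((y - Xs @ coef) ** 2)
        if rss < best_rss:
            best_rss, best_S = rss, S
    return best_S, best_rss

rng = np.random.default_rng(2)
n, p, k = 50, 10, 3                       # tiny sizes so enumeration stays cheap
X = rng.normal(size=(n, p))
y = X[:, :k] @ np.ones(k) + rng.normal(size=n)

subset, rss = best_subset(X, y, k)
print("selected subset:", subset, " RSS:", round(rss, 2))
```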
{"title":"Discussion of “Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons”","authors":"R. Mazumder","doi":"10.1214/20-sts807","DOIUrl":"https://doi.org/10.1214/20-sts807","url":null,"abstract":"I warmly congratulate the authors Hastie, Tibshirani and Tibshirani (HTT); and Bertsimas, Pauphilet and Van Parys (BPV) for their excellent contributions and important perspectives on sparse regression. Due to space constraints, and my greater familiarity with the content and context of HTT (I have had numerous fruitful discussions with the authors regarding their work), I will focus my discussion on the HTT paper. HTT nicely articulate the relative merits of three canonical estimators in sparse regression: L0, L1 and (forward)stepwise selection. I am humbled that a premise of their work is an article I wrote with Bertsimas and King [4] (BKM). BKM showed that current Mixed Integer Optimization (MIO) algorithms allow us to compute best subsets solutions for problem instances (p ≈ 1000 features) much larger than a previous benchmark (software for best subsets in the R package leaps) that could only handle instances with p ≈ 30. HTT by extending and refining the experiments performed by BKM, have helped clarify and deepen our understanding of L0, L1 and stepwise regression. They raise several intriguing questions that perhaps deserve further attention from the wider statistics and optimization communities. In this commentary, I will focus on some of the key points discussed in HTT, with a bias toward some of the recent work I have been involved in. There is a large and rich body of work in high-dimensional statistics and related optimization techniques that I will not be able to discuss within the limited scope of my commentary.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"602-608"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47846338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
A Conversation with J. Stuart (Stu) Hunter
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/19-sts766
R. D. Veaux
{"title":"A Conversation with J. Stuart (Stu) Hunter","authors":"R. D. Veaux","doi":"10.1214/19-sts766","DOIUrl":"https://doi.org/10.1214/19-sts766","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"663-671"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43654798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Rejoinder: Sparse Regression: Scalable Algorithms and Empirical Performance
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-11-01 · DOI: 10.1214/20-sts701rej
D. Bertsimas, J. Pauphilet, Bart P. G. Van Parys
Citations: 3
Parameter Restrictions for the Sake of Identification: Is There Utility in Asserting That Perhaps a Restriction Holds?
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-09-25 · DOI: 10.1214/23-sts885
P. Gustafson
Statistical modeling can involve a tension between assumptions and statistical identification. The law of the observable data may not uniquely determine the value of a target parameter without invoking a key assumption, and, while plausible, this assumption may not be obviously true in the scientific context at hand. Moreover, there are many instances of key assumptions which are untestable, hence we cannot rely on the data to resolve the question of whether the target is legitimately identified. Working in the Bayesian paradigm, we consider the grey zone of situations where a key assumption, in the form of a parameter space restriction, is scientifically reasonable but not incontrovertible for the problem being tackled. Specifically, we investigate statistical properties that ensue if we structure a prior distribution to assert that “maybe” or “perhaps” the assumption holds. Technically this simply devolves to using a mixture prior distribution putting just some prior weight on the assumption, or one of several assumptions, holding. However, while the construct is straightforward, there is very little literature discussing situations where Bayesian model averaging is employed across a mix of fully identified and partially identified models.
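The “maybe the restriction holds” construction is easy to prototype. The sketch below is a toy example of my own, not from the paper: a binomial proportion theta with a mixture prior that puts weight w on the restriction theta ≤ 0.5 and weight 1 − w on an unrestricted uniform prior, with the posterior evaluated on a grid.

```python
# Toy illustration (not from the paper): a "perhaps the restriction holds"
# mixture prior for a binomial proportion theta, with the restriction theta <= 0.5.
import numpy as np
from scipy import stats

y, n = 9, 20        # hypothetical data: 9 successes in 20 trials
w = 0.5             # prior weight on "the restriction theta <= 0.5 holds"

grid = np.linspace(0.0005, 0.9995, 1000)
prior_restricted = np.where(grid <= 0.5, 2.0, 0.0)  # Uniform(0, 0.5) density
prior_free = np.ones_like(grid)                     # Uniform(0, 1) density
prior = w * prior_restricted + (1 - w) * prior_free

post = prior * stats.binom.pmf(y, n, grid)
post /= post.sum()                                  # normalize over the grid

post_mean = np.sum(grid * post)
post_prob_restriction = post[grid <= 0.5].sum()
print(f"posterior mean of theta: {post_mean:.3f}")
print(f"posterior probability that theta <= 0.5: {post_prob_restriction:.3f}")
```

In this toy example the restriction is testable, so the posterior weight on the restricted component updates with the data; the paper's interest is in the subtler partially identified case, where the same mixture-prior construction behaves differently.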
{"title":"Parameter Restrictions for the Sake of Identification: Is There Utility in Asserting That Perhaps a Restriction Holds?","authors":"P. Gustafson","doi":"10.1214/23-sts885","DOIUrl":"https://doi.org/10.1214/23-sts885","url":null,"abstract":"Statistical modeling can involve a tension between assumptions and statistical identification. The law of the observable data may not uniquely determine the value of a target parameter without invoking a key assumption, and, while plausible, this assumption may not be obviously true in the scientific context at hand. Moreover, there are many instances of key assumptions which are untestable, hence we cannot rely on the data to resolve the question of whether the target is legitimately identified. Working in the Bayesian paradigm, we consider the grey zone of situations where a key assumption, in the form of a parameter space restriction, is scientifically reasonable but not incontrovertible for the problem being tackled. Specifically, we investigate statistical properties that ensue if we structure a prior distribution to assert that `maybe' or `perhaps' the assumption holds. Technically this simply devolves to using a mixture prior distribution putting just some prior weight on the assumption, or one of several assumptions, holding. However, while the construct is straightforward, there is very little literature discussing situations where Bayesian model averaging is employed across a mix of fully identified and partially identified models.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48381504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Identification of Causal Effects Within Principal Strata Using Auxiliary Variables
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-08-06 · DOI: 10.1214/20-sts810
Zhichao Jiang, Peng Ding
In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome, in which the principal strata are defined by the joint potential values of the intermediate variable. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies leveraged auxiliary variables to improve the inference of principal causal effects. We establish a general theory for identification and estimation of the principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and more insights for model building in empirical research. In particular, we consider two commonly-used strategies for principal stratification problems: principal ignorability, and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. For these two strategies, we give non-parametric and semi-parametric identification results without modeling assumptions on the outcome. When the assumptions for neither strategies are plausible, we propose a large class of flexible parametric and semi-parametric models for identifying principal causal effects. Our theory not only ensures formal identification results of several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.
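As background for the identification problem the abstract describes, the small sketch below (my own illustration, and not the paper's auxiliary-variable results) shows the standard first step: with a binary treatment Z and a binary intermediate variable S, monotonicity alone pins down the proportions of the principal strata from the two observable margins, even though the causal effects within those strata still require further assumptions such as principal ignorability or conditional independence.

```python
# Illustrative calculation (not from the paper): principal-strata proportions
# under monotonicity S(1) >= S(0), using only the observable margins
# P(S = 1 | Z = 1) and P(S = 1 | Z = 0).
def strata_proportions(p_s1_given_z1, p_s1_given_z0):
    """Return (always, complier, never) stratum proportions under monotonicity."""
    always = p_s1_given_z0                       # S(0) = 1 forces S(1) = 1
    never = 1.0 - p_s1_given_z1                  # S(1) = 0 forces S(0) = 0
    complier = p_s1_given_z1 - p_s1_given_z0     # S(0) = 0 and S(1) = 1
    return always, complier, never

always, complier, never = strata_proportions(0.7, 0.4)   # hypothetical margins
print(f"always: {always:.2f}, complier: {complier:.2f}, never: {never:.2f}")
```

The stratum labels borrow the noncompliance terminology; with a general intermediate variable they correspond to the joint potential values (S(0), S(1)) = (1, 1), (0, 1) and (0, 0).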
{"title":"Identification of Causal Effects Within Principal Strata Using Auxiliary Variables","authors":"Zhichao Jiang, Peng Ding","doi":"10.1214/20-sts810","DOIUrl":"https://doi.org/10.1214/20-sts810","url":null,"abstract":"In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome, in which the principal strata are defined by the joint potential values of the intermediate variable. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies leveraged auxiliary variables to improve the inference of principal causal effects. We establish a general theory for identification and estimation of the principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and more insights for model building in empirical research. In particular, we consider two commonly-used strategies for principal stratification problems: principal ignorability, and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. For these two strategies, we give non-parametric and semi-parametric identification results without modeling assumptions on the outcome. When the assumptions for neither strategies are plausible, we propose a large class of flexible parametric and semi-parametric models for identifying principal causal effects. Our theory not only ensures formal identification results of several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48107960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Comment: Diagnostics and Kernel-based Extensions for Linear Mixed Effects Models with Endogenous Covariates
IF 5.7 · CAS Tier 1 (Mathematics) · Q1 STATISTICS & PROBABILITY · Pub Date: 2020-08-01 · DOI: 10.1214/20-sts782
Hunyong Cho, Joshua P. Zitovsky, Xinyi Li, Minxin Lu, K. Shah, John Sperger, Matthew C. B. Tsilimigras, M. Kosorok
We discuss “Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study” by Qian, Klasnja and Murphy. In this discussion, we study when the linear mixed effects models with endogenous covariates are feasible to use by providing examples and diagnostic tools as well as discussing potential extensions. This includes evaluating feasibility of partial likelihood-based inference, checking the conditional independence assumption, estimation of marginal effects, and kernel extensions of the model.
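For orientation, the sketch below fits the baseline model the discussion starts from: a random-intercept linear mixed model estimated with statsmodels on simulated longitudinal data. The simulated data, variable names and the exogenous covariate are my own assumptions; the endogeneity diagnostics and kernel extensions the discussants propose are not implemented here.

```python
# Minimal sketch (not the authors' diagnostics): a random-intercept linear
# mixed model fit with statsmodels on simulated longitudinal data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_obs = 30, 10
subject = np.repeat(np.arange(n_subj), n_obs)
b = rng.normal(scale=1.0, size=n_subj)               # subject-level random intercepts
x = rng.normal(size=n_subj * n_obs)                  # time-varying covariate (exogenous here)
y = 1.0 + 0.5 * x + b[subject] + rng.normal(scale=0.5, size=n_subj * n_obs)
df = pd.DataFrame({"y": y, "x": x, "subject": subject})

fit = smf.mixedlm("y ~ x", data=df, groups=df["subject"]).fit()
print(fit.summary())                                 # fixed effect on x should be near 0.5
```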
{"title":"Comment: Diagnostics and Kernel-based Extensions for Linear Mixed Effects Models with Endogenous Covariates","authors":"Hunyong Cho, Joshua P. Zitovsky, Xinyi Li, Minxin Lu, K. Shah, John Sperger, Matthew C. B. Tsilimigras, M. Kosorok","doi":"10.1214/20-sts782","DOIUrl":"https://doi.org/10.1214/20-sts782","url":null,"abstract":"We discuss “Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study” by Qian, Klasnja and Murphy. In this discussion, we study when the linear mixed effects models with endogenous covariates are feasible to use by providing examples and diagnostic tools as well as discussing potential extensions. This includes evaluating feasibility of partial likelihood-based inference, checking the conditional independence assumption, estimation of marginal effects, and kernel extensions of the model.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"396-399"},"PeriodicalIF":5.7,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47084522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0