
Biometrika: Latest Articles

Explicit solutions for the asymptotically-optimal bandwidth in cross-validation
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2024-02-08 | DOI: 10.1093/biomet/asae007
Karim M Abadir, Michel Lubrano
Summary We show that least squares cross-validation methods share a common structure which has an explicit asymptotic solution, when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student t(ν) kernel, the cross-validation criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple and noniterative thus leading to very fast computations, their integrated squared-error dominates traditional cross-validation implementations, they alleviate the notorious sample variability of cross-validation, and overcome its breakdown in the case of repeated observations. We illustrate our method with univariate and bivariate applications, of density estimation and nonparametric regressions, to a large dataset of Michigan State University academic wages and experience.
Citations: 0
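For orientation, the traditional least squares cross-validation that the abstract above compares against can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' explicit bandwidth formula: it assumes a univariate Gaussian kernel (the paper works with a Student t(ν) kernel) and a plain grid search, and the names `lscv_score` and `lscv_bandwidth` are hypothetical.

```python
import math
import random


def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def lscv_score(data, h):
    """Least squares cross-validation criterion for a Gaussian-kernel density estimate.

    score(h) = integral of fhat^2  -  (2 / n) * sum_i fhat_{-i}(X_i)
    For a Gaussian kernel the integral has the closed form
    (1 / n^2) * sum_{i,j} N(0, 2 h^2) density at X_i - X_j.
    """
    n = len(data)
    integral_term = 0.0
    leave_one_out_term = 0.0
    for i in range(n):
        for j in range(n):
            diff = data[i] - data[j]
            integral_term += normal_pdf(diff, h * math.sqrt(2.0))
            if i != j:
                leave_one_out_term += normal_pdf(diff, h)
    return integral_term / n ** 2 - 2.0 * leave_one_out_term / (n * (n - 1))


def lscv_bandwidth(data, grid):
    """Pick the grid value that minimises the criterion (iterative, unlike the paper)."""
    return min(grid, key=lambda h: lscv_score(data, h))


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 41)]  # candidate bandwidths 0.05, 0.10, ..., 2.00
h_cv = lscv_bandwidth(sample, grid)
print(h_cv)
```

In this sketch, a sample in which every value appears twice, for example `[0.0, 0.0, 1.0, 1.0, 2.0, 2.0]`, drives the score downward without bound as `h` shrinks toward zero, which is one way to see the repeated-observations breakdown the abstract refers to.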
On the failure of the bootstrap for Chatterjee's rank correlation
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2024-02-04 | DOI: 10.1093/biomet/asae004
Zhexiao Lin, Fang Han
Summary While researchers commonly use the bootstrap to quantify the uncertainty of an estimator, it has been noticed that the standard bootstrap, in general, does not work for Chatterjee's rank correlation. In this paper, we provide proof of this issue under an additional independence assumption, and complement our theory with simulation evidence for general settings. Chatterjee's rank correlation thus falls into a category of statistics that are asymptotically normal but bootstrap inconsistent. Valid inferential methods in this case are Chatterjee's original proposal for testing independence and Lin & Han (2022)'s analytic asymptotic variance estimator for more general purposes.
Citations: 0
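As a reminder of the statistic discussed in the abstract above, the sketch below computes Chatterjee's rank correlation in the no-ties case, using ξ_n = 1 − 3 Σ |r_{i+1} − r_i| / (n² − 1), where r_i is the rank of the response after sorting the pairs by the covariate. The function name `chatterjee_xi` and the example data are hypothetical, and the code computes the point estimate only; per the abstract, the standard bootstrap should not be relied on for its uncertainty.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n, assuming no ties in x or y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    # Reorder the y values so that they follow increasing x.
    by_x = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in by_x]
    # ranks[i] is the rank (1..n) of y_by_x[i] among all y values.
    ranks = [0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda i: y_by_x[i]), start=1):
        ranks[i] = rank
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


xs = [i / 10.0 for i in range(100)]
xi_monotone = chatterjee_xi(xs, [2.0 * v + 1.0 for v in xs])    # strictly increasing y
xi_parabola = chatterjee_xi(xs, [(v - 5.03) ** 2 for v in xs])  # non-monotone y
print(xi_monotone, xi_parabola)
```

For a strictly monotone relationship with n points the statistic equals 1 − 3/(n + 1), so it approaches 1 only as n grows. The second call illustrates that a deterministic but non-monotone relationship also produces a value well above zero.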
Asymptotically constant risk estimator of the time-average variance constant
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2024-02-03 | DOI: 10.1093/biomet/asae003
K W Chan, C Y Yau
Summary Estimation of the time-average variance constant is important for statistical analyses involving dependent data. This problem is difficult as it relies on a bandwidth parameter. Specifically, the optimal choices of the bandwidths of all existing estimators depend on the estimand itself and another unknown parameter which is very difficult to estimate. Thus, optimal variance estimation is unachievable. In this paper, we introduce a concept of converging flat-top kernels for constructing variance estimators whose optimal bandwidths are free of unknown parameters asymptotically and hence can be computed easily. We prove that the new estimator has an asymptotically constant risk and is locally asymptotically minimax.
Citations: 0
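To make the bandwidth dependence described in the abstract above concrete, here is a minimal sketch of a classical Bartlett-kernel estimator of the time-average variance constant. It is not the converging flat-top kernel estimator proposed in the paper; the function names and the AR(1) test series are assumptions made for illustration, and the bandwidth is simply supplied by hand.

```python
import random


def autocovariance(x, lag):
    """Biased sample autocovariance at the given lag (divides by n)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def tavc_bartlett(x, bandwidth):
    """gamma(0) + 2 * sum_{k=1}^{bandwidth} (1 - k / (bandwidth + 1)) * gamma(k)."""
    estimate = autocovariance(x, 0)
    for lag in range(1, bandwidth + 1):
        weight = 1.0 - lag / (bandwidth + 1.0)
        estimate += 2.0 * weight * autocovariance(x, lag)
    return estimate


# Hypothetical example: an AR(1) series with coefficient 0.5 and unit innovations.
random.seed(1)
series = [0.0]
for _ in range(499):
    series.append(0.5 * series[-1] + random.gauss(0.0, 1.0))

estimates = {b: tavc_bartlett(series, b) for b in (0, 5, 20)}
print(estimates)
```

For this AR(1) model the target value is 1 / (1 − 0.5)² = 4. The estimate changes with the chosen bandwidth, and a bandwidth of 0 reduces to the sample variance, which ignores the dependence entirely.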
A note on minimax robustness of designs against correlated or heteroscedastic responses
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2024-01-20 | DOI: 10.1093/biomet/asae001
D P Wiens
Summary We present a result according to which certain functions of covariance matrices are maximized at scalar multiples of the identity matrix. This is used to show that experimental designs that are optimal under an assumption of independent, homoscedastic responses can be minimax robust, in broad classes of alternate covariance structures. In particular it can justify the common practice of disregarding possible dependence, or heteroscedasticity, at the design stage of an experiment.
Citations: 0
Efficient nonparametric estimation of Toeplitz covariance matrices
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2024-01-17 | DOI: 10.1093/biomet/asae002
K Klockmann, T Krivobokova
A new efficient nonparametric estimator for Toeplitz covariance matrices is proposed. This estimator is based on a data transformation that translates the problem of Toeplitz covariance matrix estimation to the problem of mean estimation in an approximate Gaussian regression. The resulting Toeplitz covariance matrix estimator is positive definite by construction, fully data-driven and computationally very fast. Moreover, this estimator is shown to be minimax optimal under the spectral norm for a large class of Toeplitz matrices. These results are readily extended to estimation of inverses of Toeplitz covariance matrices. Also, an alternative version of the Whittle likelihood for the spectral density based on the discrete cosine transform is proposed. The method is implemented in the R package vstdct that accompanies the paper.
Citations: 0
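For readers unfamiliar with the Toeplitz structure mentioned in the abstract above, the sketch below builds a naive Toeplitz covariance estimate from the sample autocovariances of a single stationary series. This is only a baseline for illustration: it is not the transformation-based estimator of the paper, it does not use the `vstdct` R package, and the function names and data are hypothetical.

```python
def sample_autocovariance(x, lag):
    """Biased sample autocovariance at the given lag (divides by n)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def naive_toeplitz_covariance(x, dim):
    """dim x dim matrix whose (i, j) entry depends only on |i - j|."""
    gammas = [sample_autocovariance(x, lag) for lag in range(dim)]
    return [[gammas[abs(i - j)] for j in range(dim)] for i in range(dim)]


series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
sigma_hat = naive_toeplitz_covariance(series, dim=3)
for row in sigma_hat:
    print(row)
```

Every diagonal of the result is constant, which is the defining property of a Toeplitz matrix, so only `dim` numbers need to be estimated rather than a full covariance matrix.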
On Selecting and Conditioning in Multiple Testing and Selective Inference
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2023-12-22 | DOI: 10.1093/biomet/asad078
Jelle J Goeman, Aldo Solari
We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include basic data splitting, as well as modern data carving methods and post-selection inference methods for lasso coefficients based on the polyhedral lemma. In this paper, we adopt a holistic view on such methods, considering the selection, conditioning, and final error control steps together as a single method. From this perspective, we demonstrate that multiple testing methods defined directly on the full universe of hypotheses are always at least as powerful as selective inference methods based on selection and conditioning. This result holds true even when the universe is potentially infinite and only implicitly defined, such as in the case of data splitting. We give general theory and intuitions before investigating in detail several case studies where a shift to a non-selective or unconditional perspective can yield a power gain.
Citations: 0
Central limit theorems for local network statistics
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2023-12-22 | DOI: 10.1093/biomet/asad080
P A Maugis
Summary Subgraph counts, in particular the number of occurrences of small shapes such as triangles, characterize properties of random networks. As a result, they have seen wide use as network summary statistics. Subgraphs are typically counted globally, making existing approaches unable to describe vertex-specific characteristics. In contrast, rooted subgraphs focus on vertex neighbourhoods, and are fundamental descriptors of local network properties. We derive the asymptotic joint distribution of rooted subgraph counts in inhomogeneous random graphs, a model which generalizes most statistical network models. This result enables a shift in the statistical analysis of graphs, from estimating network summaries, to estimating models linking local network structure and vertex-specific covariates. As an example, we consider a school friendship network and show that gender and race are significant predictors of local friendship patterns.
Citations: 0
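The distinction the abstract above draws between global and rooted subgraph counts can be illustrated with triangles. The sketch below counts, for each vertex, the triangles that contain it; the function name `rooted_triangle_counts` and the small edge list are hypothetical, and the graph is assumed to be simple and undirected.

```python
from itertools import combinations


def rooted_triangle_counts(edges):
    """Map each vertex to the number of triangles that contain it."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    counts = {}
    for vertex, neighbours in adjacency.items():
        # A triangle rooted at `vertex` is a pair of its neighbours that are adjacent.
        counts[vertex] = sum(
            1 for a, b in combinations(neighbours, 2) if b in adjacency[a]
        )
    return counts


edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
local_counts = rooted_triangle_counts(edges)
global_count = sum(local_counts.values()) // 3  # each triangle is seen from 3 roots
print(local_counts, global_count)
```

Here the global count is a single number, whereas the rooted counts give one value per vertex (vertex 4 belongs to no triangle), which is what allows them to be related to vertex-specific covariates.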
The state of cumulative sum sequential change point testing seventy years after Page
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2023-12-21 | DOI: 10.1093/biomet/asad079
Alexander Aue, Claudia Kirch
Quality control charts aim at raising an alarm as soon as sequentially obtained observations of an underlying random process no longer seem to be within stochastic fluctuations prescribed by an ‘in-control’ scenario. Such random processes can often be modelled using the concept of stationarity, or even independence as in most classical works. An important out-of-control scenario is the changepoint alternative, for which the distribution of the process changes at an unknown point in time. In his seminal 1954 Biometrika paper, E. S. Page introduced the famous cumulative sum control charts for changepoint monitoring. Innovatively, decision rules based on cumulative sum procedures took the full history of the process into account, whereas previous procedures were based only on a fixed and typically small number of the most recent observations. The extreme case of using only the most recent observation, often referred to as the Shewhart chart, is more akin to serial outlier than changepoint detection. Page’s cumulative sum approach, introduced seven decades ago, is ubiquitous in modern changepoint analysis, and his original paper has led to a multitude of follow-up papers in different research communities. This review is focused on a particular subfield of this research, namely nonparametric sequential, or online, changepoint tests which are constructed to maintain a desired Type 1 error as opposed to the more traditional approach seeking to minimize the average run length of the procedures. Such tests have originated at the intersection of econometrics and statistics. We trace the development of these tests and highlight their properties, mostly using a simple location model for clarity of exposition, but also review more complex situations such as regression and time series models.
Citations: 0
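The cumulative sum idea described in the abstract above can be sketched with the classical one-sided recursion S_t = max(0, S_{t−1} + x_t − k), raising an alarm once S_t exceeds a threshold h. This is the textbook control chart, not one of the nonparametric sequential tests calibrated for Type 1 error that the review focuses on; the function name and the values of the reference `k` and threshold `h` are assumptions chosen for illustration.

```python
def page_cusum_alarm(observations, reference, threshold):
    """Return the 0-based index of the first alarm, or None if no alarm is raised."""
    statistic = 0.0
    for t, value in enumerate(observations):
        # Accumulate evidence of an upward shift; reset to zero when it turns negative.
        statistic = max(0.0, statistic + value - reference)
        if statistic > threshold:
            return t
    return None


in_control = [0.0] * 20   # process at its in-control level
shifted = [2.0] * 10      # level shift starting at index 20
alarm_at = page_cusum_alarm(in_control + shifted, reference=0.5, threshold=4.0)
print(alarm_at)  # 22
```

Because the statistic carries the accumulated history forward, the alarm at index 22 reflects three consecutive shifted observations rather than any single one, in contrast to a chart that looks only at the most recent observation.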
Correction to: ‘A cross-validation-based statistical theory for point processes’
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2023-12-20 | DOI: 10.1093/biomet/asad077
Citations: 0
Phylogenetic Association Analysis with Conditional Rank Correlation
IF 2.7 | Zone 2 (Mathematics) | Q2 BIOLOGY | Pub Date: 2023-12-01 | DOI: 10.1093/biomet/asad075
Shulei Wang, Bo Yuan, T Tony Cai, Hongzhe Li
Summary Phylogenetic association analysis plays a crucial role in investigating the correlation between microbial compositions and specific outcomes of interest in microbiome studies. However, existing methods for testing such associations have limitations related to the assumption of a linear association in high-dimensional settings and the handling of confounding effects. Therefore, there is a need for methods capable of characterizing complex associations, including nonmonotonic relationships. This paper introduces a novel phylogenetic association analysis framework and associated tests to address these challenges by employing conditional rank correlation as a measure of association. These tests account for confounders in a fully nonparametric manner, ensuring robustness against outliers and the ability to detect diverse dependencies. The proposed framework aggregates conditional rank correlations for subtrees using a weighted sum and maximum approach to capture both dense and sparse signals. The significance level of the test statistics is determined by calibrating through a nearest neighbour bootstrapping method, which is straightforward to implement and can accommodate additional datasets when available. The practical advantages of the proposed framework are demonstrated through numerical experiments utilizing both simulated and real microbiome datasets.
Citations: 1