Biometrika: Latest Articles

An eigenvector-assisted estimation framework for signal-plus-noise matrix models
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-19 DOI: 10.1093/biomet/asad058
Fangzheng Xie, Dingbo Wu
Summary: In this paper, we develop an eigenvector-assisted estimation framework for a collection of signal-plus-noise matrix models arising in high-dimensional statistics and many applications. The framework is built upon a novel asymptotically unbiased estimating equation using the leading eigenvectors of the data matrix. However, the estimator obtained by directly solving the estimating equation could be numerically unstable in practice and lacks robustness against model misspecification. We propose to use the quasi-posterior distribution obtained by exponentiating a criterion function whose maximizer coincides with the estimating equation estimator. The proposed framework can incorporate heteroskedastic variance information, does not require the complete specification of the sampling distribution, and is robust to potential misspecification of the distribution of the noise matrix. Computationally, the quasi-posterior distribution can be obtained via a Markov chain Monte Carlo sampler, which exhibits better numerical stability than some of the existing optimization-based estimators and makes uncertainty quantification straightforward. Under mild regularity conditions, we establish the large sample properties of the quasi-posterior distributions. In particular, the quasi-posterior credible sets have the correct frequentist nominal coverage probability provided that the criterion function is carefully selected. The validity and usefulness of the proposed framework are demonstrated through the analysis of synthetic datasets and the real-world ENZYMES network datasets.
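The idea of sampling from a quasi-posterior, that is, a density proportional to the exponential of a criterion function, can be sketched with a random-walk Metropolis sampler. The example below is only an illustration under stated assumptions: the function name `metropolis_quasi_posterior`, the flat prior, and the toy quadratic criterion are hypothetical and are not the criterion or sampler used in the paper.

```python
import math
import random


def metropolis_quasi_posterior(criterion, theta0, n_steps=5000, step=0.5, seed=0):
    """Random-walk Metropolis targeting exp(criterion(theta)) under a flat prior.

    `criterion` is any real-valued function of a scalar parameter
    (a hypothetical stand-in for the paper's criterion function).
    """
    rng = random.Random(seed)
    theta, value = theta0, criterion(theta0)
    draws = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        proposal_value = criterion(proposal)
        # Accept with probability min(1, exp(proposal_value - value)).
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_value - value:
            theta, value = proposal, proposal_value
        draws.append(theta)
    return draws


# Toy criterion (an assumption for illustration): a quadratic maximized at the sample mean.
data = [1.8, 2.1, 2.4, 1.9, 2.3, 2.0, 2.2, 1.7, 2.5, 2.1]
xbar = sum(data) / len(data)
draws = metropolis_quasi_posterior(lambda t: -0.5 * len(data) * (t - xbar) ** 2, theta0=xbar)
quasi_posterior_mean = sum(draws) / len(draws)
```

The draws can then be summarized directly, for example by taking empirical quantiles as a credible interval, which is what makes a sampler convenient for uncertainty quantification.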
Citations: 0
E-values as unnormalized weights in multiple testing
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-15 DOI: 10.1093/biomet/asad057
Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas
Summary: We study how to combine p-values and e-values, and design multiple testing procedures where both p-values and e-values are available for every hypothesis. Our results provide a new perspective on multiple testing with data-driven weights: while standard weighted multiple testing methods require the weights to deterministically add up to the number of hypotheses being tested, we show that this normalization is not required when the weights are e-values that are independent of the p-values. Such e-values can be obtained in meta-analysis where a primary dataset is used to compute p-values, and an independent secondary dataset is used to compute e-values. Going beyond meta-analysis, we showcase settings wherein independent e-values and p-values can be constructed on a single dataset. Our procedures can result in a substantial increase in power, especially if the non-null hypotheses have e-values much larger than one.
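One way to read "e-values as unnormalized weights" is as a Benjamini-Hochberg step-up applied to the ratios p_i / e_i, with no requirement that the e-values sum to the number of hypotheses. The sketch below illustrates that reading only; the function name `e_weighted_bh` and the toy numbers are hypothetical, and the code is not presented as the authors' exact procedure.

```python
def e_weighted_bh(p_values, e_values, alpha=0.05):
    """Benjamini-Hochberg step-up on weighted p-values p_i / e_i.

    Returns the sorted indices of the rejected hypotheses.
    Assumes each e-value is non-negative and independent of the p-values.
    """
    if len(p_values) != len(e_values):
        raise ValueError("need one e-value per p-value")
    m = len(p_values)
    # An e-value of 0 acts as weight 0: that hypothesis can never be rejected.
    weighted = [p / e if e > 0 else float("inf") for p, e in zip(p_values, e_values)]
    order = sorted(range(m), key=lambda i: weighted[i])
    # Find the largest k whose k-th smallest weighted p-value is <= alpha * k / m.
    k_star = 0
    for rank, i in enumerate(order, start=1):
        if weighted[i] <= alpha * rank / m:
            k_star = rank
    return sorted(order[:k_star])


# Toy inputs (made up for illustration).
p = [0.001, 0.02, 0.04, 0.30]
e = [1.0, 4.0, 0.5, 10.0]
rejected_weighted = e_weighted_bh(p, e, alpha=0.05)
rejected_plain = e_weighted_bh(p, [1.0] * len(p), alpha=0.05)  # ordinary BH
```

In this toy input the fourth hypothesis has an unremarkable p-value but a large e-value, so weighting lets it be rejected, while the third hypothesis is down-weighted by an e-value below one.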
Citations: 14
Retrospective causal inference with multiple effect variables
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-14 DOI: 10.1093/biomet/asad056
Wei Li, Zitong Lu, Jinzhu Jia, Min Xie, Zhi Geng
Summary: As highlighted in Dawid (2000) and Pearl & Mackenzie (2018), deducing the causes of given effects is a more challenging problem than evaluating the effects of causes in causal inference. Lu et al. (2023) proposed an approach for deducing causes of a single effect variable based on posterior causal effects. In many applications, there are multiple effect variables, and thus they can be used simultaneously to more accurately deduce the causes. To retrospectively deduce causes from multiple effects, we propose multivariate posterior total, intervention and direct causal effects conditional on the observed evidence. We describe the assumptions of no-confounding and monotonicity, under which we prove identifiability of the multivariate posterior causal effects and provide their identification equations. The proposed approach can be applied to causal attribution, medical diagnosis, and the assignment of blame and responsibility in various studies with multiple effect or outcome variables. Two examples are used to illustrate the proposed approach.
Citations: 0
Estimation of prediction error in time series
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-09 DOI: 10.1093/biomet/asad053
Alexander Aue, Prabir Burman
Summary: The accurate estimation of prediction errors in time series is an important problem, which has immediate implications for the accuracy of prediction intervals as well as the quality of a number of widely used time series model selection criteria such as the Akaike information criterion. Except for simple cases, however, it is difficult or even impossible to obtain exact analytical expressions for one-step and multi-step predictions. This may be one of the reasons that, unlike in the independent case (see Efron, 2004), up to now there has been no fully established methodology for time series prediction error estimation. Starting from an approximation to the bias-variance decomposition of the squared prediction error, this article develops a method for accurate estimation of prediction errors in both univariate and multivariate stationary time series. In particular, several estimates are derived for a general class of predictors that includes most of the popular linear, nonlinear, parametric and nonparametric time series models used in practice, with causal invertible autoregressive moving average and nonparametric autoregressive processes discussed as lead examples. Simulations demonstrate that the proposed estimators perform quite well in finite samples. The estimates may also be used for model selection when the purpose of modelling is prediction.
Citations: 0
More Efficient Exact Group Invariance Testing: using a Representative Subgroup
IF 2.7 Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-01 DOI: 10.1093/biomet/asad050
N. W. Koning, J. Hemerik
We consider testing invariance of a distribution under an algebraic group of transformations, such as permutations or sign-flips. As such groups are typically huge, tests based on the full group are often computationally infeasible. Hence, it is standard practice to use a random subset of transformations. We improve upon this by replacing the random subset with a strategically chosen, fixed subgroup of transformations. In a generalized location model, we show that the resulting tests are often consistent for lower signal-to-noise ratios. Moreover, we establish an analogy between the power improvement and switching from a t-test to a Z-test under normality. Importantly, in permutation-based multiple testing, the efficiency gain with our approach can be huge, since we attain the same power with far fewer permutations.
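To make the "fixed subgroup instead of a random subset" idea concrete, the sketch below runs a sign-flip test over a small subgroup of the full sign-flip group. The subgroup used here, the rows of a Sylvester-Hadamard matrix (closed under elementwise multiplication), is an illustrative choice made for this example and is not claimed to be the subgroup the authors recommend; the function names and toy data are likewise hypothetical.

```python
def walsh_sign_flips(n):
    """Rows of the Sylvester-Hadamard matrix of order n (n must be a power of two).

    Entry (i, j) is (-1) ** popcount(i & j). The product of rows a and b is
    row a XOR b, so the n rows form a subgroup of the 2**n sign-flip vectors.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return [[-1 if bin(i & j).count("1") % 2 else 1 for j in range(n)] for i in range(n)]


def subgroup_sign_flip_pvalue(x):
    """One-sided test of symmetry about zero using only the Walsh subgroup."""
    flips = walsh_sign_flips(len(x))
    stats = [sum(s * v for s, v in zip(g, x)) for g in flips]
    observed = stats[0]  # row 0 is the identity (all +1)
    # Because the flips form a group containing the identity, this p-value is exact.
    return sum(1 for t in stats if t >= observed) / len(flips)


# Toy data (made up): eight clearly positive observations.
x = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
p_value = subgroup_sign_flip_pvalue(x)
```

Only 8 of the 2**8 = 256 possible sign-flip vectors are evaluated here, yet the test remains exact because the set of transformations is a group rather than an arbitrary subset.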
Citations: 0
A generalized Bayes framework for probabilistic clustering.
IF 2.4 Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-01 Epub Date: 2023-01-19 DOI: 10.1093/biomet/asad004
Tommaso Rigon, Amy H Herring, David B Dunson

Loss-based clustering methods, such as k-means clustering and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative approach, but such methods face computational problems and are highly sensitive to the choice of kernel. In this article we propose a generalized Bayes framework that bridges between these paradigms through the use of Gibbs posteriors. In conducting Bayesian updating, the loglikelihood is replaced by a loss function for clustering, leading to a rich family of clustering methods. The Gibbs posterior represents a coherent updating of Bayesian beliefs without needing to specify a likelihood for the data, and can be used for characterizing uncertainty in clustering. We consider losses based on Bregman divergence and pairwise similarities, and develop efficient deterministic algorithms for point estimation along with sampling algorithms for uncertainty quantification. Several existing clustering algorithms, including k-means, can be interpreted as generalized Bayes estimators in our framework, and thus we provide a method of uncertainty quantification for these approaches, allowing, for example, calculation of the probability that a data point is well clustered.
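The replacement of the log-likelihood by a clustering loss can be illustrated on a tiny dataset where every labelling can be enumerated. The sketch below assumes a Gibbs posterior proportional to exp(-lam * loss) with the k-means within-cluster sum of squares as the loss and a uniform prior over labellings; the function names, the tuning value `lam`, and the data are hypothetical, and exhaustive enumeration stands in for the sampling algorithms mentioned in the abstract.

```python
import itertools
import math


def kmeans_loss(points, labels, k):
    """Within-cluster sum of squares for one-dimensional points."""
    total = 0.0
    for c in range(k):
        members = [x for x, lab in zip(points, labels) if lab == c]
        if members:  # empty clusters contribute nothing
            center = sum(members) / len(members)
            total += sum((x - center) ** 2 for x in members)
    return total


def gibbs_coclustering(points, k=2, lam=1.0):
    """Probability, under the Gibbs posterior, that each pair of points shares a cluster.

    Enumerates all k**n labellings, so it is only usable for very small n.
    """
    n = len(points)
    labelings = list(itertools.product(range(k), repeat=n))
    weights = [math.exp(-lam * kmeans_loss(points, lab, k)) for lab in labelings]
    z = sum(weights)
    co = [[0.0] * n for _ in range(n)]
    for lab, w in zip(labelings, weights):
        for i in range(n):
            for j in range(n):
                if lab[i] == lab[j]:
                    co[i][j] += w / z
    return co


# Toy data (made up): two well-separated pairs of points.
points = [0.0, 0.2, 5.0, 5.3]
co = gibbs_coclustering(points, k=2, lam=2.0)
```

The matrix `co` is one simple form of uncertainty quantification: entries near one or zero indicate pairs whose co-clustering is essentially certain, and smaller values of `lam` flatten these probabilities.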

Biometrika, pp. 559-578. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11840691/pdf/
Citations: 0
Marginal proportional hazards models for multivariate interval-censored data.
IF 2.4 Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-01 Epub Date: 2022-11-02 DOI: 10.1093/biomet/asac059
Yangjianchen Xu, Donglin Zeng, D Y Lin

Multivariate interval-censored data arise when there are multiple types of events or clusters of study subjects, such that the event times are potentially correlated, and when each event is only known to occur within a particular time interval. We formulate the effects of potentially time-varying covariates on the multivariate event times through marginal proportional hazards models while leaving the dependence structures of the related event times unspecified. We construct the nonparametric pseudolikelihood under the working assumption that all event times are independent, and we provide a simple and stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimators for the regression parameters are shown to be consistent and asymptotically normal, with a limiting covariance matrix that can be consistently estimated by a sandwich estimator under arbitrary dependence structures for the related event times. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to data from the Atherosclerosis Risk in Communities Study.

Biometrika 110(3), pp. 815-830. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10434824/pdf/nihms-1874830.pdf
Citations: 0
Assessing time-varying causal effect moderation in the presence of cluster-level treatment effect heterogeneity and interference.
IF 2.7 Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-09-01 DOI: 10.1093/biomet/asac065
Jieru Shi, Zhenke Wu, Walter Dempsey

The micro-randomized trial (MRT) is a sequential randomized experimental design to empirically evaluate the effectiveness of mobile health (mHealth) intervention components that may be delivered at hundreds or thousands of decision points. MRTs have motivated a new class of causal estimands, termed "causal excursion effects", for which semiparametric inference can be conducted via a weighted, centered least squares criterion (Boruvka et al., 2018). Existing methods assume between-subject independence and non-interference. Deviations from these assumptions often occur. In this paper, causal excursion effects are revisited under potential cluster-level treatment effect heterogeneity and interference, where the treatment effect of interest may depend on cluster-level moderators. Utility of the proposed methods is shown by analyzing data from a multi-institution cohort of first year medical residents in the United States.

Biometrika 110(3), pp. 645-662. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501736/pdf/nihms-1882489.pdf
Citations: 2
Deep Kronecker Network
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-08-31 DOI: 10.1093/biomet/asad049
Long Feng, Guang Yang
Summary: We develop a novel framework named Deep Kronecker Network for the analysis of medical imaging data, including magnetic resonance imaging (MRI), functional MRI, computed tomography, and more. Medical imaging data differs from general images in two main aspects: i) the sample size is often considerably smaller, and ii) the interpretation of the model is usually more crucial than predicting the outcome. As a result, standard methods such as convolutional neural networks cannot be directly applied to medical imaging analysis. Therefore, we propose the Deep Kronecker Network, which can adapt to the low sample size constraint and offer the desired model interpretation. Our approach is versatile, as it works for both matrix- and tensor-represented image data and can be applied to discrete and continuous outcomes. The Deep Kronecker Network is built upon a Kronecker product structure, which implicitly enforces a piecewise smooth property on coefficients. Moreover, our approach resembles a fully convolutional network as the Kronecker structure can be expressed in a convolutional form. Interestingly, our approach also has strong connections to the tensor regression framework proposed by Zhou et al. (2013), which imposes a canonical low-rank structure on tensor coefficients. We conduct both classification and regression analyses using real MRI data from the Alzheimer’s Disease Neuroimaging Initiative to demonstrate the effectiveness of our approach.
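The link between a Kronecker product structure and piecewise-smooth coefficients can be seen in a very small example. The sketch below builds a coefficient matrix as a single Kronecker product of two hypothetical factor matrices and uses it in a linear predictor; the factors, the helper names `kron` and `linear_predictor`, and the toy image are assumptions for illustration, and the model in the paper is more general than this single product.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Block (i, j) of the result equals a[i][j] * b.
    """
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def linear_predictor(coefficients, image):
    """Inner product <C, X> between a coefficient matrix and an image."""
    return sum(c * x for c_row, x_row in zip(coefficients, image) for c, x in zip(c_row, x_row))


# Hypothetical factors: a coarse 2x2 layout and a constant 2x2 patch.
coarse = [[1, 0], [0, 2]]
patch = [[1, 1], [1, 1]]
coefficients = kron(coarse, patch)  # 4x4 matrix with constant 2x2 blocks

# Toy 4x4 "image" of ones, so the predictor just sums the coefficients.
image = [[1] * 4 for _ in range(4)]
score = linear_predictor(coefficients, image)
```

The 4x4 coefficient matrix is described by only 8 numbers (two 2x2 factors) and is constant within each 2x2 block, which is the kind of parameter saving and blockwise regularity that suits small-sample imaging problems.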
Citations: 0
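The Kronecker product structure mentioned in the Deep Kronecker Network abstract can be illustrated with a small NumPy sketch. This is not the authors' model or code: the factor sizes, the names `coarse` and `fine`, and the helper `kronecker_coefficients` are hypothetical, chosen only to show how a product of small factors yields a block-structured coefficient matrix with far fewer free parameters than a dense one.

```python
import numpy as np

def kronecker_coefficients(factors):
    """Combine a list of small factor matrices with the Kronecker product."""
    coef = np.ones((1, 1))
    for factor in factors:
        coef = np.kron(coef, factor)
    return coef

rng = np.random.default_rng(0)
coarse = rng.normal(size=(2, 2))  # hypothetical coarse-scale factor
fine = np.ones((3, 3))            # constant fine-scale factor, so each block is flat

# The 6 x 6 coefficient matrix is made of 3 x 3 blocks, one per entry of `coarse`.
coef = kronecker_coefficients([coarse, fine])

# Hypothetical 6 x 6 "image" and the resulting linear predictor <coef, image>.
image = rng.normal(size=coef.shape)
linear_predictor = float(np.sum(coef * image))

n_factor_params = coarse.size + fine.size  # parameters stored in the factors
n_dense_params = coef.size                 # parameters in an unstructured matrix
print(coef.shape, n_factor_params, n_dense_params)
```

With a constant `fine` factor every block of `coef` is flat, which is the simplest case of the block-wise regularity the abstract describes; a non-constant `fine` would repeat the same within-block pattern scaled by each entry of `coarse`.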
Kernel interpolation generalizes poorly
Zone 2 (Mathematics) Q2 BIOLOGY Pub Date: 2023-08-07 DOI: 10.1093/biomet/asad048
Yicheng Li, Haobo Zhang, Qian Lin
Summary One of the most interesting problems in the recent renaissance of the studies in kernel regression might be whether kernel interpolation can generalize well, since it may help us understand the ‘benign overfitting phenomenon’ reported in the literature on deep networks. In this paper, under mild conditions, we show that, for any ε > 0, the generalization error of kernel interpolation is lower bounded by Ω(n^{−ε}). In other words, the kernel interpolation generalizes poorly for a large class of kernels. As a direct corollary, we can show that overfitted wide neural networks defined on the sphere generalize poorly.
Citations: 5
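For readers unfamiliar with the term, kernel interpolation in the abstract above refers to fitting the training data exactly with a kernel method and no ridge penalty. The sketch below is a toy illustration under assumptions that do not come from the paper: a one-dimensional Laplace kernel, a sine target, Gaussian noise, and the hypothetical helpers `laplace_kernel` and `kernel_interpolant`. It shows only what "interpolating noisy labels" means and does not verify the Ω(n^{−ε}) bound.

```python
import numpy as np

def laplace_kernel(a, b, bandwidth=0.2):
    """Laplace kernel k(a, b) = exp(-|a - b| / bandwidth) for 1-D inputs."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / bandwidth)

def kernel_interpolant(x_train, y_train, bandwidth=0.2):
    """Return the minimum-norm kernel function that fits the training data exactly."""
    gram = laplace_kernel(x_train, x_train, bandwidth)
    alpha = np.linalg.solve(gram, y_train)  # no ridge term: pure interpolation
    return lambda x_new: laplace_kernel(x_new, x_train, bandwidth) @ alpha

def truth(x):
    """Assumed noiseless target function for the toy example."""
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 30)
y_train = truth(x_train) + rng.normal(scale=0.5, size=x_train.size)

f_hat = kernel_interpolant(x_train, y_train)
x_test = np.linspace(0.0, 1.0, 500)

# Training error is numerically zero; test error is measured against the noiseless target.
train_error = np.mean((f_hat(x_train) - y_train) ** 2)
test_error = np.mean((f_hat(x_test) - truth(x_test)) ** 2)
print(train_error, test_error)
```

Because the fitted function passes through every noisy label, its error against the noiseless target stays well above its training error in this toy setting, which is the kind of behaviour the abstract's lower bound formalizes for a large class of kernels.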