Deep Kronecker Network
Long Feng, Guang Yang. Biometrika, doi:10.1093/biomet/asad049 (published 2023-08-31).
Summary: We develop a novel framework, the Deep Kronecker Network, for the analysis of medical imaging data, including magnetic resonance imaging (MRI), functional MRI, computed tomography, and more. Medical imaging data differ from general images in two main respects: i) the sample size is often considerably smaller, and ii) model interpretation is usually more important than outcome prediction. As a result, standard methods such as convolutional neural networks cannot be applied directly. The Deep Kronecker Network adapts to the low-sample-size constraint and offers the desired model interpretation. The approach is versatile: it works for both matrix- and tensor-represented image data and applies to both discrete and continuous outcomes. It is built on a Kronecker product structure, which implicitly enforces a piecewise-smooth property on the coefficients. Moreover, the approach resembles a fully convolutional network, as the Kronecker structure can be expressed in convolutional form. Interestingly, it also has strong connections to the tensor regression framework of Zhou et al. (2013), which imposes a canonical low-rank structure on tensor coefficients. We conduct both classification and regression analyses on real MRI data from the Alzheimer's Disease Neuroimaging Initiative to demonstrate the effectiveness of the approach.
Kernel interpolation generalizes poorly
Yicheng Li, Haobo Zhang, Qian Lin. Biometrika, doi:10.1093/biomet/asad048 (published 2023-08-07).
Summary: One of the most interesting problems in the recent renaissance of kernel regression is whether kernel interpolation can generalize well, since an answer may help explain the 'benign overfitting' phenomenon reported in the literature on deep networks. In this paper we show, under mild conditions, that for any ε > 0 the generalization error of kernel interpolation is lower bounded by Ω(n^{−ε}). In other words, kernel interpolation generalizes poorly for a large class of kernels. As a direct corollary, overfitted wide neural networks defined on the sphere also generalize poorly.
τ-censored weighted Benjamini-Hochberg procedures under independence
Haibing Zhao, Huijuan Zhou. Biometrika, doi:10.1093/biomet/asad047 (published 2023-08-02).
Summary: In multiple hypothesis testing, auxiliary information can be leveraged to improve the efficiency of test procedures, most commonly by weighting p-values. However, when the weights are learned from the data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures guarantee false discovery rate control only asymptotically. Ignatiadis & Huber (2021) proposed a τ-censored weighted Benjamini-Hochberg procedure that controls the finite-sample false discovery rate, learning the weights by cross-weighting: the data are randomly split into several folds, and the weight for each p-value P_i is constructed from the p-values outside the fold containing P_i. Cross-weighting does not exploit the p-value information inside the fold and only balances the weights within each fold, which may cost power. In this article we introduce two methods for constructing data-driven weights for τ-censored weighted Benjamini-Hochberg procedures under independence; they give new insight into masking p-values to prevent overfitting in multiple testing. The first uses a leave-one-out technique, in which all but one of the p-values are used to learn the weight for the remaining one; the information in a p-value is masked from its own weight by taking the infimum of the weight with respect to that p-value. The second uses partial information from each p-value to construct the weights, and uses the conditional distributions of the null p-values to establish false discovery rate control. Additionally, we propose two estimators of the null proportion and show how to build null-proportion adaptivity into the proposed weights to improve power.
Online Inference with Debiased Stochastic Gradient Descent
Ruijian Han, Lan Luo, Yuanyuan Lin, Jian Huang. Biometrika, doi:10.1093/biomet/asad046 (published 2023-07-27).
Summary: We propose a debiased stochastic gradient descent algorithm for online statistical inference with high-dimensional data. The approach combines the debiasing technique developed in high-dimensional statistics with the stochastic gradient descent algorithm, and can be used to construct confidence intervals efficiently in an online fashion. The algorithm has several appealing features: as a one-pass algorithm it reduces time complexity, and because each update requires only the current observation together with the previous estimate, it also reduces space complexity. We establish the asymptotic normality of the proposed estimator under mild conditions on the sparsity level of the parameter and the data distribution. Numerical experiments demonstrate that the proposed algorithm attains the nominal coverage probability, and we illustrate the method with a high-dimensional text dataset.
An anomaly arising in the analysis of processes with more than one source of variability
H. Battey, P. McCullagh. Biometrika, doi:10.1093/biomet/asad044 (published 2023-07-18).
Summary: It is frequently observed in practice that the Wald statistic gives a poor assessment of the statistical significance of a variance component. This paper provides detailed analytic insight into the phenomenon by way of two simple models, which point to an atypical geometry as the source of the aberration. The latter can, in principle, be checked numerically in situations of arbitrary complexity, such as those arising from elaborate forms of blocking in an experimental context, or from models for longitudinal or clustered data. The salient point, echoing Dickey (2020), is that a suitable likelihood-ratio test should always be used to assess variance components.
A cross-validation-based statistical theory for point processes
O. Cronie, M. Moradi, C. Biscio. Biometrika, doi:10.1093/biomet/asad041 (published 2023-06-27).
Summary: Motivated by cross-validation's general ability to reduce overfitting and mean squared error, we develop a cross-validation-based statistical theory for general point processes. It rests on two concepts that are new for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process or point pattern into pairs of training and validation sets, while our prediction errors measure the discrepancy between two point processes. The resulting statistical approach, which may be used to model different distributional characteristics, exploits the prediction errors to measure how well a given model predicts the validation sets from the associated training sets. After indicating that the framework generalizes many existing statistical approaches, we establish its theoretical properties, including large-sample properties. We further observe that non-parametric intensity estimation is an instance of Papangelou conditional intensity estimation, which we exploit to apply the theory to kernel intensity estimation. Using independent thinning-based cross-validation, we show numerically that the new approach substantially outperforms the state of the art in bandwidth selection. Finally, we carry out intensity estimation for a forestry dataset (Euclidean domain) and a neurology dataset (linear network).
{"title":"Correction to: Ancestor regression in linear structural equation models","authors":"","doi":"10.1093/biomet/asad028","DOIUrl":"https://doi.org/10.1093/biomet/asad028","url":null,"abstract":"","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47990042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpolating discriminant functions in high-dimensional Gaussian latent mixtures
Xin Bing, Marten Wegkamp. Biometrika, doi:10.1093/biomet/asad037 (published 2023-06-08).
Summary: This paper considers binary classification of high-dimensional features under a postulated model with a low-dimensional latent Gaussian mixture structure and non-vanishing noise. A generalized least-squares estimator is used to estimate the direction of the optimal separating hyperplane, and the estimated hyperplane is shown to interpolate the training data. While the direction vector can be estimated consistently, as might be expected from recent results in linear regression, a naive plug-in estimate fails to estimate the intercept consistently. A simple correction, which requires an independent hold-out sample, renders the procedure minimax optimal in many scenarios. The interpolation property of the corrected procedure can be retained but, surprisingly, depends on how the labels are encoded.
Sample-constrained partial identification with application to selection bias
Matthew J Tudball, Rachael A Hughes, Kate Tilling, Jack Bowden, Qingyuan Zhao. Biometrika 110(2), 485-498, doi:10.1093/biomet/asac042 (published 2023-06-01).
Summary: Many partial identification problems can be characterized by the optimal value of a function over a set, where both the function and the set must be estimated from empirical data. Despite some progress on convex problems, statistical inference in this general setting remains to be developed. To address this, we derive an asymptotically valid confidence interval for the optimal value through an appropriate relaxation of the estimated set. We then apply this general result to the problem of selection bias in population-based cohort studies. We show that existing sensitivity analyses, which are often conservative and difficult to implement, can be formulated in our framework and made substantially more informative via auxiliary information on the population. We conduct a simulation study to evaluate the finite-sample performance of our inference procedure, and conclude with a substantive motivating example on the causal effect of education on income in the highly selected UK Biobank cohort. We demonstrate that our method can produce informative bounds using plausible population-level auxiliary constraints. We implement this method in the [Formula: see text] package [Formula: see text].
{"title":"Sample-constrained partial identification with application to selection bias.","authors":"Matthew J Tudball, Rachael A Hughes, Kate Tilling, Jack Bowden, Qingyuan Zhao","doi":"10.1093/biomet/asac042","DOIUrl":"https://doi.org/10.1093/biomet/asac042","url":null,"abstract":"<p><p>Many partial identification problems can be characterized by the optimal value of a function over a set where both the function and set need to be estimated by empirical data. Despite some progress for convex problems, statistical inference in this general setting remains to be developed. To address this, we derive an asymptotically valid confidence interval for the optimal value through an appropriate relaxation of the estimated set. We then apply this general result to the problem of selection bias in population-based cohort studies. We show that existing sensitivity analyses, which are often conservative and difficult to implement, can be formulated in our framework and made significantly more informative via auxiliary information on the population. We conduct a simulation study to evaluate the finite sample performance of our inference procedure, and conclude with a substantive motivating example on the causal effect of education on income in the highly selected UK Biobank cohort. We demonstrate that our method can produce informative bounds using plausible population-level auxiliary constraints. We implement this method in the [Formula: see text] package [Formula: see text].</p>","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"110 2","pages":"485-498"},"PeriodicalIF":2.7,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10183833/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9914105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian learning of network structures from interventional experimental data
F. Castelletti, S. Peluso. Biometrika, doi:10.1093/biomet/asad032 (published 2023-05-11).
Summary: Directed acyclic graphs (DAGs) provide an effective framework for learning causal relationships among variables from multivariate observations. With purely observational data, DAGs encoding the same conditional independencies cannot be distinguished, and are grouped into Markov equivalence classes. In many contexts, however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure that leads to a closed-form expression for the DAG marginal likelihood and guarantees score equivalence among DAGs that are Markov equivalent after intervention. In the Gaussian setting we show, in terms of posterior ratio consistency, that the true network is asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation, and implement a Markov chain Monte Carlo sampler for posterior inference over the space of DAGs on both synthetic and biological protein expression data.