Abstract Performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterise the fit of the model to the underlying conditional law of labels given the feature vector ($Y \mid X$), e.g. due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit for a general binary classifier. Our framework makes no parametric assumption on the conditional law $Y \mid X$ and treats it as a black-box oracle model that can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form $H_0: \mathbb{E}[D_f(\mathrm{Bern}(\eta(X)) \,\|\, \mathrm{Bern}(\hat{\eta}(X)))] \le \tau$, where $D_f$ represents an $f$-divergence function, and $\eta(x)$ and $\hat{\eta}(x)$ denote, respectively, the true and estimated likelihood that a feature vector $x$ admits a positive label. We propose a novel test, called Goodness-of-fit with Randomisation and Scoring Procedure (GRASP), for testing $H_0$, which works in finite-sample settings regardless of the distribution of the features (distribution-free). We also propose model-X GRASP, designed for model-X settings where the joint distribution of the feature vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
GRASP: a goodness-of-fit test for classification learning. Adel Javanmard, Mohammad Mehrabi. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-23. DOI: 10.1093/jrsssb/qkad106 (https://doi.org/10.1093/jrsssb/qkad106).
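To make the quantity bounded in the GRASP null hypothesis concrete, the following is a minimal sketch that averages a divergence between Bernoulli laws over a sample of feature vectors. It is not the GRASP test itself. It assumes the Kullback-Leibler divergence as the choice of f-divergence and assumes the true probabilities are known, which only holds in a simulation; the function names `kl_bernoulli` and `mean_divergence` are hypothetical.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """KL(Bern(p) || Bern(q)): the f-divergence with f(t) = t log t.

    q is clipped away from 0 and 1 (an assumption of this sketch) so the
    logarithms stay finite.
    """
    q = min(max(q, eps), 1.0 - eps)
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


def mean_divergence(eta_true, eta_hat):
    """Sample average of the divergence over paired probabilities.

    eta_true[i] stands for the true probability of a positive label at
    the i-th feature vector and eta_hat[i] for the classifier's estimate.
    """
    pairs = list(zip(eta_true, eta_hat))
    return sum(kl_bernoulli(p, q) for p, q in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical values for three feature vectors.
    print(mean_divergence([0.2, 0.5, 0.9], [0.25, 0.5, 0.8]))
```

In practice the true probabilities are unknown, which is why the abstract frames the problem as a hypothesis test on labelled samples and not as a direct computation like this one.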
Thomas Maullin-Sapey, Armin Schwartzman, Thomas E Nichols
Abstract The analysis of excursion sets in imaging data is essential to a wide range of scientific disciplines such as neuroimaging, climatology, and cosmology. Despite growing literature, there is little published concerning the comparison of processes that have been sampled across the same spatial region but which reflect different study conditions. Given a set of asymptotically Gaussian random fields, each corresponding to a sample acquired for a different study condition, this work aims to provide confidence statements about the intersection, or union, of the excursion sets across all fields. Such spatial regions are of natural interest as they directly correspond to the questions ‘Where do all random fields exceed a predetermined threshold?’, or ‘Where does at least one random field exceed a predetermined threshold?’. To assess the degree of spatial variability present, our method provides, with a desired confidence, subsets and supersets of spatial regions defined by logical conjunctions (i.e. set intersections) or disjunctions (i.e. set unions), without any assumption on the dependence between the different fields. The method is verified by extensive simulations and demonstrated using task-fMRI data to identify brain regions with activation common to four variants of a working memory task.
Spatial confidence regions for combinations of excursion sets in image analysis. Thomas Maullin-Sapey, Armin Schwartzman, Thomas E Nichols. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-21. DOI: 10.1093/jrsssb/qkad104 (https://doi.org/10.1093/jrsssb/qkad104).
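The conjunction and disjunction of excursion sets mentioned in the abstract above can be illustrated on a discretised domain. This is only a sketch of the set definitions, not of the confidence-region method; it assumes each field is a list of values on a common grid, uses the convention that a location belongs to the excursion set when its value is at least the threshold, and all function names are hypothetical.

```python
def excursion_set(field, threshold):
    """Indices of grid locations where the field is at least the threshold."""
    return {i for i, value in enumerate(field) if value >= threshold}


def conjunction(fields, threshold):
    """Locations where all fields reach the threshold (set intersection).

    Requires at least one field.
    """
    sets = [excursion_set(f, threshold) for f in fields]
    return set.intersection(*sets)


def disjunction(fields, threshold):
    """Locations where at least one field reaches the threshold (set union)."""
    sets = [excursion_set(f, threshold) for f in fields]
    return set().union(*sets)


if __name__ == "__main__":
    # Two hypothetical fields observed on the same four-point grid.
    fields = [[0.5, 2.0, 3.0, 1.2], [2.5, 2.1, 0.1, 1.9]]
    print(sorted(conjunction(fields, 1.0)))
    print(sorted(disjunction(fields, 1.0)))
```

The method described in the abstract goes further than these point estimates: it brackets the true intersection or union between a subset and a superset that hold with a desired confidence.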
Abstract We develop a novel, general framework for reduced-bias M-estimation from asymptotically unbiased estimating functions. The framework relies on an empirical approximation of the bias by a function of derivatives of estimating function contributions. Reduced-bias M-estimation operates either implicitly, solving empirically adjusted estimating equations, or explicitly, subtracting the estimated bias from the original M-estimates, and applies to partially or fully specified models with likelihoods or surrogate objectives. Automatic differentiation can abstract away the algebra required to implement reduced-bias M-estimation. As a result, the bias-reduction methods we introduce have broader applicability, more straightforward implementation, and less algebraic or computational effort than other established bias-reduction methods that require resampling or expectations of products of log-likelihood derivatives. If M-estimation is by maximising an objective, then there always exists a bias-reducing penalised objective. That penalised objective relates to information criteria for model selection and can be enhanced with plug-in penalties to deliver reduced-bias M-estimates with extra properties, like finiteness for categorical data models. Inferential procedures and model selection procedures for M-estimators apply unaltered with the reduced-bias M-estimates. We demonstrate and assess the properties of reduced-bias M-estimation in well-used, prominent modelling settings of varying complexity.
Empirical bias-reducing adjustments to estimating functions. Ioannis Kosmidis, Nicola Lunardon. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-16. DOI: 10.1093/jrsssb/qkad083 (https://doi.org/10.1093/jrsssb/qkad083).
Vishesh Karwa, Debdeep Pati, Sonja Petrović, Liam Solus, Nikita Alexeev, Mateja Raič, Dane Wilburne, Robert Williams, Bowei Yan
Abstract We construct Bayesian and frequentist finite-sample goodness-of-fit tests for three different variants of the stochastic blockmodel for network data. Since all of the stochastic blockmodel variants are log-linear in form when block assignments are known, the tests for the latent block model versions combine a block membership estimator with the algebraic statistics machinery for testing goodness-of-fit in log-linear models. We describe Markov bases and marginal polytopes of the variants of the stochastic blockmodel and discuss how both facilitate the development of goodness-of-fit tests and understanding of model behaviour. The general testing methodology developed here extends to any finite mixture of log-linear models on discrete data, and as such is the first application of the algebraic statistics machinery for latent-variable models.
Monte Carlo goodness-of-fit tests for degree corrected and related stochastic blockmodels. Vishesh Karwa, Debdeep Pati, Sonja Petrović, Liam Solus, Nikita Alexeev, Mateja Raič, Dane Wilburne, Robert Williams, Bowei Yan. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-15. DOI: 10.1093/jrsssb/qkad084 (https://doi.org/10.1093/jrsssb/qkad084).
Abstract Bayesian modelling helps applied researchers to articulate assumptions about their data and develop models tailored for specific applications. Thanks to good methods for approximate posterior inference, researchers can now easily build, use, and revise complicated Bayesian models for large and rich data. These capabilities, however, bring into focus the problem of model criticism. Researchers need tools to diagnose the fitness of their models, to understand where they fall short, and to guide their revision. In this paper, we develop a new method for Bayesian model criticism, the holdout predictive check (HPC). Holdout predictive checks are built on posterior predictive checks (PPCs), a seminal method that checks a model by assessing the posterior predictive distribution on the observed data. However, PPCs use the data twice, both to calculate the posterior predictive and to evaluate it, which can lead to uncalibrated p-values. Holdout predictive checks, in contrast, compare the posterior predictive distribution to a draw from the population distribution, a heldout dataset. This method blends Bayesian modelling with frequentist assessment. We prove that, unlike the PPC, the HPC is properly calibrated. Empirically, we study HPCs on classical regression, a hierarchical model of text data, and factor analysis.
Holdout predictive checks for Bayesian model criticism. Gemma E Moran, David M Blei, Rajesh Ranganath. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-15. DOI: 10.1093/jrsssb/qkad105 (https://doi.org/10.1093/jrsssb/qkad105).
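The idea of a holdout predictive check described in the abstract above can be sketched on a toy model. This is an illustration under stated assumptions and not the authors' implementation: it assumes a conjugate normal model with known noise variance and a zero-mean normal prior on the mean, estimates the p-value by Monte Carlo, and uses the hypothetical function name `holdout_predictive_pvalue`.

```python
import random


def holdout_predictive_pvalue(train, holdout, discrepancy, n_rep=2000,
                              prior_var=100.0, noise_var=1.0, seed=0):
    """Monte Carlo p-value comparing held-out data with replicated data.

    Assumed model (for this sketch only): y ~ N(mu, noise_var) with
    prior mu ~ N(0, prior_var). The posterior uses the training data
    only; the discrepancy is evaluated on the holdout data.
    """
    rng = random.Random(seed)
    n = len(train)
    # Conjugate normal posterior for mu given the training data.
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * sum(train) / noise_var
    observed = discrepancy(holdout)
    exceed = 0
    for _ in range(n_rep):
        # Draw a replicated dataset of holdout size from the posterior predictive.
        mu = rng.gauss(post_mean, post_var ** 0.5)
        rep = [rng.gauss(mu, noise_var ** 0.5) for _ in holdout]
        if discrepancy(rep) >= observed:
            exceed += 1
    return exceed / n_rep


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    train, holdout = data[:150], data[150:]
    print(holdout_predictive_pvalue(train, holdout, max))
```

The key design point is that the data used to form the posterior and the data used to evaluate the discrepancy are disjoint, which is the feature the abstract contrasts with the posterior predictive check.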
David Huk, Lorenzo Pacchiardi, Ritabrata Dutta, Mark Steel
David Huk, Lorenzo Pacchiardi, Ritabrata Dutta and Mark Steel’s contribution to the Discussion of “Martingale Posterior Distributions” by Fong, Holmes and Walker. David Huk, Lorenzo Pacchiardi, Ritabrata Dutta, Mark Steel. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-14. DOI: 10.1093/jrsssb/qkad094 (https://doi.org/10.1093/jrsssb/qkad094).
Correction to: Semi-supervised approaches to efficient evaluation of model prediction performance. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-14. DOI: 10.1093/jrsssb/qkad107 (https://doi.org/10.1093/jrsssb/qkad107).
Abstract We propose a very fast approximate Markov chain Monte Carlo sampling framework that is applicable to a large class of sparse Bayesian inference problems. The computational cost per iteration in several regression models is of order O(n(s+J)), where n is the sample size, s is the underlying sparsity of the model, and J is the size of a randomly selected subset of regressors. This cost can be further reduced by data sub-sampling when stochastic gradient Langevin dynamics are employed. The algorithm is an extension of the asynchronous Gibbs sampler of Johnson et al. [(2013). Analyzing Hogwild parallel Gaussian Gibbs sampling. In Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS’13) (Vol. 2, pp. 2715–2723)], but can be viewed from a statistical perspective as a form of Bayesian iterated sure independence screening [Fan, J., Samworth, R., & Wu, Y. (2009). Ultrahigh dimensional feature selection: Beyond the linear model. Journal of Machine Learning Research, 10, 2013–2038]. We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal with high probability under some statistical assumptions. Furthermore, we show that its mixing time is at most linear in the number of regressors. We illustrate the algorithm with several models.
A fast asynchronous Markov chain Monte Carlo sampler for sparse Bayesian inference. Yves Atchadé, Liwei Wang. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-13. DOI: 10.1093/jrsssb/qkad078 (https://doi.org/10.1093/jrsssb/qkad078).
Abstract The advent of data science has provided an increasing number of challenges with high data complexity. This paper addresses the challenge of space-time data where the spatial domain is not a planar surface, a sphere, or a linear network, but a generalised network (termed a graph with Euclidean edges). Additionally, data are repeatedly measured over different temporal instants. We provide new classes of stationary nonseparable space-time covariance functions where space can be a generalised network, a Euclidean tree, or a linear network, and where time can be linear or circular (seasonal). Because the construction principles are technical, we focus on illustrations that guide the reader through the construction of statistically interpretable examples. A simulation study demonstrates that the correct model can be recovered when compared to misspecified models. In addition, our simulation studies show that we effectively recover simulation parameters. In our data analysis, we consider a traffic accident dataset that shows improved model performance based on covariance specifications and network-based metrics.
Stationary nonseparable space-time covariance functions on networks. Emilio Porcu, Philip A White, Marc G Genton. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-08. DOI: 10.1093/jrsssb/qkad082 (https://doi.org/10.1093/jrsssb/qkad082).
Abstract Model-X knockoffs is a flexible wrapper method for high-dimensional regression algorithms, which provides guaranteed control of the false discovery rate (FDR). Due to the randomness inherent to the method, different runs of model-X knockoffs on the same dataset often result in different sets of selected variables, which is undesirable in practice. In this article, we introduce a methodology for derandomising model-X knockoffs with provable FDR control. The key insight of our proposed method lies in the discovery that the knockoffs procedure is in essence an e-BH procedure. We make use of this connection and derandomise model-X knockoffs by aggregating the e-values resulting from multiple knockoff realisations. We prove that the derandomised procedure controls the FDR at the desired level, without any additional conditions (in contrast, previously proposed methods for derandomisation are not able to guarantee FDR control). The proposed method is evaluated with numerical experiments, where we find that the derandomised procedure achieves comparable power and dramatically decreased selection variability when compared with model-X knockoffs.
Derandomised knockoffs: leveraging e-values for false discovery rate control. Zhimei Ren, Rina Foygel Barber. Journal of the Royal Statistical Society Series B-Statistical Methodology, published 2023-09-07. DOI: 10.1093/jrsssb/qkad085 (https://doi.org/10.1093/jrsssb/qkad085).
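The aggregation step described in the derandomised knockoffs abstract above can be sketched as averaging e-values across runs and then applying the e-BH procedure. This is a minimal sketch under assumptions: the e-values are taken as given inputs (how they are built from knockoff statistics is not shown here), a plain arithmetic mean is assumed as the aggregation rule, and the function names are hypothetical.

```python
def average_e_values(e_runs):
    """Average e-values coordinate-wise across runs.

    e_runs is a list of equal-length lists, one list per run. The
    arithmetic mean of e-values is again an e-value.
    """
    n_runs = len(e_runs)
    return [sum(column) / n_runs for column in zip(*e_runs)]


def e_bh(e_values, alpha):
    """e-BH procedure at level alpha.

    With m hypotheses and e-values sorted in decreasing order, let k* be
    the largest k such that the k-th largest e-value is at least
    m / (alpha * k); reject the hypotheses with the k* largest e-values.
    Returns the sorted indices of the rejected hypotheses.
    """
    m = len(e_values)
    order = sorted(range(m), key=lambda j: e_values[j], reverse=True)
    k_star = 0
    for k, j in enumerate(order, start=1):
        if e_values[j] >= m / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical e-values for four hypotheses from two runs.
    runs = [[40.0, 0.0, 30.0, 1.0], [60.0, 2.0, 10.0, 1.0]]
    averaged = average_e_values(runs)
    print(averaged)
    print(e_bh(averaged, alpha=0.2))
```

Because the selection is computed from the averaged e-values, it no longer depends on a single random knockoff draw, which is the source of the reduced selection variability the abstract reports.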