Genomic and other studies often involve many features, and it is desirable to select the essential ones with high accuracy. In recent years, the use of Bayesian inference for variable (or feature) selection has advanced significantly. However, practical guidance on implementing these methods and assessing their relative performance remains limited. Our goal in this paper is to perform a comparative analysis of approaches, mainly from different Bayesian genres, that apply to genomic analysis. In particular, we examine how well shrinkage, global-local, and mixture priors, SuSiE, and a simple two-step procedure that we propose, RFSFS, perform in terms of several metrics: false discovery rate (FDR), false negative rate (FNR), F-score, and mean squared prediction error, under various simulation scenarios. No single method uniformly outperforms the others across all scenarios and across both variable selection and prediction metrics. We therefore order the methods by their average ranking across scenarios. We found that LASSO, the spike-and-slab prior with normal slab (SN), and RFSFS are the most competitive methods for FDR and F-score when features are uncorrelated. When features are correlated, SN, SuSiE, and RFSFS are the most competitive methods for FDR, whereas LASSO has an edge over SuSiE in terms of F-score. For illustration, we applied these methods to analyze The Cancer Genome Atlas Program (TCGA) renal cell carcinoma (RCC) data and offer methodological direction.
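To make the evaluation criteria concrete, the following is a minimal sketch of how the selection metrics, the prediction error, and the average ranking across scenarios could be computed. It is an illustration under stated assumptions, not the code used in the study: the function names (`selection_metrics`, `mspe`, `average_ranks`) are hypothetical, FNR is assumed here to be the fraction of truly relevant features that were missed, and ties in ranking are broken by order rather than averaged.

```python
def selection_metrics(true_support, selected):
    """FDR, FNR, and F-score for a selected feature set.

    Assumed definitions (the paper's exact conventions may differ):
      FDR = FP / (TP + FP), FNR = FN / (TP + FN), F = 2TP / (2TP + FP + FN).
    """
    true_support, selected = set(true_support), set(selected)
    tp = len(true_support & selected)   # relevant features that were selected
    fp = len(selected - true_support)   # irrelevant features that were selected
    fn = len(true_support - selected)   # relevant features that were missed
    fdr = fp / max(tp + fp, 1)          # defined as 0 when nothing is selected
    fnr = fn / max(tp + fn, 1)          # defined as 0 when no feature is relevant
    denom = 2 * tp + fp + fn
    f_score = 2 * tp / denom if denom else 0.0
    return {"FDR": fdr, "FNR": fnr, "F": f_score}


def mspe(y_true, y_pred):
    """Mean squared prediction error on held-out responses."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def average_ranks(scores_by_scenario):
    """Average rank of each method across scenarios (lower score = better).

    scores_by_scenario: list of dicts mapping method name -> score.
    Ties are broken by sort order, a simplification made for this sketch.
    """
    totals = {}
    for scores in scores_by_scenario:
        ordered = sorted(scores, key=scores.get)
        for rank, method in enumerate(ordered, start=1):
            totals[method] = totals.get(method, 0) + rank
    return {m: total / len(scores_by_scenario) for m, total in totals.items()}
```

For metrics where larger is better, such as the F-score, the scores would need to be negated before being passed to `average_ranks`.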
