Transcriptome data are insufficient to control false discoveries in regulatory network inference.

Cell systems, Pub Date: 2024-08-21, DOI: 10.1016/j.cels.2024.07.006

Eric Kernfeld, Rebecca Keener, Patrick Cahan, Alexis Battle

Cell systems, vol. 15, no. 8, pp. 709-724.e13. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11642480/pdf/

Citations: 0



Transcriptome data are insufficient to control false discoveries in regulatory network inference.

Inference of causal transcriptional regulatory networks (TRNs) from transcriptomic data suffers notoriously from false positives. Approaches to control the false discovery rate (FDR), for example, via permutation, bootstrapping, or multivariate Gaussian distributions, suffer from several complications: difficulty in distinguishing direct from indirect regulation, nonlinear effects, and causal structure inference requiring "causal sufficiency," meaning experiments that are free of any unmeasured, confounding variables. Here, we use a recently developed statistical framework, model-X knockoffs, to control the FDR while accounting for indirect effects, nonlinear dose-response, and user-provided covariates. We adjust the procedure to estimate the FDR correctly even when measured against incomplete gold standards. However, benchmarking against chromatin immunoprecipitation (ChIP) and other gold standards reveals higher observed than reported FDR. This indicates that unmeasured confounding is a major driver of FDR in TRN inference. A record of this paper's transparent peer review process is included in the supplemental information.
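The abstract names model-X knockoffs as the framework used to control the FDR. As a rough illustration of how a knockoff filter turns per-regulator statistics into a selected set, here is a minimal sketch of the generic "knockoff+" selection rule from the knockoff literature. It is not the authors' pipeline: the function names and the toy statistics are hypothetical, and it assumes the statistics `w` have already been computed (for example, as the difference in importance between each candidate regulator and its knockoff copy, so that large positive values favor the real variable).

```python
def knockoff_plus_threshold(w, q):
    """Return the smallest t > 0 whose estimated false discovery
    proportion is at most q, or infinity if no such t exists.

    w: knockoff statistics, one per candidate regulator (assumed given).
    q: target FDR level, e.g. 0.2.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Statistics at or below -t act as a proxy count of false positives.
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # The "+1" in the numerator is what makes this the knockoff+ variant.
        fdp_hat = (1 + negatives) / max(1, positives)
        if fdp_hat <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Indices of candidates whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Toy statistics, invented for illustration only.
    w = [5.0, 4.0, 3.5, 3.0, -0.5, 2.5, 0.2, -0.1, 2.0, 1.5]
    print(knockoff_select(w, q=0.2))
```

The guarantee behind this rule rests on the knockoffs being valid stand-ins for the real variables given everything that matters. The abstract's conclusion is that this is where things break down in practice: with unmeasured confounders, the observed FDR against gold standards exceeds the level such a procedure reports.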


Source journal

Cell systems

Self-citation rate

0.00%

Publication count