Exact model-free function inference using uniform marginal counts for null population.

Yiyi Li, Mingzhou Song
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-03-29
Publication type: Journal Article
DOI: 10.1093/bioinformatics/btaf121 (https://doi.org/10.1093/bioinformatics/btaf121)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11972114/pdf/

Abstract

Motivation: Recognizing cause-effect relationships is a fundamental inquiry in science. However, current causal inference methods often focus on directionality but not statistical significance. One consequence is that chance patterns with uneven marginal distributions can achieve a perfect directionality score.

Results: To overcome such issues, we design the uniform exact function test with continuity correction (UEFTC) to detect functional dependency between two discrete random variables. The null hypothesis is that the two variables are statistically independent. Unlike related tests whose null populations use the observed marginals, we define the null population by an embedded uniform square. We also present a fast algorithm to carry out the test. On datasets with ground truth, the UEFTC exhibits accurate directionality, low bias, and robust statistical behavior relative to alternatives. We found a nonmonotonic response of gene TCB2 to beta-estradiol dosage in engineered yeast strains. In the human duodenum with environmental enteric dysfunction, we discovered pathology-dependent anti-co-methylated CpG sites in the vicinity of genes POU2AF1 and LSP1; such activity represents orchestrated methylation and demethylation along the same gene, which was previously unreported. The UEFTC offers much improved effectiveness in exact model-free function inference for data-driven knowledge discovery.

Availability and implementation: An open-source R package "UniExactFunTest" implementing the presented uniform exact function tests is available via CRAN at doi: 10.32614/CRAN.package.UniExactFunTest. Computer code to reproduce figures can be found in supplementary file "UEFTC-main.zip."
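
The concern raised in the Motivation, that a directionality score alone can be fooled by uneven marginals, can be illustrated with a small sketch. The code below is not the UEFTC and does not use the UniExactFunTest package; it assumes conditional entropy H(Y | X) as a stand-in directionality score (zero meaning "Y looks like a function of X"), and the two contingency tables and all function names are hypothetical, chosen only for illustration.

```python
import math


def conditional_entropy(table):
    """Return H(Y | X) in bits for a contingency table (rows = X levels, columns = Y levels)."""
    n = sum(sum(row) for row in table)
    h = 0.0
    for row in table:
        row_total = sum(row)
        for count in row:
            if count > 0:  # empty cells contribute nothing
                h -= (count / n) * math.log2(count / row_total)
    return h


def transpose(table):
    """Swap the roles of X and Y."""
    return [list(col) for col in zip(*table)]


# Hypothetical table 1: Y is a genuine nonmonotonic function of X.
functional = [[10, 0], [0, 10], [10, 0]]

# Hypothetical table 2: Y is constant, so X and Y are independent,
# yet the Y marginal is maximally uneven.
degenerate = [[10, 0], [10, 0], [10, 0]]

for name, table in [("functional", functional), ("degenerate", degenerate)]:
    h_y_given_x = conditional_entropy(table)
    h_x_given_y = conditional_entropy(transpose(table))
    print(f"{name}: H(Y|X) = {h_y_given_x:.3f}, H(X|Y) = {h_x_given_y:.3f}")
```

Both tables give H(Y | X) = 0, so this stand-in score rates the degenerate table as a perfect X-to-Y function even though its two variables are independent. A score of this kind carries no information about statistical significance, which is the gap a significance test for functional dependency is meant to close.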

Supplementary information: Supplementary data are available at Bioinformatics online.