A comparative analysis of mutual information methods for pairwise relationship detection in metagenomic data

BMC Bioinformatics | IF 2.9 | Region 3 (Biology) | JCR Q2 (Biochemical Research Methods) | Pub Date: 2024-08-14 | DOI: 10.1186/s12859-024-05883-7
Dallace Francis, Fengzhu Sun

A comparative analysis of mutual information methods for pairwise relationship detection in metagenomic data.

Background: Construction of co-occurrence networks in metagenomic data often employs correlation to infer pairwise relationships between microbes. However, biological systems are complex and often behave non-linearly. Reliance on correlation alone may therefore overlook important relationships and fail to capture the full intricacy of the underlying interaction networks. It is of interest to incorporate metrics that are robust in detecting not only linear relationships but non-linear ones as well.

Results: In this paper, we explore the use of various mutual information (MI) estimation approaches for quantifying pairwise relationships in biological data and compare their performance against two traditional measures: Pearson's correlation coefficient, r, and Spearman's rank correlation coefficient, ρ. Metrics are tested both on simulated data designed to mimic pairwise relationships that may be found in ecological systems and on real data from a previous study on C. diff infection. The results demonstrate that, in the case of asymmetric relationships, mutual information estimators can provide better detection ability than Pearson's or Spearman's correlation coefficients. Specifically, we find that these estimators perform better in the detection of exploitative relationships, demonstrating the potential benefit of including them in future metagenomic studies.

Conclusions: Mutual information (MI) can uncover complex pairwise relationships in biological data that may be missed by traditional measures of association. The inclusion of such relationships when constructing co-occurrence networks can result in a more comprehensive analysis than the use of correlation alone.
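
To make the contrast between correlation and MI concrete, the following is a minimal sketch, not the authors' method. It assumes a simple plug-in (histogram) MI estimator, which may differ from the estimators compared in the paper, and a hypothetical non-monotonic relationship (y = x^2 plus noise) rather than the paper's ecological simulations. The function names `pearson_r`, `spearman_rho` and `histogram_mi` and the choice of 10 bins are illustrative assumptions.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's correlation coefficient r."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman_rho(x, y):
    """Spearman's rho as Pearson's r on ranks.

    Assumes continuous data without ties (no tie correction is applied).
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson_r(rank_x, rank_y)


def histogram_mi(x, y, bins=10):
    """Plug-in mutual information estimate in nats from a 2D histogram.

    This estimator is an assumption made for illustration; it is biased
    upward for small samples and sensitive to the number of bins.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nonzero = pxy > 0                      # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))


# Hypothetical simulated data: a symmetric, non-monotonic dependence.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, n)
y_dep = x**2 + rng.normal(0.0, 0.05, n)
# Shuffling y keeps its marginal distribution but breaks the dependence on x.
y_ind = rng.permutation(y_dep)

print("Pearson r   (dependent):", pearson_r(x, y_dep))
print("Spearman rho (dependent):", spearman_rho(x, y_dep))
print("MI (dependent):  ", histogram_mi(x, y_dep))
print("MI (independent):", histogram_mi(x, y_ind))
```

In this toy setting both correlation coefficients stay near zero even though y is almost a deterministic function of x, while the MI estimate is clearly larger than the value obtained after shuffling. The shuffled value serves as a rough baseline for the estimator's positive bias under independence.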
