Combining compositional data sets introduces error in covariance network reconstruction.
James D Brunner, Aaron J Robinson, Patrick S G Chain
ISME communications, 4(1), ycae057. Published 2024-04-19. DOI: https://doi.org/10.1093/ismeco/ycae057
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135214/pdf/
Abstract
Microbial communities are diverse biological systems that include taxa from across multiple kingdoms of life. Notably, interactions between bacteria and fungi play a significant role in determining community structure. However, with standard network inference techniques, statistical associations across kingdoms are more difficult to infer than intra-kingdom associations because of the nature of the data involved. We quantify the challenges of cross-kingdom network inference from both theoretical and practical points of view using synthetic and real-world microbiome data. We detail the theoretical issue presented by combining compositional data sets drawn from the same environment, e.g. 16S and ITS sequencing of a single set of samples, and we survey common network inference techniques for their ability to handle this error. We then test these techniques for the accuracy and usefulness of their intra- and inter-kingdom associations by inferring networks from a set of simulated samples for which a ground-truth set of associations is known. We show that while the two methods mitigate the error of cross-kingdom inference, there is little difference between techniques for key practical applications including identification of strong correlations and identification of possible keystone taxa (i.e. hub nodes in the network). Furthermore, we identify a signature of the error caused by cross-kingdom network inference and demonstrate that it appears in networks constructed using real-world environmental microbiome data.
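The compositional issue mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or code; it is a minimal illustration of a general property of compositional data, under assumptions chosen here for convenience: three bacterial and three fungal taxa with independent log-normal absolute abundances, each block normalized separately (as 16S and ITS data would be) and then concatenated. All function names and parameters are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def close(sample):
    """Normalize one sample to relative abundances (parts sum to 1)."""
    total = sum(sample)
    return [v / total for v in sample]


def mean_corr(columns, pairs):
    """Average Pearson correlation over the given pairs of taxa."""
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


def simulate(n_samples=2000, n_bacteria=3, n_fungi=3, seed=0):
    # Assumption: every taxon is independent of every other taxon,
    # so the ground-truth association network has no edges at all.
    rng = random.Random(seed)
    absolute, combined = [], []
    for _ in range(n_samples):
        bacteria = [rng.lognormvariate(0.0, 0.5) for _ in range(n_bacteria)]
        fungi = [rng.lognormvariate(0.0, 0.5) for _ in range(n_fungi)]
        absolute.append(bacteria + fungi)
        # Each data set is closed on its own, then the two are concatenated.
        combined.append(close(bacteria) + close(fungi))

    abs_cols = list(zip(*absolute))
    rel_cols = list(zip(*combined))

    n_total = n_bacteria + n_fungi
    all_pairs = [(i, j) for i in range(n_total) for j in range(i + 1, n_total)]
    # A pair is "within" if both taxa come from the same block.
    within = [(i, j) for i, j in all_pairs if (i < n_bacteria) == (j < n_bacteria)]
    cross = [(i, j) for i, j in all_pairs if (i < n_bacteria) != (j < n_bacteria)]

    return {
        "absolute_within": mean_corr(abs_cols, within),
        "relative_within": mean_corr(rel_cols, within),
        "relative_cross": mean_corr(rel_cols, cross),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:+.3f}")
```

In this toy setting the within-block correlations of the relative abundances are pushed strongly negative by the separate normalization (for D exchangeable parts the population value is -1/(D-1)), while the cross-block correlations remain near zero. The within-kingdom and cross-kingdom entries of the combined matrix are therefore distorted differently, which is one simple way to see why they are not directly comparable.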