Application of coincidence index in the discovery of co-expressed metabolic pathways.

Impact factor: 2.0 | Region 4 (Biology) | JCR Q4 (Biochemistry & Molecular Biology) | Journal: Physical Biology | Pub Date: 2024-08-29 | DOI: 10.1088/1478-3975/ad68b6
João Paulo Cassucci Dos Santos, Odemir Martinez Bruno
Citations: 0

Abstract

Analyzing transcription data requires intensive statistical analysis to obtain useful biological information and knowledge. A significant portion of this data is affected by random noise or even noise intrinsic to the modeling of the experiment. Without robust treatment, the data might not be explored thoroughly, and incorrect conclusions could be drawn. Examining the correlation between gene expression profiles is one way bioinformaticians extract information from transcriptomic experiments. However, the correlation measurements traditionally used have worrisome shortcomings that need to be addressed. This paper compares five previously published and tested correlation measurements to the newly developed coincidence index, a similarity measurement that combines the Jaccard and interiority indexes and generalizes them to vectors containing real values. We used microarray data from the archaeon Halobacterium salinarum and RNA-Seq data from the bacterium Escherichia coli to evaluate the capacity of each correlation/similarity measurement. The method explores co-expressed metabolic pathways by measuring the correlations between the expression levels of enzymes that share metabolites, represented as a weighted graph. It then searches for local maxima in this graph using a simulated annealing algorithm. We demonstrate that the coincidence index extracts larger, more comprehensive, and more statistically significant pathways for microarray experiments. In RNA-Seq experiments, the results are more limited, but the coincidence index achieved the largest percentage of significant components in the graph.
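
To make the idea of a coincidence index more concrete, here is a minimal sketch of one formulation found in the literature on this index: the product of a real-valued Jaccard index and an interiority (overlap) index. The abstract does not state the exact variant or any parameters the authors used, so the function name `coincidence_index`, the helper `_sign`, and the zero-vector handling are illustrative assumptions, not the paper's implementation.

```python
def _sign(value):
    """Return -1, 0 or 1 according to the sign of value (hypothetical helper)."""
    return (value > 0) - (value < 0)


def coincidence_index(x, y):
    """Similarity of two real-valued vectors (assumed formulation).

    Computed as real-valued Jaccard index * interiority index.
    Returns a value in [-1, 1]; 0.0 if either vector is all zeros.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Signed overlap: positions with opposite signs contribute negatively.
    signed_min = sum(
        _sign(a) * _sign(b) * min(abs(a), abs(b)) for a, b in zip(x, y)
    )
    # Unsigned overlap and union of magnitudes.
    abs_min = sum(min(abs(a), abs(b)) for a, b in zip(x, y))
    abs_max = sum(max(abs(a), abs(b)) for a, b in zip(x, y))
    # Total magnitude of the "smaller" vector, used by the interiority index.
    smaller_mass = min(sum(abs(a) for a in x), sum(abs(b) for b in y))

    if abs_max == 0 or smaller_mass == 0:
        return 0.0  # assumption: undefined cases are treated as no similarity

    jaccard = signed_min / abs_max
    interiority = abs_min / smaller_mass
    return jaccard * interiority
```

Under this sketch, two identical expression profiles score 1, sign-flipped profiles score -1, and profiles with no overlapping non-zero positions score 0. Such pairwise scores could serve as edge weights between enzymes that share metabolites, although how the paper builds and searches that graph is not detailed in the abstract.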

Source journal
Physical Biology (Biology: Biophysics)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 50
Review time: 3 months
Journal description: Physical Biology publishes articles in the broad interdisciplinary field bridging biology with the physical sciences and engineering. This journal focuses on research in which quantitative approaches (experimental, theoretical and modeling) lead to new insights into biological systems at all scales of space and time, and all levels of organizational complexity. Physical Biology accepts contributions from a wide range of biological sub-fields, including topics such as: molecular biophysics, including single molecule studies, protein-protein and protein-DNA interactions; subcellular structures, organelle dynamics, membranes, protein assemblies, chromosome structure; intracellular processes, e.g. cytoskeleton dynamics, cellular transport, cell division; systems biology, e.g. signaling, gene regulation and metabolic networks; cells and their microenvironment, e.g. cell mechanics and motility, chemotaxis, extracellular matrix, biofilms; cell-material interactions, e.g. biointerfaces, electrical stimulation and sensing, endocytosis; cell-cell interactions, cell aggregates, organoids, tissues and organs; developmental dynamics, including pattern formation and morphogenesis; physical and evolutionary aspects of disease, e.g. cancer progression, amyloid formation; neuronal systems, including information processing by networks, memory and learning; population dynamics, ecology, and evolution; collective action and emergence of collective phenomena.