NPM: latent batch effects correction of omics data by nearest-pair matching.

Antonino Zito, Axel Martinelli, Mauro Masiero, Murodzhon Akhmedov, Ivo Kwee
Bioinformatics (Oxford, England). Published 2025-03-04. DOI: 10.1093/bioinformatics/btaf084 (https://doi.org/10.1093/bioinformatics/btaf084). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11925496/pdf/

Abstract



Motivation: Batch effects (BEs) are a predominant source of noise in omics data and often mask real biological signals. BEs remain common in existing datasets. Current methods for BE correction mostly rely on specific assumptions or complex models, and may not detect and adjust BEs adequately, which impairs downstream analysis and discovery power. To address these challenges, we developed NPM, a method based on nearest-neighbor matching that adjusts BEs and may outperform other methods across a wide range of datasets.
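
The general idea of matching-based correction can be illustrated with a small sketch. This is not the NPM implementation and does not reproduce the authors' algorithm; it is a simplified illustration under stated assumptions: samples are rows, phenotype labels are known, batch labels are unknown, and each sample's nearest neighbor in a different phenotype group is assumed to share its latent batch. The function name `nearest_pair_correct` and the mean-of-pair offset estimate are hypothetical choices made for this example.

```python
import numpy as np


def nearest_pair_correct(X, groups):
    """Illustrative matching-based correction (assumption: NOT the NPM algorithm).

    X      : array of shape (n_samples, n_features)
    groups : phenotype label for each sample (batch labels are not needed)
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    if np.unique(groups).size < 2:
        raise ValueError("need at least two phenotype groups to form cross-group pairs")

    # Center each phenotype group so that matching is not driven by phenotype.
    Xc = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        Xc[mask] -= Xc[mask].mean(axis=0)

    # Euclidean distances between all group-centered samples.
    dist = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)

    corrected = np.empty_like(X)
    for i in range(X.shape[0]):
        # Nearest sample that belongs to a different phenotype group.
        others = np.flatnonzero(groups != groups[i])
        j = others[np.argmin(dist[i, others])]
        # Assumption: what the matched pair shares after centering is batch-like.
        shared = 0.5 * (Xc[i] + Xc[j])
        corrected[i] = X[i] - shared
    return corrected
```

Calling `nearest_pair_correct(X, phenotype_labels)` returns a matrix of the same shape as `X`. Because the shared offset is estimated from group-centered profiles, the difference between phenotype group means is largely retained in this sketch, while variation common to matched pairs is removed. The all-pairs distance matrix makes it suitable only for small sample counts.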

Results: We assessed distinct metrics and graphical readouts, and compared our method to commonly used BE correction methods. NPM demonstrates the ability to correct for BEs while preserving biological differences, and it may outperform other methods on multiple metrics. Altogether, NPM proves to be a valuable BE correction approach for maximizing discovery in biomedical research, with applicability in clinical research, where latent BEs are often dominant.

Availability and implementation: NPM is freely available on GitHub (https://github.com/bigomics/NPM) and on Omics Playground (https://bigomics.ch/omics-playground). Code for the analyses is available in the same GitHub repository (https://github.com/bigomics/NPM). The datasets underlying this article are GSE120099, GSE82177, GSE162760, GSE171343, GSE153380, GSE163214, GSE182440, GSE163857, GSE117970, GSE173078, and GSE10846. All of these datasets are publicly available and can be freely accessed on the Gene Expression Omnibus (GEO) repository.

Supplementary information: Supplementary data are available at Bioinformatics online.
