Antonino Zito, Axel Martinelli, Mauro Masiero, Murat Akhmedov, Ivo Kwee
{"title":"NPM:用最近配对法对 Omics 数据进行潜在批次效应校正。","authors":"Antonino Zito, Axel Martinelli, Mauro Masiero, Murat Akhmedov, Ivo Kwee","doi":"10.1093/bioinformatics/btaf084","DOIUrl":null,"url":null,"abstract":"<p><strong>Motivation: </strong>Batch effects (BEs) are a predominant source of noise in omics data and often mask real biological signals. BEs remain common in existing datasets. Current methods for BE correction mostly rely on specific assumptions or complex models, and may not detect and adjust BEs adequately, impacting downstream analysis and discovery power. To address these challenges we developed NPM, a nearest-neighbor matching-based method that adjusts BEs and may outperform other methods in a wide range of datasets.</p><p><strong>Results: </strong>We assessed distinct metrics and graphical readouts, and compared our method to commonly used BE correction methods. NPM demonstrates ability in correcting for BEs, while preserving biological differences. It may outperform other methods based on multiple metrics. Altogether, NPM proves to be a valuable BE correction approach to maximize discovery in biomedical research, with applicability in clinical research where latent BEs are often dominant.</p><p><strong>Availability: </strong>NPM is freely available on GitHub (https://github.com/bigomics/NPM) and on Omics Playground (https://bigomics.ch/omics-playground). Computer codes for analyses are available at (https://github.com/bigomics/NPM). The datasets underlying this article are the following: GSE120099, GSE82177, GSE162760, GSE171343, GSE153380, GSE163214, GSE182440, GSE163857, GSE117970, GSE173078, GSE10846. All these datasets are publicly available and can be freely accessed on the Gene Expression Omnibus (GEO) repository.</p><p><strong>Supplementary information: </strong>Supplementary data are available at Bioinformatics online.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"NPM: Latent Batch Effects Correction of Omics data by Nearest Pair Matching.\",\"authors\":\"Antonino Zito, Axel Martinelli, Mauro Masiero, Murat Akhmedov, Ivo Kwee\",\"doi\":\"10.1093/bioinformatics/btaf084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Motivation: </strong>Batch effects (BEs) are a predominant source of noise in omics data and often mask real biological signals. BEs remain common in existing datasets. Current methods for BE correction mostly rely on specific assumptions or complex models, and may not detect and adjust BEs adequately, impacting downstream analysis and discovery power. To address these challenges we developed NPM, a nearest-neighbor matching-based method that adjusts BEs and may outperform other methods in a wide range of datasets.</p><p><strong>Results: </strong>We assessed distinct metrics and graphical readouts, and compared our method to commonly used BE correction methods. NPM demonstrates ability in correcting for BEs, while preserving biological differences. It may outperform other methods based on multiple metrics. Altogether, NPM proves to be a valuable BE correction approach to maximize discovery in biomedical research, with applicability in clinical research where latent BEs are often dominant.</p><p><strong>Availability: </strong>NPM is freely available on GitHub (https://github.com/bigomics/NPM) and on Omics Playground (https://bigomics.ch/omics-playground). Computer codes for analyses are available at (https://github.com/bigomics/NPM). The datasets underlying this article are the following: GSE120099, GSE82177, GSE162760, GSE171343, GSE153380, GSE163214, GSE182440, GSE163857, GSE117970, GSE173078, GSE10846. All these datasets are publicly available and can be freely accessed on the Gene Expression Omnibus (GEO) repository.</p><p><strong>Supplementary information: </strong>Supplementary data are available at Bioinformatics online.</p>\",\"PeriodicalId\":93899,\"journal\":{\"name\":\"Bioinformatics (Oxford, England)\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-02-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bioinformatics (Oxford, England)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/bioinformatics/btaf084\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics (Oxford, England)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioinformatics/btaf084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
NPM: Latent Batch Effects Correction of Omics data by Nearest Pair Matching.
Motivation: Batch effects (BEs) are a predominant source of noise in omics data and often mask real biological signals. BEs remain common in existing datasets. Current methods for BE correction mostly rely on specific assumptions or complex models, and may not detect and adjust BEs adequately, impacting downstream analysis and discovery power. To address these challenges we developed NPM, a nearest-neighbor matching-based method that adjusts BEs and may outperform other methods in a wide range of datasets.
Results: We assessed distinct metrics and graphical readouts, and compared our method to commonly used BE correction methods. NPM demonstrates ability in correcting for BEs, while preserving biological differences. It may outperform other methods based on multiple metrics. Altogether, NPM proves to be a valuable BE correction approach to maximize discovery in biomedical research, with applicability in clinical research where latent BEs are often dominant.
Availability: NPM is freely available on GitHub (https://github.com/bigomics/NPM) and on Omics Playground (https://bigomics.ch/omics-playground). Computer codes for analyses are available at (https://github.com/bigomics/NPM). The datasets underlying this article are the following: GSE120099, GSE82177, GSE162760, GSE171343, GSE153380, GSE163214, GSE182440, GSE163857, GSE117970, GSE173078, GSE10846. All these datasets are publicly available and can be freely accessed on the Gene Expression Omnibus (GEO) repository.
Supplementary information: Supplementary data are available at Bioinformatics online.