Addressing Sample Mix-Ups: Tools and Approaches for Large-Scale Multi-Omics Studies
Yingxue Fu, Zuo-Fei Yuan, Long Wu, Junmin Peng, Xusheng Wang, Anthony A High
Proteomics, e202400271 (Epub 2024-12-10; publication date 2025-01-01)
DOI: 10.1002/pmic.202400271 (https://doi.org/10.1002/pmic.202400271)
Abstract
Advances in high-throughput omics technologies have enabled system-wide characterization of biological samples across multiple molecular levels, such as the genome, transcriptome, and proteome. However, as sample sizes rapidly increase in large-scale multi-omics studies, sample mix-ups have become a prevalent issue, compromising data integrity and leading to erroneous conclusions. The interconnected nature of multi-omics data presents an opportunity to identify and correct these errors. This review examines the potential sources of sample mix-ups and evaluates the methodologies and tools developed for detecting and correcting these errors, with an emphasis on approaches applicable to proteomics data. We categorize existing tools into three main groups: expression/protein quantitative trait loci-based, genotype concordance-based, and gene/protein expression correlation-based approaches. Notably, only a handful of tools currently utilize the proteogenomics approach for correcting sample mix-ups at the proteomics level. Integrating the strengths of current tools across diverse data types could enable the development of more versatile and comprehensive solutions. In conclusion, verifying sample identity is a critical first step to reduce bias and increase precision in subsequent analyses for large-scale multi-omics studies. By leveraging these tools for identifying and correcting sample mix-ups, researchers can significantly improve the reliability and reproducibility of biomedical research.
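As a rough illustration of the third category named in the abstract (gene/protein expression correlation-based approaches), the sketch below pairs each transcriptome profile with the proteome profile it correlates with best and flags samples whose best partner carries a different label. It is a minimal sketch under stated assumptions, not the method of any tool covered by the review: the function names `zscore_features` and `flag_mismatches`, the input layout (samples in rows, matched genes in columns), and the use of a plain Pearson correlation on gene-wise z-scores are all hypothetical choices made for this example.

```python
import numpy as np


def zscore_features(x):
    """Standardize each gene (column) across samples (rows)."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def flag_mismatches(rna, protein, sample_ids):
    """Flag samples whose best-correlated proteome profile has another label.

    Assumed inputs (hypothetical layout): `rna` and `protein` are arrays of
    shape (n_samples, n_genes), rows in the same labeled order and columns
    restricted to the same genes. Returns a list of
    (rna_label, best_matching_protein_label) for every disagreement.
    """
    rz = zscore_features(np.asarray(rna, dtype=float))
    pz = zscore_features(np.asarray(protein, dtype=float))
    n = rz.shape[0]
    # np.corrcoef stacks both row sets; the upper-right block holds the
    # RNA-sample versus protein-sample Pearson correlations.
    corr = np.corrcoef(rz, pz)[:n, n:]
    best = corr.argmax(axis=1)
    return [(sample_ids[i], sample_ids[j]) for i, j in enumerate(best) if i != j]


if __name__ == "__main__":
    # Synthetic demonstration data, not real measurements.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(6, 200))
    rna = base + 0.1 * rng.normal(size=base.shape)
    protein = base + 0.1 * rng.normal(size=base.shape)
    protein[[1, 4]] = protein[[4, 1]]  # simulate a swap of two proteome samples
    ids = ["S1", "S2", "S3", "S4", "S5", "S6"]
    print(flag_mismatches(rna, protein, ids))
```

In real data the mRNA-protein correlation is far weaker than in this synthetic demo, so a practical workflow would likely restrict the comparison to genes with strong cis correlation and inspect the margin between the best and second-best match before relabeling anything.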
About the journal:
PROTEOMICS is the premier international source for information on all aspects of applications and technologies, including software, in proteomics and other "omics". The journal includes but is not limited to proteomics, genomics, transcriptomics, metabolomics and lipidomics, and systems biology approaches. Papers describing novel applications of proteomics and integration of multi-omics data and approaches are especially welcome.