BatchEval Pipeline: batch effect evaluation workflow for multiple datasets joint analysis.

GigaByte (Hong Kong, China) Pub Date: 2024-02-20 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.108
Chao Zhang, Qiang Kang, Mei Li, Hongqing Xie, Shuangsang Fang, Xun Xu

Abstract

As genomic sequencing technology advances, joint analysis of multiple transcriptomics datasets is becoming increasingly important. However, batch effects complicate dataset integration, for example when sequencing data are measured on different platforms or datasets are collected at different times. Here, we report the development of the BatchEval Pipeline, a workflow for evaluating batch effects in integrated datasets. The BatchEval Pipeline generates a comprehensive report consisting of a series of HTML pages: a main page, a raw dataset evaluation page, and several evaluation pages for the built-in batch effect removal methods. The main page presents basic information about the integrated datasets, a comprehensive batch effect score, and the most recommended method for removing batch effects from the current datasets. The remaining pages present evaluation details for the raw dataset and the evaluation results for each built-in method after batch effect removal. This report helps researchers identify and remove batch effects, yielding more reliable and meaningful biological insights from integrated datasets. In summary, the BatchEval Pipeline represents a significant advancement in batch effect evaluation and is a valuable tool for improving the accuracy and reliability of experimental results.

Availability & implementation: The source code of the BatchEval Pipeline is available at https://github.com/STOmics/BatchEval.
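
The evaluate-then-recommend idea described in the abstract can be illustrated with a minimal sketch. This is not the BatchEval API or its scoring scheme: the function names (`batch_score`, `center_per_batch`, `recommend`), the toy score (the fraction of variance explained by batch labels), and the single correction method are all assumptions made for illustration only.

```python
from statistics import mean, pvariance


def _group_by_batch(values, batches):
    """Collect values per batch label."""
    groups = {}
    for value, batch in zip(values, batches):
        groups.setdefault(batch, []).append(value)
    return groups


def batch_score(values, batches):
    """Toy score (assumption, not BatchEval's metric): share of the total
    variance that lies between batches. 0 means no batch effect, 1 means
    all variance is explained by batch membership."""
    total = pvariance(values)
    if total == 0:
        return 0.0
    grand_mean = mean(values)
    groups = _group_by_batch(values, batches)
    between = sum(
        len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
    ) / len(values)
    return between / total


def raw(values, batches):
    """Baseline: leave the data unchanged."""
    return list(values)


def center_per_batch(values, batches):
    """Hypothetical correction: subtract each batch's own mean."""
    means = {b: mean(g) for b, g in _group_by_batch(values, batches).items()}
    return [v - means[b] for v, b in zip(values, batches)]


def recommend(values, batches, methods):
    """Score the output of every method and return the lowest-scoring one."""
    scores = {
        name: batch_score(fn(values, batches), batches)
        for name, fn in methods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # One gene measured in two batches with a clear offset between them.
    values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
    batches = ["A", "A", "A", "B", "B", "B"]
    methods = {"raw": raw, "center_per_batch": center_per_batch}
    best, scores = recommend(values, batches, methods)
    print("recommended:", best)
    print("scores:", scores)
```

A real evaluation would combine several metrics over many genes and cells and would also check that biological signal is preserved; the sketch only shows the overall shape of scoring the raw data, scoring each candidate method, and reporting the best one.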
