Integrating Bacterial ChIP-seq and RNA-seq Data With SnakeChunks
Claire Rioualen, Lucie Charbonnier-Khamvongsa, Julio Collado-Vides, Jacques van Helden
Current Protocols in Bioinformatics, 66(1). Published 2019-02-20.
DOI: 10.1002/cpbi.72
https://onlinelibrary.wiley.com/doi/10.1002/cpbi.72
Abstract
Next-generation sequencing (NGS) is becoming a routine approach in most domains of the life sciences. To ensure reproducibility of results, there is a crucial need to improve the automation of NGS data processing and to enable forthcoming studies that rely on big datasets. Although user-friendly interfaces now exist, there remains a strong need for accessible solutions that allow experimental biologists to analyze and explore their results in an autonomous and flexible way. The protocols here describe a modular system that enables a user to compose and fine-tune workflows based on SnakeChunks, a library of rules for the Snakemake workflow engine. They are illustrated using a study combining ChIP-seq and RNA-seq to identify target genes of the global transcription factor FNR in Escherichia coli. This case study has the advantage that results can be compared with the most up-to-date collection of existing knowledge about transcriptional regulation in this model organism, extracted from the RegulonDB database. © 2019 by John Wiley & Sons, Inc.