
Source Code for Biology and Medicine: Latest Articles

Robust dose-response curve estimation applied to high content screening data analysis.
Q2 Decision Sciences Pub Date : 2014-12-10 eCollection Date: 2014-01-01 DOI: 10.1186/s13029-014-0027-x
Thuy Tuong Nguyen, Kyungmin Song, Yury Tsoy, Jin Yeop Kim, Yong-Jun Kwon, Myungjoo Kang, Michael Adsetts Edberg Hansen

Background and method: Automating sigmoidal curve fitting successfully is highly challenging for large data sets. In this paper, we describe a robust algorithm for fitting sigmoid dose-response curves by estimating four parameters (floor, window, shift, and slope), together with the detection of outliers. We propose two improvements over current methods for curve fitting. The first is outlier detection, which is performed during the initialization step with corresponding adjustments of the derivative and error estimation functions. The second is an improvement in the weighting of data points, achieved by using mean calculation in Tukey's biweight function.

Results and conclusion: Automatic curve fitting of 19,236 dose-response experiments shows that our proposed method outperforms the current fitting methods provided by MATLAB®'s nlinfit function and GraphPad's Prism software.
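
As a rough illustration of the ingredients named in this abstract, the Python sketch below defines a four-parameter sigmoid and Tukey's biweight weights. The parameterization (a base-10 logistic in `shift - x`), the tuning constant 4.685 and the MAD-based residual scale are assumptions chosen for illustration. The paper's own formulas, including its mean-based variant of the biweight, may differ, and the function names are hypothetical.

```python
import statistics


def sigmoid4(x, floor, window, shift, slope):
    """Four-parameter sigmoid (assumed form): rises from `floor` to
    `floor + window` and reaches the midpoint at x == shift."""
    return floor + window / (1.0 + 10.0 ** (slope * (shift - x)))


def tukey_biweight(residuals, c=4.685):
    """Tukey biweight weights for a list of residuals.

    Residuals are scaled by 1.4826 * MAD (an assumption; not the paper's
    mean-based scheme). Points with |r| >= c * scale get weight 0.
    """
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    scale = 1.4826 * mad
    if scale == 0.0:
        scale = 1.0  # avoid division by zero when residuals are all equal
    weights = []
    for r in residuals:
        u = r / (c * scale)
        weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
    return weights


# A gross outlier (8.0) is down-weighted to zero; small residuals keep
# weights close to one.
example_weights = tukey_biweight([0.1, -0.2, 0.0, 0.15, -0.1, 8.0])
```

In an iteratively reweighted fit, such weights would multiply the squared residuals at each iteration, so that outlying wells contribute little or nothing to the parameter update.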

Citations: 6
Onco-STS: a web-based laboratory information management system for sample and analysis tracking in oncogenomic experiments.
Q2 Decision Sciences Pub Date : 2014-12-05 eCollection Date: 2014-01-01 DOI: 10.1186/s13029-014-0025-z
Mike Gavrielides, Simon J Furney, Tim Yates, Crispin J Miller, Richard Marais

Background: Whole genomes, whole exomes and transcriptomes of tumour samples are sequenced routinely to identify the drivers of cancer. The systematic sequencing and analysis of tumour samples, as well as other oncogenomic experiments, necessitates the tracking of relevant sample information throughout the investigative process. These meta-data of the sequencing and analysis procedures include information about the samples and projects as well as the sequencing centres, platforms, data locations, results locations, alignments, analysis specifications and further information relevant to the experiments.

Results: The current work presents a sample tracking system for oncogenomic studies (Onco-STS) to store these data and make them easily accessible to the researchers who work with the samples. The system is a web application, which includes a database and a front-end web page that allows the remote access, submission and updating of the sample data in the database. The web application development programming framework Grails was used for the development and implementation of the system.

Conclusions: The resulting Onco-STS solution is efficient, secure and easy to use and is intended to replace the manual data handling of text records. Onco-STS allows simultaneous remote access to the system making collaboration among researchers more effective. The system stores both information on the samples in oncogenomic studies and details of the analyses conducted on the resulting data. Onco-STS is based on open-source software, is easy to develop and can be modified according to a research group's needs. Hence it is suitable for laboratories that do not require a commercial system.

Citations: 0
Mega2: validated data-reformatting for linkage and association analyses.
Q2 Decision Sciences Pub Date : 2014-12-05 eCollection Date: 2014-01-01 DOI: 10.1186/s13029-014-0026-y
Robert V Baron, Charles Kollar, Nandita Mukhopadhyay, Daniel E Weeks

Background: In a typical study of the genetics of a complex human disease, many different analysis programs are used to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilitate this can be tedious, time-consuming, and error-prone. To address these issues, the open source Mega2 data reformatting program provides validated and tested data conversions from several commonly-used input formats to many output formats.

Results: Mega2, the Manipulation Environment for Genetic Analysis, facilitates the creation of analysis-ready datasets from data gathered as part of a genetic study. It transparently allows users to process genetic data for family-based or case/control studies accurately and efficiently. In addition to data validation checks, Mega2 provides analysis setup capabilities for a broad choice of commonly-used genetic analysis programs. First released in 2000, Mega2 has recently been significantly improved in a number of ways. We have rewritten it in C++ and have reduced its memory requirements. Mega2 now can read input files in LINKAGE, PLINK, and VCF/BCF formats, as well as its own specialized annotated format. It supports conversion to many commonly-used formats including SOLAR, PLINK, Merlin, Mendel, SimWalk2, Cranefoot, IQLS, FBAT, MORGAN, BEAGLE, Eigenstrat, Structure, and PLINK/SEQ. When controlled by a batch file, Mega2 can be used non-interactively in data reformatting pipelines. Support for genetic data from several other species besides humans has been added.

Conclusions: By providing tested and validated data reformatting, Mega2 facilitates more accurate and extensive analyses of genetic data, avoiding the need to write, debug, and maintain one's own custom data reformatting scripts. Mega2 is freely available at https://watson.hgen.pitt.edu/register/.

Citations: 9
Three algorithms and SAS macros for estimating power and sample size for logistic models with one or more independent variables of interest in the presence of covariates.
Q2 Decision Sciences Pub Date : 2014-11-15 eCollection Date: 2014-01-01 DOI: 10.1186/1751-0473-9-24
David Keith Williams, Zoran Bursac

Background: Commonly when designing studies, researchers propose to measure several independent variables in a regression model, a subset of which are identified as the main variables of interest while the rest are retained in a model as covariates or confounders. Power for linear regression in this setting can be calculated using SAS PROC POWER. There exists a void in estimating power for the logistic regression models in the same setting.

Methods: Currently, an approach that calculates power for only one variable of interest in the presence of other covariates for logistic regression is in common use and works well for this special case. In this paper we propose three related algorithms along with corresponding SAS macros that extend power estimation for one or more primary variables of interest in the presence of some confounders.

Results: The three proposed empirical algorithms employ the likelihood ratio test to provide a user with a power estimate for a given sample size, a quick sample size estimate for a given power, or an approximate power curve for a range of sample sizes. A user can specify odds ratios for a combination of binary, uniform and standard normal independent variables of interest and/or the remaining covariates/confounders in the model, along with a correlation between variables.

Conclusions: These user-friendly algorithms and macro tools are a promising way to fill the void in power estimation for logistic regression when multiple independent variables are of interest and additional covariates are present in the model.
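
The general idea of empirical power estimation with a likelihood ratio test can be sketched in Python for the simplest possible case: one binary variable of interest and no covariates, where the logistic model's maximum likelihood fit reduces to the two group proportions. This is a deliberate simplification and not the paper's SAS macros. The function names, the baseline probability `p0`, the equal group allocation and the 5% significance level are all assumptions made for illustration.

```python
import math
import random

# 95th percentile of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_1DF = 3.841458820694124


def _binomial_loglik(successes, n):
    """Binomial log-likelihood evaluated at the MLE p = successes / n.
    Terms of the form 0 * log(0) are treated as 0."""
    ll = 0.0
    if successes > 0:
        ll += successes * math.log(successes / n)
    if successes < n:
        ll += (n - successes) * math.log((n - successes) / n)
    return ll


def lrt_statistic(k0, n0, k1, n1):
    """Likelihood ratio statistic comparing separate group proportions
    (full model) with one pooled proportion (null model)."""
    full = _binomial_loglik(k0, n0) + _binomial_loglik(k1, n1)
    null = _binomial_loglik(k0 + k1, n0 + n1)
    return 2.0 * (full - null)


def empirical_power(n, odds_ratio, p0=0.3, n_sims=2000, seed=1):
    """Fraction of simulated studies of total size n in which the LRT
    rejects at the 5% level, for a binary predictor with the given
    odds ratio and baseline event probability p0."""
    rng = random.Random(seed)
    odds1 = odds_ratio * p0 / (1.0 - p0)
    p1 = odds1 / (1.0 + odds1)
    n0 = n // 2
    n1 = n - n0
    rejections = 0
    for _ in range(n_sims):
        k0 = sum(rng.random() < p0 for _ in range(n0))
        k1 = sum(rng.random() < p1 for _ in range(n1))
        if lrt_statistic(k0, n0, k1, n1) > CHI2_CRIT_1DF:
            rejections += 1
    return rejections / n_sims
```

Calling `empirical_power` over a range of `n` values gives an approximate power curve. Handling several variables of interest with correlated covariates, as the abstract describes, would require fitting full and reduced logistic models numerically rather than using group proportions.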

Citations: 0
Publication quality 2D graphs with less manual effort due to explicit use of dual coordinate systems
Q2 Decision Sciences Pub Date : 2014-10-21 DOI: 10.1186/1751-0473-9-22
Daan Wagenaar
Citations: 2
PREdator: a python based GUI for data analysis, evaluation and fitting
Q2 Decision Sciences Pub Date : 2014-09-24 DOI: 10.1186/1751-0473-9-21
C. Wiedemann, Peter Bellstedt, M. Görlach
Citations: 2
BioFlow: a web based workflow management software for design and execution of genomics pipelines
Q2 Decision Sciences Pub Date : 2014-09-18 DOI: 10.1186/1751-0473-9-20
H. Garner, Ashwin Puthige
Citations: 1
EC2KEGG: a command line tool for comparison of metabolic pathways.
Q2 Decision Sciences Pub Date : 2014-09-02 eCollection Date: 2014-01-01 DOI: 10.1186/1751-0473-9-19
Aleksey Porollo

Background: Next-generation sequencing and metagenome projects yield a large number of new genomes that need further annotations, such as identification of enzymes and metabolic pathways, or analysis of metabolic strategies of newly sequenced species in comparison to known organisms. While methods for enzyme identification are available, development of the command line tools for high-throughput comparative analysis and visualization of identified enzymes is lagging.

Methods: A set of Perl scripts has been developed to perform automated data retrieval from the KEGG database using its new REST application programming interface. Enrichment or depletion in metabolic pathways is evaluated using the two-tailed Fisher exact test followed by Benjamini and Hochberg correction.

Results: Comparative analysis of a given set of enzymes with a specified reference organism includes mapping to known metabolic pathways, finding shared and unique enzymes, generating links to visualize maps at KEGG Pathway, computing enrichment of the pathways, and listing the non-mapped enzymes.

Conclusions: EC2KEGG provides a platform independent toolkit for automated comparison of identified sets of enzymes from newly sequenced organisms against annotated reference genomes. The tool can be used both for manual annotations of individual species and for high-throughput annotations as part of a computational pipeline. The tool is publicly available at http://sourceforge.net/projects/ec2kegg/.
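
The statistical step named in the Methods (a two-tailed Fisher exact test followed by Benjamini and Hochberg correction) can be sketched in plain Python as follows. This is not EC2KEGG's Perl code. The function names and the layout of the 2x2 table (enzymes inside and outside a pathway, for the query set and the reference organism) are assumptions for illustration, and the two-tailed p-value uses the common convention of summing all tables no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are query / reference, columns are
    enzymes in the pathway / enzymes not in the pathway.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1.0 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

One p-value would be computed per pathway and the whole list passed to `benjamini_hochberg`, so that the reported enrichment or depletion calls control the false discovery rate across pathways.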

Citations: 18
VIRAPOPS2 supports the influenza virus reassortments.
Q2 Decision Sciences Pub Date : 2014-08-17 eCollection Date: 2014-01-01 DOI: 10.1186/1751-0473-9-18
Michel Petitjean, Anne Vanet

Background: For over 400 years, influenza viruses have evolved extremely quickly and caused devastating epidemics because their segmented genomes reassort. Reassortment arises when two flu viruses infect the same cell: the genomes of the new virions are then composed of segments drawn from both parental strains. A treatment developed against the parental strains could then be ineffective if the virions' genomes differ enough from their parents' genomes. It is therefore essential to simulate such reassortment phenomena to assess the risk that a new flu strain will emerge.

Findings: We therefore upgraded the forward simulator VIRAPOPS, which already contained the options needed to handle non-segmented viral populations. The new version can mimic single or successive reassortments in birds, humans and/or swine. Other options, such as the ability to treat populations of positive- or negative-sense viral RNAs, were also added. Finally, we propose output options that give statistics of the results.

Conclusion: In this paper we present a new version of VIRAPOPS that now handles viral segment reassortment and negative-sense single-strand RNA viruses, both of which cause serious public health problems.
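
To make the reassortment mechanism described in the Background concrete, the short Python sketch below enumerates the genomes that can arise when two influenza A viruses co-infect one cell and each of the eight segments is inherited from either parent. It is a toy model and not VIRAPOPS code. The function names, the strain labels and the assumption that every segment is chosen independently and uniformly are illustrative only.

```python
import itertools
import random

# The eight genome segments of influenza A viruses.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def all_reassortants(parent_a, parent_b):
    """Enumerate every genome that takes each segment from one of the
    two parents (2 ** 8 = 256 combinations, parents included)."""
    choices = [(parent_a[s], parent_b[s]) for s in SEGMENTS]
    return [dict(zip(SEGMENTS, combo))
            for combo in itertools.product(*choices)]


def random_reassortant(parent_a, parent_b, rng):
    """Draw one progeny genome, picking each segment from either parent
    with equal probability (an assumed, simplified inheritance model)."""
    return {s: rng.choice((parent_a[s], parent_b[s])) for s in SEGMENTS}


# Hypothetical parental strains, labelled only by host of origin.
avian = {s: "avian-" + s for s in SEGMENTS}
human = {s: "human-" + s for s in SEGMENTS}

genomes = all_reassortants(avian, human)
novel = [g for g in genomes if g != avian and g != human]
print(len(genomes), len(novel))  # prints "256 254"
```

A forward simulator would apply a step like `random_reassortant` repeatedly across a population and over successive generations, which is what makes successive reassortments possible.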

Citations: 2
Easyworm: an open-source software tool to determine the mechanical properties of worm-like chains.
Q2 Decision Sciences Pub Date : 2014-07-10 eCollection Date: 2014-01-01 DOI: 10.1186/1751-0473-9-16
Guillaume Lamour, Julius B Kirkegaard, Hongbin Li, Tuomas Pj Knowles, Jörg Gsponer

Background: A growing spectrum of applications for natural and synthetic polymers, whether in industry or in biomedical research, demands fast and universally applicable tools to determine the mechanical properties of very diverse polymers. To date, determining these properties has been the privilege of a limited circle of biophysicists and engineers with appropriate technical skills.

Findings: Easyworm is a user-friendly software suite coded in MATLAB that simplifies the image analysis of individual polymeric chains and the extraction of their mechanical properties. Easyworm contains a comprehensive set of tools that, among other things, allow the persistence length of single chains and the Young's modulus of elasticity to be calculated in multiple ways from images of polymers obtained by a variety of techniques (e.g. atomic force, electron, phase-contrast, or epifluorescence microscopy).

Conclusions: Easyworm thus provides a simple and efficient tool for specialists and non-specialists alike to solve a common problem in (bio)polymer science. Stand-alone executables and shell scripts are provided along with source code for further development.
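
To illustrate what calculating a persistence length and a Young's modulus from chain measurements can involve, the sketch below uses two standard worm-like-chain relations: the mean-square end-to-end distance of a chain equilibrated in two dimensions, <R^2> = 4PL[1 - (2P/L)(1 - exp(-L/(2P)))], and the rod relation E = P k_B T / I. This is a minimal Python sketch under stated assumptions, not Easyworm's MATLAB code: the function names, the golden-section least-squares fit, the search bracket, and the circular cross-section are all hypothetical choices made for this example, and the input data are synthetic.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def wlc_msd_2d(contour_length, persistence_length):
    """Mean-square end-to-end distance <R^2> of a worm-like chain equilibrated in 2D."""
    x = contour_length / (2.0 * persistence_length)
    # expm1(-x) equals exp(-x) - 1 and stays accurate when x is small,
    # so the bracket below is 1 - (1 - exp(-x)) / x.
    return 4.0 * persistence_length * contour_length * (1.0 + math.expm1(-x) / x)


def fit_persistence_length(samples, lower=1e-3, upper=1e6, iterations=200):
    """Least-squares estimate of P from (contour length, squared distance) pairs.

    Assumption: the squared error has a single minimum inside [lower, upper],
    which holds for noise-free data but is not guaranteed for noisy data.
    """
    def sse(log_p):
        p = math.exp(log_p)
        return sum((wlc_msd_2d(length, p) - r2) ** 2 for length, r2 in samples)

    # Golden-section search over log(P).
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lower), math.log(upper)
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return math.exp((a + b) / 2.0)


def youngs_modulus(persistence_length_m, radius_m, temperature_k=298.0):
    """E = P * k_B * T / I for a homogeneous rod with a circular cross-section."""
    area_moment = math.pi * radius_m ** 4 / 4.0  # second moment of area, m^4
    return persistence_length_m * K_B * temperature_k / area_moment


if __name__ == "__main__":
    # Synthetic, noise-free data generated with P = 50 nm (illustrative values only).
    samples = [(L, wlc_msd_2d(L, 50.0)) for L in (20.0, 50.0, 100.0, 200.0, 400.0)]
    p_nm = fit_persistence_length(samples)
    print(f"fitted P = {p_nm:.3f} nm")
    print(f"E = {youngs_modulus(p_nm * 1e-9, 1e-9):.3e} Pa")
```

The fit works in log(P) because plausible persistence lengths span several orders of magnitude. With real, noisy measurements a dedicated optimizer and an uncertainty estimate would be more appropriate than this bare search.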

Citations: 69