Pub Date: 2016-07-11. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0057-7
Lindsay V Clark, Erik J Sacks
Background: In genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq), read depth is important for assessing the quality of genotype calls and estimating allele dosage in polyploids. However, existing pipelines for GBS and RAD-seq do not provide read counts in formats that are both accurate and easy to access. Additionally, although existing pipelines allow previously-mined SNPs to be genotyped on new samples, they do not allow the user to manually specify a subset of loci to examine. Pipelines that do not use a reference genome assign arbitrary names to SNPs, making meta-analysis across projects difficult.
Results: We created the software TagDigger, which includes three programs for analyzing GBS and RAD-seq data. The first script, tagdigger_interactive.py, rapidly extracts read counts and genotypes from FASTQ files using user-supplied sets of barcodes and tags. Input and output are in CSV format so that files can be opened with spreadsheet software. Tag sequences can also be imported from the Stacks, TASSEL-GBSv2, TASSEL-UNEAK, or pyRAD pipelines, and a separate file listing the names of markers to retain can be imported. A second script, tag_manager.py, consolidates marker names and sequences across multiple projects. A third script, barcode_splitter.py, assists with preparing FASTQ data for deposit in a public archive by splitting FASTQ files by barcode and generating MD5 checksums for the resulting files.
Conclusions: TagDigger is open-source and freely available software written in Python 3. It uses a scalable, rapid search algorithm that can process over 100 million FASTQ reads per hour. TagDigger will run on a laptop with any operating system, does not consume hard drive space with intermediate files, and does not require programming skill to use.
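The core counting idea described above, matching a read's barcode and then tallying known tag sequences, can be sketched in a few lines of Python. This is a toy illustration only, not TagDigger's actual implementation; the function name and the exact-prefix matching strategy are invented for the example.

```python
# Toy sketch of tag counting in FASTQ data. NOT TagDigger's implementation;
# names and matching strategy are invented for illustration.

def count_tags(fastq_lines, barcode, tags):
    """Count reads that begin with `barcode` followed by a known tag."""
    counts = {tag: 0 for tag in tags}
    # FASTQ records are 4 lines each; the sequence is the 2nd line.
    for i in range(1, len(fastq_lines), 4):
        seq = fastq_lines[i]
        if not seq.startswith(barcode):
            continue
        rest = seq[len(barcode):]
        for tag in tags:
            if rest.startswith(tag):
                counts[tag] += 1
                break
    return counts
```

The resulting dictionary of per-tag read counts maps naturally onto one row of a CSV read-depth table per sample.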
TagDigger: user-friendly extraction of read counts from GBS and RAD-seq data. Source Code for Biology and Medicine, article 11.
Background: Whole exome sequencing (WES) has given researchers access to a highly enriched subset of the human genome in which to search for variants that are likely to be pathogenic and may provide important insights into disease mechanisms. In developing countries, bioinformatics capacity and expertise are severely limited, and wet-bench scientists are required to take on the challenging task of understanding and implementing the barrage of bioinformatics tools available to them.
Results: We designed a novel method for the filtration of WES data called TAPER™ (Tool for Automated selection and Prioritization for Efficient Retrieval of sequence variants).
Conclusions: TAPER™ implements a set of logical steps to prioritize candidate variants that could be associated with disease, and is aimed at biomedical laboratories with limited bioinformatics capacity. TAPER™ is free, can be set up on a Windows operating system (Windows 7 and above), and does not require any programming knowledge. In summary, we have developed a freely available tool that simplifies variant prioritization from WES data to facilitate discovery of disease-causing genes.
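The kind of rule-based filtering and ranking TAPER™ performs can be illustrated with a minimal sketch. The thresholds, field names, and consequence categories below are invented for illustration and are not TAPER™'s published rule set.

```python
# Hedged sketch of rule-based variant prioritization. Thresholds, field
# names, and categories are invented; this is not TAPER's actual logic.

DAMAGING = frozenset({"stopgain", "frameshift", "missense"})

def prioritize(variants, max_pop_freq=0.01, damaging=DAMAGING):
    """Keep rare variants with a potentially damaging consequence,
    ranked by ascending population allele frequency."""
    kept = [v for v in variants
            if v["pop_freq"] <= max_pop_freq and v["consequence"] in damaging]
    return sorted(kept, key=lambda v: v["pop_freq"])
```

A fixed, transparent rule chain like this is what makes such a tool usable without programming expertise: every exclusion can be explained by pointing at a single rule.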
A new tool for prioritization of sequence variants from whole exome sequencing data. Source Code for Biology and Medicine, article 10. (Pub Date: 2016-07-01. DOI: 10.1186/s13029-016-0056-8)
Pub Date: 2016-06-24. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0055-9
John M Macdonald, Paul C Boutros
Background: To reproduce and report a bioinformatics analysis, it is important to be able to determine the environment in which a program was run. That information can also be valuable when debugging why different executions give unexpectedly different results.
Results: Log::ProgramInfo is a Perl module that writes a log file at the termination of execution of the enclosing program, to document useful execution characteristics. This log file can be used to re-create the environment in order to reproduce an earlier execution. It can also be used to compare the environments of two executions to determine whether there were any differences that might affect (or explain) their operation.
Availability: The source is available on CPAN (Macdonald and Boutros, Log-ProgramInfo. http://search.cpan.org/~boutroslb/Log-ProgramInfo/).
Conclusion: Using Log::ProgramInfo in programs that create result data for publishable research, and including its output log as part of the publication of that research, helps others duplicate the programming environment as a precursor to validating and extending that research.
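The write-a-log-at-termination idea is easy to illustrate. The sketch below is a Python analogue of the concept, not the Perl module itself; the field set and function names are invented for the example.

```python
# Python analogue of the Log::ProgramInfo concept: collect run
# characteristics and write them to a log file when the program exits.
# Field names are invented, not the module's actual output format.

import atexit
import json
import os
import platform
import sys
import time

def collect_run_info(start_time):
    """Gather execution characteristics useful for reproducing a run."""
    return {
        "program": sys.argv[0],
        "args": sys.argv[1:],
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "cwd": os.getcwd(),
        "elapsed_sec": round(time.time() - start_time, 3),
    }

def enable_run_log(path="program-info.log"):
    """Register an exit hook that writes the run log at termination,
    mirroring the write-at-termination behavior described above."""
    start = time.time()
    atexit.register(
        lambda: open(path, "w").write(
            json.dumps(collect_run_info(start), indent=2)))
```

Comparing two such logs side by side is how one would detect an environment difference that explains divergent results.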
Log::ProgramInfo: A Perl module to collect and log data for bioinformatics pipelines. Source Code for Biology and Medicine, article 9.
Pub Date: 2016-06-18. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0051-0
Caleb F Davis, Deborah I Ritter, David A Wheeler, Hongmei Wang, Yan Ding, Shannon P Dugan, Matthew N Bainbridge, Donna M Muzny, Pulivarthi H Rao, Tsz-Kwong Man, Sharon E Plon, Richard A Gibbs, Ching C Lau
Background: Genomic deletions, inversions, and other rearrangements known collectively as structural variations (SVs) are implicated in many human disorders. Technologies for sequencing DNA provide a potentially rich source of information in which to detect breakpoints of structural variations at base-pair resolution. However, accurate prediction of SVs remains challenging, and existing informatics tools predict rearrangements with significant rates of false positives or negatives.
Results: To address this challenge, we developed 'Structural Variation detection by STAck and Tail' (SV-STAT), which implements a novel scoring metric. The software uses this statistic to quantify evidence for structural variation in genomic regions suspected of harboring rearrangements. To demonstrate SV-STAT, we used targeted and genome-wide approaches. First, we applied a custom capture array followed by Roche/454 sequencing and SV-STAT to three pediatric B-lineage acute lymphoblastic leukemias, identifying five structural variations joining known and novel breakpoint regions. Next, we detected SVs genome-wide in paired-end Illumina data collected from additional tumor samples. SV-STAT showed predictive accuracy as high as or higher than leading alternatives. The software is freely available under the terms of the GNU General Public License version 3 at https://gitorious.org/svstat/svstat.
Conclusions: SV-STAT works across multiple sequencing chemistries, paired and single-end technologies, targeted or whole-genome strategies, and it complements existing SV-detection software. The method is a significant advance towards accurate detection and genotyping of genomic rearrangements from DNA sequencing data.
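The "stack" intuition, that independent reads whose clipped tails pile up at the same coordinate constitute evidence for a breakpoint, can be illustrated with a toy score. This sketch is loosely inspired by the description above and is not SV-STAT's published statistic; a simple support count stands in for it.

```python
# Toy illustration of breakpoint evidence from clipped-read "stacks".
# NOT SV-STAT's statistic: a plain support count is used for clarity.

from collections import Counter

def breakpoint_scores(clip_positions, min_support=2):
    """clip_positions: genomic coordinates where individual reads are
    soft-clipped. Returns {position: support} for positions where at
    least `min_support` reads agree, i.e. candidate breakpoints."""
    stacks = Counter(clip_positions)
    return {pos: n for pos, n in stacks.items() if n >= min_support}
```

Requiring agreement across independent reads is what separates a genuine breakpoint signal from scattered alignment noise; a real caller would additionally weigh base quality and the clipped "tail" sequences themselves.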
SV-STAT accurately detects structural variation via alignment to reference-based assemblies. Source Code for Biology and Medicine, article 8.
Pub Date: 2016-04-15. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0054-x
Ana Gabriella de Oliveira Sardinha, Ceres Nunes de Resende Oyama, Armando de Mendonça Maroja, Ivan F Costa
Background: The aim of this paper is to provide a general discussion, an algorithm, and working programs for a deformation method that rapidly simulates biological tissue formed by fibers and fluid. To demonstrate the benefit of the software in clinical applications, we successfully used our program to deform 3D breast images acquired from patients with a 3D scanner in a real hospital environment.
Results: The method implements a quasi-static solution for elastic global deformations of objects. Each pair of surface vertices is connected and defines an elastic fiber. The set of all elastic fibers defines a mesh smaller than typical volumetric meshes, allowing simulation of complex objects with less computational effort. Behavior similar to that of the stress tensor is obtained from a volume-conservation equation that couples the 3D coordinates. Step by step, we show the computational implementation of this approach.
Conclusions: As an example, a 2D rectangle formed by only 4 vertices is solved and, for this simple geometry, all intermediate results are shown. Working computer routines implementing these ideas for general 3D objects, including a clinical application, are also provided.
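The fiber model can be sketched for the 4-vertex example as a plain quasi-static spring relaxation: every vertex pair is a fiber with a rest length, and vertices are iteratively nudged until all fibers are unstrained. This sketch omits the paper's volume-conservation equation, and the relaxation constants are illustrative, not taken from the paper.

```python
# Minimal 2D sketch of the "every vertex pair is an elastic fiber" idea.
# Omits the paper's volume-conservation step; constants are illustrative.

import itertools
import math

def relax(vertices, rest_lengths, k=0.5, steps=200):
    """Quasi-statically move vertices until each fiber (vertex pair)
    approaches its rest length. rest_lengths is keyed by (i, j), i < j."""
    pts = [list(p) for p in vertices]
    pairs = list(itertools.combinations(range(len(pts)), 2))
    for _ in range(steps):
        for (i, j) in pairs:
            dx = pts[j][0] - pts[i][0]
            dy = pts[j][1] - pts[i][1]
            d = math.hypot(dx, dy)
            if d == 0:
                continue
            # Displacement proportional to strain, split between endpoints.
            f = k * (d - rest_lengths[(i, j)]) / d
            pts[i][0] += 0.5 * f * dx; pts[i][1] += 0.5 * f * dy
            pts[j][0] -= 0.5 * f * dx; pts[j][1] -= 0.5 * f * dy
    return pts
```

Because all 6 pairwise fibers of the 4-vertex example are constrained, the relaxed shape is rigid up to translation and rotation, which is why so small a mesh suffices for this geometry.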
Implementation and clinical application of a deformation method for fast simulation of biological tissue formed by fibers and fluid. Source Code for Biology and Medicine, article 7.
Pub Date: 2016-04-11. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0053-y
Deena M A Gendoo, Benjamin Haibe-Kains
Background: Medulloblastoma (MB) is a highly malignant and heterogeneous brain tumour and the most common cause of cancer-related deaths in children. Increasing availability of genomic data over the last decade has resulted in improved human subtype classification methods and the parallel development of MB mouse models aimed at identifying subtype-specific disease origins and signaling pathways. Despite these advances, MB classification schemes remain inadequate for personalized prediction of MB subtypes for individual patient samples and across model systems. To address this issue, we developed the Medullo-Model to Subtypes (MM2S) classifier, a new method enabling classification of individual gene expression profiles from MB samples (patient samples, mouse models, and cell lines) against well-established molecular subtypes [Genomics 106:96-106, 2015]. We demonstrated the accuracy and flexibility of MM2S in the largest meta-analysis of human patients and mouse models to date. Here, we present a new functional package that provides an easy-to-use, fully documented implementation of the MM2S method, with additional functionality that allows users to obtain graphical and tabular summaries of MB subtype predictions for single samples and across sample replicates. The flexibility of the MM2S package promotes incorporation of MB predictions into large medulloblastoma-driven analysis pipelines, making this tool suitable for use by researchers.
Results: The MM2S package is applied in two case studies involving primary human patient samples as well as sample replicates of the GTML mouse model. We highlight functions useful for species-specific MB classification across individual samples and sample replicates, and emphasize the range of functions for deriving both single-sample and meta-centric views of MB predictions across samples and across MB subtypes.
Conclusions: Our MM2S package can be used to generate predictions without relying on an external web server or additional sources. Our open-source package facilitates and extends the MM2S algorithm in diverse computational and bioinformatics contexts. The package is available on CRAN at https://cran.r-project.org/web/packages/MM2S/, as well as on GitHub at https://github.com/DGendoo and https://github.com/bhklab.
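To illustrate the kind of single-sample subtype call such a classifier makes, here is a generic nearest-centroid sketch. MM2S itself is an R package with its own published method; the subtype names, gene names, and distance choice below are invented for the example and do not reproduce MM2S's algorithm.

```python
# Generic nearest-centroid subtype call, shown only to illustrate
# single-sample classification. NOT the MM2S algorithm; all names
# and values are invented.

def call_subtype(sample, centroids):
    """sample: {gene: expression}; centroids: {subtype: {gene: expression}}.
    Returns the subtype whose centroid is closest in Euclidean distance
    over the genes shared with the sample."""
    def dist(centroid):
        shared = sample.keys() & centroid.keys()
        return sum((sample[g] - centroid[g]) ** 2 for g in shared) ** 0.5
    return min(centroids, key=lambda s: dist(centroids[s]))
```

The key property this shares with the package described above is that each sample is classified on its own, with no need to renormalize against a cohort or call out to a web server.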
MM2S: personalized diagnosis of medulloblastoma patients and model systems. Source Code for Biology and Medicine, article 6.
Pub Date: 2016-04-02. DOI: 10.1186/s13029-016-0052-z
Fuquan Zhang
A flexible tool to plot a genomic map for single nucleotide polymorphisms. Source Code for Biology and Medicine.
Pub Date: 2016-03-28. eCollection Date: 2016-01-01. DOI: 10.1186/s13029-016-0050-1
Samson S Kiware, Tanya L Russell, Zacharia J Mtema, Alpha D Malishee, Prosper Chaki, Dickson Lwetoijera, Javan Chanda, Dingani Chinula, Silas Majambere, John E Gimnig, Thomas A Smith, Gerry F Killeen
Background: Standardized schemas, databases, and public data repositories are needed for studies of malaria vectors, which encompass a remarkably diverse array of designs and rapidly generate large data volumes, often in resource-limited tropical settings lacking specialized software or informatics support.
Results: Data from the majority of mosquito studies conformed to a generic schema, with data collection forms recording the experimental design, sorting of collections, details of sample pooling or subdivision, and additional observations. Generically applicable forms with standardized attribute definitions enabled rigorous, consistent data and sample management with generic software and minimal expertise. The forms are now used in 20 experiments and 8 projects by 15 users at 3 research and control institutes in 3 African countries, resulting in 11 peer-reviewed publications.
Conclusion: We have designed a generic data schema that can be used to develop paper-based or electronic data collection forms, depending on the availability of resources. We have developed paper-based forms that can collect data from the majority of entomological studies across multiple study areas using standardized data formats. Data recorded on these forms can be entered into, and linked within, any relational database software. These informatics tools are recommended because they save medical entomologists time, improve data quality, and ensure that data collected and shared across multiple studies are in standardized formats, thereby increasing research outputs.
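As a sketch of how such a generic schema might be realized in relational database software, the snippet below builds a small SQLite schema in which experiments contain collections and collections are sorted into samples. The table and column names are invented for illustration and are not the paper's actual schema.

```python
# Hypothetical relational rendering of a generic entomological schema:
# experiments contain collections; collections are sorted into samples.
# Table and column names are invented, not the paper's schema.

import sqlite3

def build_schema(conn):
    conn.executescript("""
        CREATE TABLE experiment (
            experiment_id INTEGER PRIMARY KEY,
            design        TEXT NOT NULL          -- experimental design notes
        );
        CREATE TABLE collection (
            collection_id INTEGER PRIMARY KEY,
            experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
            collected_on  TEXT,                  -- ISO date of collection
            trap_type     TEXT
        );
        CREATE TABLE sample (
            sample_id      INTEGER PRIMARY KEY,
            collection_id  INTEGER NOT NULL REFERENCES collection(collection_id),
            species        TEXT,
            num_mosquitoes INTEGER CHECK (num_mosquitoes >= 0)
        );
    """)
```

Because the forms share standardized attribute definitions, rows from any study can be loaded into the same few tables and linked through the foreign keys, which is what makes cross-study sharing straightforward.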
A generic schema and data collection forms applicable to diverse entomological studies of mosquitoes. Source Code for Biology and Medicine, article 4.
Pub Date: 2016-03-09. DOI: 10.1186/s13029-016-0049-7
Refat Sharmin, A. B. Islam
Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis. Source Code for Biology and Medicine.
Pub Date: 2016-02-16. DOI: 10.1186/s13029-016-0048-8
Vasanth R. Singan, J. Simpson
Implementation of the Rank-Weighted Co-localization (RWC) algorithm in multiple image analysis platforms for quantitative analysis of microscopy images. Source Code for Biology and Medicine.