Pub Date: 2014-09-20; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-12
Quinn S Wells, Eric Farber-Eger, Dana C Crawford
Background: Measures of cardiac structure and function are important human phenotypes that are associated with a range of clinical outcomes. Studying these traits in large populations can be time consuming and costly. Utilizing data from large electronic medical records (EMRs) is one possible solution to this problem. We describe the extraction and filtering of quantitative transthoracic echocardiographic data from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, a large, racially diverse, EMR-based cohort (n = 15,863).
Results: There were 6,076 echocardiography reports for 2,834 unique adult subjects. Missing data were uncommon, with over 90% of data points present. Data irregularities were primarily related to inconsistent use of measurement units and transcription errors. The reported filtering method requires manual review of very few data points (<1%), and filtered echocardiographic parameters are similar to published data from epidemiologic populations of similar ethnicity. Moreover, the cohort is comparable in size to, and in some cases larger than, community-based cohorts of similar race/ethnicity.
Conclusions: These results demonstrate that echocardiographic data can be efficiently extracted from EMRs, and suggest that EMR-based cohorts have the potential to make major contributions toward the study of epidemiologic and genotype-phenotype associations for cardiac structure and function in diverse populations.
Title: Extraction of echocardiographic data from the electronic medical record is a rapid and efficient method for study of cardiac structure and function.
Source: Journal of clinical bioinformatics 4:12; Pub Date: 2014-09-20; DOI: 10.1186/2043-9113-4-12
Aims: The aim of our study was to evaluate the relationship between four CFTR variations and the congenital bilateral absence of the vas deferens (CBAVD).
Methods: A systematic search of the literature databases was performed for case-control studies of CFTR variations and the risk of CBAVD. A total of 29 studies comprising 1139 controls and 1562 CBAVD patients were gathered for the meta-analyses of three commonly tested variations (5T, ΔF508 and M470V) with CBAVD.
Results: Our meta-analyses found significant associations between CBAVD and all three variations: 5T (P < 0.001, OR = 8.35, 95% CI = 6.68-10.43), M470V (P = 0.027, OR = 0.74, 95% CI = 0.60-0.91) and ΔF508 (P < 0.001, OR = 22.20, 95% CI = 7.49-65.79).
Conclusion: In the current study, we demonstrated a significant association between CFTR variations and CBAVD. Our results showed that the 5T variation was a risk factor for CBAVD in French, Spanish, Japanese, Chinese, Iranian, Indian, Mexican and Egyptian populations. CFTR ΔF508 was another important risk factor in Caucasians, including Slovenians, Canadians, Iranians, and Egyptians. In addition, M470V was a protective factor among French, Chinese, Italian and Iranian populations.
Title: Meta-analyses of 4 CFTR variants associated with the risk of the congenital bilateral absence of the vas deferens.
Authors: Xuting Xu, Jufen Zheng, Qi Liao, Huiqing Zhu, Hongyan Xie, Huijuan Shi, Shiwei Duan
Source: Journal of clinical bioinformatics 4:11; Pub Date: 2014-08-21; DOI: 10.1186/2043-9113-4-11
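The pooled odds ratios and confidence intervals in the CFTR abstract above come from combining per-study 2x2 tables. The sketch below shows one common way to do that, fixed-effect inverse-variance pooling of log odds ratios. It is an illustration only: the abstract does not say which pooling model the authors used, and the function names and study counts here are hypothetical.

```python
import math

def log_or_and_var(a, b, c, d):
    """Log odds ratio and its variance for one 2x2 table.

    a, b = variant carriers / non-carriers among cases
    c, d = variant carriers / non-carriers among controls
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_or(tables, z=1.96):
    """Fixed-effect inverse-variance pooled OR with a 95% CI (z = 1.96)."""
    total_weight, weighted_sum = 0.0, 0.0
    for table in tables:
        y, v = log_or_and_var(*table)
        total_weight += 1 / v        # weight = inverse of the variance
        weighted_sum += y / v
    mean = weighted_sum / total_weight
    se = math.sqrt(1 / total_weight)
    return math.exp(mean), math.exp(mean - z * se), math.exp(mean + z * se)

# Hypothetical counts for three studies (not taken from the paper).
studies = [(20, 80, 5, 95), (15, 60, 6, 70), (30, 90, 10, 110)]
estimate, lower, upper = pooled_or(studies)
print(f"pooled OR = {estimate:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An OR above 1 with a CI excluding 1 corresponds to a risk factor (as reported for 5T and ΔF508), and an OR below 1 with a CI excluding 1 corresponds to a protective factor (as reported for M470V).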
Pub Date: 2014-07-04; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-10
Nicolae Todor, Irina Todor, Gavril Săplăcan
Background: A linear combination of variables is an attractive method in many medical analyses that aim to build a score for classifying patients. For ROC curves, the most common problem is to identify the linear combination that maximizes the area under the curve (AUC). This problem is completely solved when normality assumptions are met. Without the assumption of normality, search algorithms are usually avoided because it is accepted that AUC has to be evaluated $n^d$ times, where $n$ is the number of distinct observations and $d$ is the number of variables.
Methods: For $d = 2$, using particularities of the AUC formula, we describe an algorithm that lowers the number of AUC evaluations from $n^2$ to $n(n-1) + 1$. For $d > 2$, our proposed solution is an approximate method that evaluates AUC at equidistant points on the unit sphere in $\mathbb{R}^d$.
Results: The algorithms were applied to data from our lab to predict response to treatment from a set of molecular markers in cervical cancer patients. A simulation was added to evaluate the strength of our algorithms.
Conclusions: When normality cannot be assumed, the presented algorithms are feasible. With many variables, computation time may increase but remains acceptable.
Title: Tools to identify linear combination of prognostic factors which maximizes area under receiver operator curve.
Source: Journal of clinical bioinformatics 4:10; Pub Date: 2014-07-04; DOI: 10.1186/2043-9113-4-10
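To make the idea of searching directions on the unit sphere concrete, here is a minimal sketch for two variables, where the sphere is the unit circle. It scans equally spaced angles and keeps the direction with the highest empirical AUC. This is a brute-force grid for illustration, not the authors' exact algorithm for two variables; the function names and toy data are assumptions.

```python
import math

def auc(scores, labels):
    """Empirical AUC via the Mann-Whitney statistic (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def best_direction(x1, x2, labels, n_angles=360):
    """Scan equidistant directions on the unit circle.

    Returns (best AUC, angle in radians) for the score cos(t)*x1 + sin(t)*x2.
    """
    best = (-1.0, 0.0)
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        a, b = math.cos(theta), math.sin(theta)
        scores = [a * u + b * v for u, v in zip(x1, x2)]
        best = max(best, (auc(scores, labels), theta))
    return best

# Hypothetical toy data: neither marker alone separates the two classes,
# but their sum does.
x1 = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
x2 = [3.0, 2.0, 1.0, 4.0, 3.0, 2.0]
labels = [0, 0, 0, 1, 1, 1]

best_auc, best_theta = best_direction(x1, x2, labels)
print(f"single-marker AUC (x1): {auc(x1, labels):.3f}")
print(f"best combined AUC: {best_auc:.3f} at angle {best_theta:.3f} rad")
```

Because AUC depends only on the ranking of scores, rescaling the coefficient vector by a positive constant does not change it, which is why restricting the search to unit-length directions loses nothing.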
Pub Date: 2014-06-10; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-9
Jitsuki Sawamura, Shigeru Morishita, Jun Ishigooka
The Stevens classification of levels of measurement involves four types of scale: "Nominal", "Ordinal", "Interval" and "Ratio". This classification has been used widely in medical fields and has played an important role in the composition and interpretation of scales. With this classification, levels of measurement appear organized and validated. However, a group theory-like systematization beckons as an alternative because of its logical consistency and unexceptional applicability in the natural sciences, and it may offer great advantages in clinical medicine. According to this viewpoint, the Stevens classification is reformulated within an abstract algebra-like scheme: 'Abelian modulo additive group' for "Ordinal scale" accompanied with 'zero', 'Abelian additive group' for "Interval scale", and 'field' for "Ratio scale". Furthermore, a vector-like display arranges a mixture of schemes describing the assessment of patient states. With this vector-like notation, data mining and data-set combination are possible on a higher abstract structure level based upon a hierarchical-cluster form. Using simple examples, we show that operations acting on the corresponding mixed schemes of this display allow for a sophisticated means of classifying, updating, monitoring, and prognosis, where better data mining, data usage and efficacy are expected.
Title: Interpretation for scales of measurement linking with abstract algebra.
Source: Journal of clinical bioinformatics 4:9; Pub Date: 2014-06-10; DOI: 10.1186/2043-9113-4-9
Pub Date: 2014-05-23; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-8
Mohammad Ali Moni, Pietro Liò
Background: The diagnosis of comorbidities, which refers to the coexistence of different acute and chronic diseases, is difficult due to the extreme specialisation of modern physicians. We envisage that software dedicated to comorbidity diagnosis could be an effective aid to health practice.
Results: We have developed an R package, comoR, to compute novel estimators of disease comorbidity associations. Starting from a patient's initial diagnosis and genetic and clinical data, the software identifies the risk of disease comorbidity. It then provides a pipeline with different causal inference packages (e.g. pcalg, qtlnet) to predict the causal relationships between diseases. It also provides a pipeline with network regression and survival analysis tools (e.g. Net-Cox, rbsurv) to predict patient survival probability more accurately. The input of this software is the initial diagnosis for a patient, and the output provides evidence of disease comorbidity mapping.
Conclusions: The functions of comoR offer flexibility for diagnostic applications that predict disease comorbidities, and can be easily integrated into high-throughput and clinical data analysis pipelines.
Title: comoR: a software for disease comorbidity risk assessment.
Source: Journal of clinical bioinformatics 4:8; Pub Date: 2014-05-23; DOI: 10.1186/2043-9113-4-8
Pub Date: 2014-04-25; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-7
Yu Teng, Nan Kong, Wanzhu Tu
Background: Chlamydial infection is a common bacterial sexually transmitted infection worldwide, caused by C. trachomatis. The screening for C. trachomatis has been proven to be successful. However, such success is not fully realized through tailoring the recommended screening strategies for different age groups. This is partly due to the knowledge gap in understanding how the infection is correlated with age. In this paper, we estimate age-dependent risks of acquiring C. trachomatis by adolescent women via unprotected heterosexual acts.
Methods: We develop a time-varying Markov state-transition model and compute the incidences of chlamydial infection at discrete age points by simulating the state-transition model with candidate per-encounter acquisition risks and sampled numbers of unit-time unprotected coital events at different age points. We solve an optimization problem to identify the age-dependent estimates that offer the closest matches to the observed infection incidences. We also investigate the impact of antimicrobial treatment effectiveness on the parameter estimates and the differences between the acquisition risks for the first-time infections and repeated infections.
Results: Our case study supports the beliefs that age is an inverse predictor of C. trachomatis transmission and that protective immunity developed after initial infection is only partial.
Conclusions: Our modeling method offers a flexible and expandable platform for investigating STI transmission.
Title: Estimating age-dependent per-encounter chlamydia trachomatis acquisition risk via a Markov-based state-transition model.
Source: Journal of clinical bioinformatics 4:7; Pub Date: 2014-04-25; DOI: 10.1186/2043-9113-4-7
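The link between a per-encounter acquisition risk and the incidence at each age point can be sketched with a heavily simplified two-state (susceptible/infected) chain. The code below is an assumption-laden illustration, not the authors' model: it treats encounters as independent, uses a single hypothetical clearance probability for treatment, and fits one age point by a plain grid search. All names and numbers are hypothetical.

```python
def infection_prob(p, n):
    """Probability of at least one acquisition in n independent encounters,
    each with per-encounter risk p."""
    return 1.0 - (1.0 - p) ** n

def simulate(per_encounter_risk, encounters, clear_prob=1.0):
    """Propagate a susceptible/infected chain across discrete age points.

    per_encounter_risk[i] and encounters[i] apply to age point i.
    clear_prob is the (assumed) probability that an infection is cleared
    by treatment before the next age point.
    Returns the incidence (new infections per person) at each age point.
    """
    susceptible, infected = 1.0, 0.0
    incidence = []
    for p, n in zip(per_encounter_risk, encounters):
        new_cases = susceptible * infection_prob(p, n)
        cleared = infected * clear_prob
        susceptible += cleared - new_cases
        infected += new_cases - cleared
        incidence.append(new_cases)
    return incidence

def fit_risk(observed_incidence, n, candidates):
    """Pick the candidate risk whose predicted incidence is closest
    (squared error) to the observed incidence at one age point."""
    return min(candidates,
               key=lambda p: (infection_prob(p, n) - observed_incidence) ** 2)

# Hypothetical inputs: risk declining with age, encounters per unit time.
risks = [0.10, 0.08, 0.06]
acts = [5, 10, 10]
print(simulate(risks, acts))
print(fit_risk(0.41, 5, [0.05, 0.10, 0.15]))
```

Setting `clear_prob` below 1.0 is a crude way to explore how imperfect treatment effectiveness would shift the fitted risks, which is one of the sensitivities the abstract mentions.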
Pub Date: 2014-04-16; eCollection Date: 2014-01-01; DOI: 10.1186/2043-9113-4-6
Cyprien Mbogning, Hervé Perdry, Wilson Toussile, Philippe Broët
Background: Dissecting the genomic spectrum of clinical disease entities is a challenging task. Recursive partitioning (or classification trees) methods provide powerful tools for exploring complex interplay among genomic factors, with respect to a main factor, that can reveal hidden genomic patterns. To take confounding variables into account, the partially linear tree-based regression (PLTR) model has been recently published. It combines regression models and tree-based methodology. It is however computationally burdensome and not well suited for situations for which a large number of exploratory variables is expected.
Methods: We developed a novel procedure that represents an alternative to the original PLTR procedure, and considered different selection criteria. A simulation study with different scenarios has been performed to compare the performances of the proposed procedure to the original PLTR strategy.
Results: The proposed procedure with a Bayesian Information Criterion (BIC) performed well in detecting the hidden structure compared with the original procedure. The novel procedure was used to analyze patterns of copy-number alterations in lung adenocarcinomas with respect to Kirsten Rat Sarcoma Viral Oncogene Homolog gene (KRAS) mutation status, while controlling for a cohort effect. Results highlight two subgroups of pure or nearly pure wild-type KRAS tumors with particular copy-number alteration patterns.
Conclusions: The proposed procedure with the BIC represents a powerful and practical alternative to the original procedure. Our procedure performs well in a general framework and is simple to implement.
Title: A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities.
Source: Journal of clinical bioinformatics 4:6; Pub Date: 2014-04-16; DOI: 10.1186/2043-9113-4-6
Background: A comprehensive view on all relevant genomic data is instrumental for understanding the complex patterns of molecular alterations typically found in cancer cells. One of the most effective ways to rapidly obtain an overview of genomic alterations in large amounts of genomic data is the integrative visualization of genomic events.
Results: We developed FISH Oracle 2, a web server for the interactive visualization of different kinds of downstream processed genomics data typically available in cancer research. A powerful search interface and a fast visualization engine provide a highly interactive visualization for such data. High quality image export enables life scientists to easily communicate their results. Comprehensive data administration allows users to keep track of the available data sets. We applied FISH Oracle 2 to published data and found evidence that, in colorectal cancer cells, the gene TTC28 may be inactivated in two different ways, a fact that has not been published before.
Conclusions: The interactive nature of FISH Oracle 2 and the possibility to store, select and visualize large amounts of downstream processed data support life scientists in generating hypotheses. The export of high quality images supports explanatory data visualization, simplifying the communication of new biological findings. A FISH Oracle 2 demo server and the software are available at http://www.zbh.uni-hamburg.de/fishoracle.
Title: FISH Oracle 2: a web server for integrative visualization of genomic data in cancer research.
Authors: Malte Mader, Ronald Simon, Stefan Kurtz
Source: Journal of clinical bioinformatics 4(1):5; Pub Date: 2014-03-31; DOI: 10.1186/2043-9113-4-5
Bin Liu, Xiaotian Zhang, Honglan Huang, Ying Zhang, Fengfeng Zhou, Guoqing Wang
Different subtypes of Mycobacterium tuberculosis (MTB) may induce diverse severe human infections, and some of their symptoms are similar to those caused by other pathogens, e.g. nontuberculous mycobacteria (NTM). Determining mycobacterium subtypes therefore facilitates the effective control of MTB infection and proliferation. This study uses a novel DNA barcoding visualization method for molecular typing of 17 mycobacteria genomes published in the NCBI prokaryotic genome database. Three mycobacterium genes (Rv0279c, Rv3508 and Rv3514) from the PE/PPE family of MTB were found to best represent the inter-strain pathogenetic variations. An accurate and fast MTB substrain typing method was proposed based on the combination of these three biomarker genes and the 16S rRNA gene. The protocol used in this study to establish a bacterial substrain typing system may also be applied to other pathogens.
Title: A novel molecular typing method of Mycobacteria based on DNA barcoding visualization.
Source: Journal of clinical bioinformatics 4(1):4; Pub Date: 2014-02-20; DOI: 10.1186/2043-9113-4-4
Background: Fusion genes have been recognized to play key roles in oncogenesis. Although many techniques have been developed for genome-wide analysis of fusion genes, a more efficient method is desired.
Results: We introduce a new method for detecting novel fusion genes that combines the GeneChip Exon Array, which enables exon expression analysis on a whole-genome scale, with TAIL-PCR. To screen for genes with abnormal exon expression profiles, we developed a computational program and confirmed that it could identify fusion partner genes using Exon Array data from T-cell acute lymphocytic leukemia (T-ALL) cell lines. The T-ALL cell lines ALL-SIL, BE13 and LOUCY have been reported to harbor the fusion genes NUP214-ABL1, NUP214-ABL1 and SET-NUP214, respectively. The program extracted candidate genes with abnormal exon expression profiles: 1 gene in ALL-SIL, 1 gene in BE13, and 2 genes in LOUCY. The known fusion partner gene NUP214 was among the candidates in ALL-SIL and LOUCY. We therefore applied the program to the detection of fusion partner genes in other tumors. To discover novel fusion genes, we examined 24 breast cancer cell lines and 20 pancreatic cancer cell lines. This yielded 20 and 23 candidate genes for the breast and pancreatic cancer cell lines, respectively, and seven genes were selected as final candidates based on information from the EST database, comparison with normal cell samples and visual inspection of exon expression profiles. Fusion partners for the final candidate genes were sought by TAIL-PCR, and three novel fusion genes were identified.
Conclusions: The usefulness of our detection method was confirmed. Applying this method to more samples is expected to allow further fusion genes to be identified.
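The abstract does not say how the screening program decides that an exon expression profile is abnormal. As a rough illustration only, the sketch below assumes one plausible signal: a gene disrupted by a fusion may show an abrupt shift in expression between its 5' and 3' exons. The function names, the `min_side` and `threshold` parameters, and the toy profiles are all hypothetical and are not taken from the study.

```python
from statistics import mean


def best_breakpoint(exon_values, min_side=2):
    """Find the split of a 5'->3' exon profile with the largest mean shift.

    exon_values: per-exon expression values ordered 5' to 3' (for example,
    log2 signal of a tumor sample minus a normal reference).
    Returns (k, score): exons [0, k) form one side, exons [k, n) the other,
    and score is the absolute difference between the two side means.
    Returns (None, 0.0) if the gene has too few exons or no shift at all.
    """
    n = len(exon_values)
    best_k, best_score = None, 0.0
    # Require at least `min_side` exons on each side of the candidate split.
    for k in range(min_side, n - min_side + 1):
        score = abs(mean(exon_values[:k]) - mean(exon_values[k:]))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


def screen_genes(profiles, threshold=1.0):
    """Keep genes whose best split reaches `threshold` (a hypothetical cutoff)."""
    hits = {}
    for gene, values in profiles.items():
        k, score = best_breakpoint(values)
        if k is not None and score >= threshold:
            hits[gene] = (k, score)
    return hits


# Toy, made-up profiles (not data from the study).
toy_profiles = {
    "GENE_A": [0.1, 0.0, -0.1, 0.0, 2.0, 2.1, 1.9, 2.0],  # step after exon 4
    "GENE_B": [0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1],  # flat profile
}
candidates = screen_genes(toy_profiles)
print(candidates)
```

In this toy input only GENE_A is flagged, with the split placed between its fourth and fifth exons. A real screen would also need probe-level normalization and a comparison against normal samples, which the abstract mentions but this sketch leaves out.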
{"title":"Development of detection method for novel fusion gene using GeneChip exon array.","authors":"Yusaku Wada, Masaaki Matsuura, Minoru Sugawara, Masaru Ushijima, Satoshi Miyata, Koichi Nagasaki, Tetsuo Noda, Yoshio Miki","doi":"10.1186/2043-9113-4-3","DOIUrl":"https://doi.org/10.1186/2043-9113-4-3","journal":{"name":"Journal of clinical bioinformatics","volume":"4 1","pages":"3"},"publicationDate":"2014-02-18","publicationTypes":"Journal Article"}