A Data Mining Approach for Biomarker Discovery Using Transcriptomics in Endometriosis
Authors: Sadia Akter, Dong Xu, S. Nagel, T. Joshi
Venue: IEEE International Conference on Bioinformatics and Biomedicine, pages 969-972
Publication date: 2018-12-01
DOI: 10.1109/BIBM.2018.8621150 (https://doi.org/10.1109/BIBM.2018.8621150)
Citation count: 4
Abstract
Endometriosis is a complex and common gynecological disorder affecting 5-10% of women of reproductive age. Because definitive diagnostic symptoms are lacking and the available diagnostic procedures are expensive and invasive, the average time to diagnosis can be up to 10 years. This diagnostic delay has a significant impact on endometriosis patients, and earlier diagnosis is desirable to improve quality of life. In this study, we analyzed 38 RNA-seq transcriptomics samples (16 endometriosis and 22 controls) and identified genomic signatures as potential biomarkers. We applied innovative data mining approaches combining normalization techniques, a generalized linear model (GLM) for identifying differentially expressed genes, and a decision tree algorithm for constructing models with higher predictive performance. A total of 5 candidate genes were identified as potential biomarkers of endometriosis; under leave-one-out cross-validation, they outperformed the results from the Biosigner tool. Our data mining approach can successfully distinguish endometriosis patients from non-endometriosis controls and could potentially be used as a prediction-based diagnostic tool for other diseases in the future.
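To make the evaluation scheme concrete, the following is a minimal sketch of leave-one-out cross-validation around a tree-style classifier. It is not the authors' pipeline: the data are synthetic (only the 16/22 class split and the count of 5 features are taken from the abstract), the classifier is a depth-1 decision tree (a "stump") standing in for the unspecified decision tree algorithm, and all function names (`fit_stump`, `predict_stump`, `loocv_accuracy`, `make_synthetic`) are hypothetical. Normalization and GLM-based gene selection are omitted.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 decision tree: choose the feature, threshold and
    orientation with the fewest training misclassifications."""
    best = None  # (errors, feature index, threshold, label predicted above threshold)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[j] > t else 1 - label_above) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, label_above)
    if best is None:
        raise ValueError("no split possible: every feature is constant")
    return best[1:]


def predict_stump(model, row):
    """Predict a 0/1 label for one sample with a fitted stump."""
    j, t, label_above = model
    return label_above if row[j] > t else 1 - label_above


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_stump(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict_stump(model, X[i]) == y[i]
    return correct / len(X)


def make_synthetic(n_cases=16, n_controls=22, n_genes=5, seed=0):
    """Synthetic expression values (an assumption, not real data):
    feature 0 is shifted upward in cases, the others are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_cases), (0, n_controls)):
        for _ in range(n):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += 4.0 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = loocv_accuracy(X, y)
print(f"LOOCV accuracy on synthetic data: {accuracy:.2f}")
```

With only 38 samples, leave-one-out uses 37 samples for training in every fold, which is a common reason to prefer it over a single train/test split. In a real analysis, any gene selection step would normally be repeated inside each fold as well, since selecting genes on the full dataset before cross-validation tends to give optimistic accuracy estimates.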