Pub Date: 2015-07-30
DOI: 10.4172/2090-4924.1000113
K. Borthwick, D. Smelser, Jonathan A. Bock, J. Elmore, Evan J. Ryer, Z. Ye, J. Pacheco, D. Carrell, M. Michalkiewicz, William K. Thompson, Jyotishman Pathak, S. Bielinski, J. Denny, J. Linneman, P. Peissig, A. Kho, O. Gottesman, Harpreet Parmar, I. Kullo, C. McCarty, E. Böttinger, E. Larson, G. Jarvik, J. Harley, T. Bajwa, D. P. Franklin, D. Carey, H. Kuivaniemi, G. Tromp
Background and objective: We designed an algorithm to identify abdominal aortic aneurysm cases and controls from electronic health records, to be shared and executed within the “electronic Medical Records and Genomics” (eMERGE) Network.

Materials and methods: The algorithm was scripted in Structured Query Language. It uses “Current Procedural Terminology” and “International Classification of Diseases” codes, together with demographic and encounter data, to classify individuals as case, control, or excluded. The algorithm was validated by blinded manual chart review at three eMERGE Network sites and one non-eMERGE Network site. Validation comprised evaluation of an equal number of predicted cases and controls selected at random from the algorithm predictions. After validation at the three eMERGE Network sites, the remaining eMERGE Network sites performed verification only. Finally, the algorithm was implemented as a workflow in the Konstanz Information Miner, which represented the logic graphically while retaining intermediate data for inspection at each node. The algorithm was configured to be independent of specific access to data and was exportable (without data) to other sites.

Results: The algorithm demonstrated positive predictive values (PPV) of 92.8% (CI: 86.8-96.7) and 100% (CI: 97.0-100) for cases and controls, respectively. It also performed well outside the eMERGE Network. Implementing the transportable executable algorithm as a Konstanz Information Miner workflow required much less effort than implementing it from pseudocode, and ensured that the logic was as intended.

Discussion and conclusion: This ePhenotyping algorithm identifies abdominal aortic aneurysm cases and controls from the electronic health record with the high case and control PPV necessary for research purposes, can be disseminated easily, and can be applied to high-throughput genetic and other studies.
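The PPV figures in the Results come from comparing algorithm predictions with chart review. As a rough illustration of that arithmetic, the sketch below computes a PPV and an exact (Clopper-Pearson) binomial confidence interval using only the Python standard library. The abstract does not state which interval method, confidence level, or sample sizes were used, so the method, the default alpha of 0.05, the function names, and the counts in the usage example are all assumptions made for illustration, not the authors' procedure or data.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_root(f, increasing: bool, iters: int = 60) -> float:
    """Locate the root of a monotone function f on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        # For an increasing f the root lies to the right while f(mid) < 0;
        # for a decreasing f it lies to the right while f(mid) > 0.
        if (f(mid) < 0) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ppv_with_ci(confirmed: int, predicted: int, alpha: float = 0.05):
    """PPV = confirmed / predicted, with an exact Clopper-Pearson interval.

    `confirmed` is the number of algorithm-predicted positives that chart
    review confirmed; `predicted` is the number of predictions reviewed.
    The interval method and alpha are assumptions, not taken from the paper.
    """
    if not 0 <= confirmed <= predicted or predicted == 0:
        raise ValueError("need 0 <= confirmed <= predicted and predicted > 0")
    ppv = confirmed / predicted
    if confirmed == 0:
        lower = 0.0
    else:
        # Lower bound solves P(X >= confirmed | p) = alpha / 2.
        lower = _bisect_root(
            lambda p: 1 - binom_cdf(confirmed - 1, predicted, p) - alpha / 2,
            increasing=True,
        )
    if confirmed == predicted:
        upper = 1.0
    else:
        # Upper bound solves P(X <= confirmed | p) = alpha / 2.
        upper = _bisect_root(
            lambda p: binom_cdf(confirmed, predicted, p) - alpha / 2,
            increasing=False,
        )
    return ppv, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT the study's data.
    for label, confirmed, predicted in [("cases", 90, 100), ("controls", 100, 100)]:
        ppv, lo, hi = ppv_with_ci(confirmed, predicted)
        print(f"{label}: PPV={ppv:.1%} (CI: {lo:.1%}-{hi:.1%})")
```

When every reviewed prediction is confirmed, the upper bound is 1 and the lower bound reduces to (alpha/2)^(1/n), which is why a PPV of 100% can still carry a lower limit below 100%.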
Title: ePhenotyping for Abdominal Aortic Aneurysm in the Electronic Medical Records and Genomics (eMERGE) Network: Algorithm Development and Konstanz Information Miner Workflow
Journal: International journal of biomedical data mining
Volume: 4 1