pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis.

Neuroinformatics (IF 3.1; Zone 4, Medicine; JCR Q2, Computer Science, Interdisciplinary Applications). Pub Date: 2022-04-01. Epub Date: 2022-01-03. DOI: 10.1007/s12021-021-09553-4
Cailey I Kerley, Shikha Chaganti, Tin Q Nguyen, Camilo Bermudez, Laurie E Cutting, Lori L Beason-Held, Thomas Lasko, Bennett A Landman
Neuroinformatics 20(2): 483-505. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9250547/pdf/nihms-1799852.pdf

Abstract

Along with the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become prominent, first-line methods of analysis for uncovering the secrets of EMR. Despite this recent growth, there is a lack of approachable software tools for conducting these analyses on large-scale EMR cohorts. In this article, we introduce pyPheWAS, an open-source Python package for conducting PheDAS and related analyses. This toolkit includes 1) data preparation, such as cohort censoring and age-matching; 2) traditional PheDAS analysis of ICD-9 and ICD-10 billing codes; 3) PheDAS analysis applied to a novel EMR phenotype mapping: current procedural terminology (CPT) codes; and 4) novelty analysis of significant disease-phenotype associations found through PheDAS. The pyPheWAS toolkit is approachable and comprehensive, encapsulating data preparation through result visualization within a simple command-line interface. The toolkit is designed for the ever-growing scale of available EMR data, with the ability to analyze cohorts of 100,000+ patients in less than 2 hours. Through a case study of Down Syndrome and other intellectual developmental disabilities, we demonstrate the ability of pyPheWAS to discover both known and potentially novel disease-phenotype associations across different experiment designs and disease groups. The software and user documentation are available in open source at https://github.com/MASILab/pyPheWAS.
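
As a rough illustration of what a phenome-disease association scan involves, the following Python sketch tests each phenotype code for association with case/control status and applies a Bonferroni threshold. It is not the pyPheWAS implementation or API: the abstract does not describe the statistical model, so a two-sided Fisher exact test on a 2x2 table is substituted here, and the function names (`fisher_two_sided`, `phedas_scan`), the subject record layout (`is_case`, `codes`), and the phenotype labels are all hypothetical.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def phedas_scan(subjects, alpha=0.05):
    """Test every phenotype code against case status (illustrative only).

    `subjects` is a list of dicts with hypothetical keys:
    "is_case" (bool) and "codes" (set of phenotype labels).
    """
    cases = [s for s in subjects if s["is_case"]]
    controls = [s for s in subjects if not s["is_case"]]
    phenotypes = sorted(set().union(*(s["codes"] for s in subjects)))
    threshold = alpha / max(len(phenotypes), 1)  # Bonferroni correction

    results = []
    for ph in phenotypes:
        a = sum(ph in s["codes"] for s in cases)      # cases with phenotype
        b = len(cases) - a                            # cases without
        c = sum(ph in s["codes"] for s in controls)   # controls with phenotype
        d = len(controls) - c                         # controls without
        # Haldane-Anscombe correction keeps the odds ratio finite for zero cells.
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        p = fisher_two_sided(a, b, c, d)
        results.append({"phenotype": ph, "odds_ratio": odds_ratio,
                        "p": p, "significant": p < threshold})
    return sorted(results, key=lambda r: r["p"])
```

Calling `phedas_scan` on a list of subject records returns one result per phenotype, sorted by p-value. A real analysis of the kind the abstract describes would also need the data preparation steps it lists (censoring, age-matching) and covariate adjustment, none of which this sketch attempts.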

Source journal
Neuroinformatics (Medicine; Computer Science, Interdisciplinary Applications)
CiteScore: 6.00
Self-citation rate: 6.70%
Articles published: 54
Review time: 3 months
Journal description: Neuroinformatics publishes original articles and reviews with an emphasis on data structure and software tools related to analysis, modeling, integration, and sharing in all areas of neuroscience research. The editors particularly invite contributions on: (1) Theory and methodology, including discussions on ontologies, modeling approaches, database design, and meta-analyses; (2) Descriptions of developed databases and software tools, and of the methods for their distribution; (3) Relevant experimental results, such as reports accompanied by the release of massive data sets; (4) Computational simulations of models integrating and organizing complex data; and (5) Neuroengineering approaches, including hardware, robotics, and information theory studies.