Data Exploration in Secondary Use of Healthcare Data
Jian Wang
2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW), vol. 43 1, pp. 658-658
Published 2011-11-12
DOI: 10.1109/BIBM.2011.129 (https://doi.org/10.1109/BIBM.2011.129)
Citation count: 0
Abstract
Real-world data sets (as opposed to data from randomized, controlled clinical trials) are becoming increasingly available from the healthcare industry. Large databases from EMRs/EHRs, insurance claims, pharmacy records, disease registries, etc., present unique challenges when they are used to support pharmaceutical R&D activities. Such "secondary use" of healthcare data usually starts with an exploratory phase in which the researcher takes a high-level view of the available data and starts to "connect the dots". Data exploration is a highly dynamic process: exploratory paths change frequently, sometimes converging, other times diverging, and often resulting in dead ends. Only a small subset of exploratory results ends up being formally analyzed to derive quantitative insights. Because of this dynamic nature of data exploration, it is critical that the researchers who generate hypotheses, the domain experts, can directly explore the available data space. Data exploration on large healthcare data sets is often a bottleneck because these data sets tend to be poorly understood in terms of their quality, completeness, consistency, etc. We will discuss this emerging landscape, focusing on case studies that illustrate the powerful convergence of real-world data and the technological advancements that help leverage this data.