Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records

Open Bioinformatics Journal (Q3, Computer Science) · Pub Date: 2017-07-31 · DOI: 10.2174/1875036201709010001

O. Montvida, Ognjen Arandjelovic, E. Reiner, S. Paul

Open Bioinformatics Journal, vol. 10, no. 1, pp. 1-15.

Citations: 13

Abstract

Electronic Medical Records (EMRs) from primary/ambulatory care systems present a new and promising source of information for clinical and translational research.

This study addresses the methodological and computational challenges of extracting reliable medication information from raw data, which are often complex, incomplete and erroneous. It also assesses whether the use of specific chaining fields of medication information further improves data quality.

Guided by a range of challenges associated with missing and internally inconsistent data, we introduce two methods for the robust extraction of patient-level medication data. The first method relies on chaining fields to estimate the duration of treatment (“chaining”), while the second disregards chaining fields and relies on the chronology of records (“continuous”). The Centricity EMR database was used to estimate treatment duration with both methods for two drugs widely prescribed to patients with type 2 diabetes: insulin and glucagon-like peptide-1 receptor agonists.

At the individual patient level, the “chaining” approach could identify treatment alterations longitudinally and produced more robust estimates of treatment duration for individual drugs, while the “continuous” method was unable to capture these dynamics. At the population level, both methods produced similar estimates of average treatment duration; however, notable differences were observed at the individual patient level.

The proposed algorithms explicitly identify and handle erroneous or missing longitudinal entries and estimate the duration of treatment with the specific drug(s) of interest, which makes them a valuable tool for future EMR-based clinical and pharmaco-epidemiological studies. To improve the accuracy of studies based on real-world data, implementing chaining fields of medication information is recommended.
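The abstract does not specify the record layout or the exact rules of either method, so the following is only a minimal sketch of the contrast it describes, not the authors' algorithm. The `Rx` record, its `prev_id` field (standing in for a chaining field that links a prescription to the one it continues), the censoring date, and both function names are hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Rx:
    rec_id: int
    drug: str
    start: date
    stop: Optional[date]    # stop date may be missing in raw data
    prev_id: Optional[int]  # hypothetical chaining field: record this one continues


def continuous_duration(records: List[Rx], drug: str, censor: date) -> int:
    """Ignore chaining fields: days from the earliest start to the latest stop."""
    rows = [r for r in records if r.drug == drug]
    if not rows:
        return 0
    first = min(r.start for r in rows)
    last = max((r.stop or censor) for r in rows)  # missing stop -> censoring date
    return (last - first).days


def chaining_durations(records: List[Rx], drug: str, censor: date) -> List[int]:
    """Use chaining fields: one duration (in days) per chain of linked records."""
    rows = {r.rec_id: r for r in records if r.drug == drug}
    continued = {r.prev_id for r in rows.values() if r.prev_id in rows}
    durations = []
    for tail in rows.values():
        if tail.rec_id in continued:
            continue  # a later record continues this one, so it is not a chain end
        end = tail.stop or censor
        cur, seen = tail, set()
        # Walk back to the head of the chain, guarding against cyclic links.
        while cur.prev_id in rows and cur.prev_id not in seen:
            seen.add(cur.rec_id)
            cur = rows[cur.prev_id]
        durations.append((end - cur.start).days)
    return sorted(durations)


# Toy data: two linked insulin records in 2010, then an unlinked restart in 2012.
records = [
    Rx(1, "insulin", date(2010, 1, 1), date(2010, 4, 1), None),
    Rx(2, "insulin", date(2010, 4, 1), date(2010, 7, 1), 1),
    Rx(3, "insulin", date(2012, 1, 1), None, None),
]
censor = date(2012, 3, 1)

print(continuous_duration(records, "insulin", censor))  # 790
print(chaining_durations(records, "insulin", censor))   # [60, 181]
```

On this toy data the chronology-only estimate spans the 18-month gap and reports a single 790-day treatment, while the chain-based estimate separates it into two episodes of 181 and 60 days. This is one way the individual-level differences mentioned in the abstract could arise.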


Source journal: Open Bioinformatics Journal (Computer Science: Computer Science (miscellaneous))

CiteScore: 2.40

Self-citation rate: 0.00%


Journal description: The Open Bioinformatics Journal is an Open Access online journal that publishes research articles, reviews/mini-reviews, letters, clinical trial studies and guest-edited single-topic issues in all areas of bioinformatics and computational biology. The coverage includes biomedicine, focusing on large data acquisition, analysis and curation; computational and statistical methods for the modeling and analysis of biological data; and descriptions of new algorithms and databases. The Open Bioinformatics Journal, a peer-reviewed journal, is an important and reliable source of current information on developments in the field. The emphasis is on publishing quality articles rapidly and making them freely available worldwide.