Temporal weighting of clinical events in electronic health records for pharmacovigilance
Jing Zhao
2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2015-11-09
DOI: https://doi.org/10.1109/BIBM.2015.7359710
Citations: 16
Abstract
Electronic health records (EHRs) have recently been identified as a potentially valuable source for monitoring adverse drug events (ADEs). However, ADEs are heavily under-reported in EHRs. Using machine learning algorithms to automatically detect patients who should have had ADEs reported in their health records is an efficient and effective solution. One challenge to that end is how to account for temporality when using clinical events, which are time-stamped in EHRs, as features for machine learning algorithms to exploit. Previous research on this topic suggests that representing EHR data as a bag of temporally weighted clinical events is promising; however, how to assign the weights optimally remains unexplored. In this study, nine different temporal weighting strategies are proposed and evaluated using data extracted from a Swedish EHR database, and the predictive performance of models constructed with the random forest learning algorithm is compared across strategies. Moreover, variable importance is analyzed to understand why one weighting strategy is favored over another, and which clinical events change most in importance across the weighting strategies. The results show that the choice of weighting strategy has a significant impact on predictive performance for ADE detection, and that the best choice of weighting strategy depends on the target ADE and, specifically, on its dose-dependency.
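As a rough illustration of what a bag of temporally weighted clinical events could look like, the sketch below builds one feature vector per patient by summing a weight for each occurrence of an event code. The abstract does not list the nine strategies, so the three weighting functions, the `half_life_days` parameter, and the event codes here are hypothetical examples and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical weighting functions (assumptions, not the paper's strategies).
# Each maps "days between the event and the prediction date" to a weight.
def uniform_weight(days_before):
    # Ignores time entirely: a plain bag of event counts.
    return 1.0

def inverse_weight(days_before):
    # Recent events count more; weight falls off as 1 / (1 + days).
    return 1.0 / (1.0 + days_before)

def exponential_weight(days_before, half_life_days=30.0):
    # Weight halves every half_life_days (an assumed parameter).
    return 0.5 ** (days_before / half_life_days)

def weighted_bag(events, weight_fn):
    """Sum the weights of each event code over one patient's history.

    events: iterable of (code, days_before) pairs with days_before >= 0.
    Returns a dict mapping event code to its accumulated weight.
    """
    bag = defaultdict(float)
    for code, days_before in events:
        if days_before < 0:
            raise ValueError("events must precede the prediction date")
        bag[code] += weight_fn(days_before)
    return dict(bag)

# Made-up patient history: (event code, days before the prediction date).
history = [("drug_A", 0), ("drug_A", 30), ("diagnosis_B", 60)]

for name, fn in [("uniform", uniform_weight),
                 ("inverse", inverse_weight),
                 ("exponential", exponential_weight)]:
    print(name, weighted_bag(history, fn))
```

Each resulting dictionary would form one row of a feature matrix that a classifier such as a random forest could be trained on. Swapping the weighting function changes only the feature values, which is what makes strategies comparable on the same data.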