Integrating Machine Learning With Microsimulation to Classify Hypothetical, Novel Patients for Predicting Pregabalin Treatment Response Based on Observational and Randomized Data in Patients With Painful Diabetic Peripheral Neuropathy
J. Alexander, R. Edwards, L. Manca, Roberto Grugni, Gianluca Bonfanti, B. Emir, E. Whalen, S. Watt, M. Brodsky, B. Parsons
Pragmatic and Observational Research, vol. 10, pp. 67-76. Journal Article. Published 2019-10-31. DOI: 10.2147/POR.S214412 (https://doi.org/10.2147/POR.S214412)
Citations: 5
Abstract
Purpose: Variability in patient treatment responses can be a barrier to effective care. Using available patient databases may improve the prediction of treatment responses. We evaluated machine learning methods to predict novel, individual patient responses to pregabalin for painful diabetic peripheral neuropathy, using an agent-based modeling and simulation platform that integrates real-world observational study (OS) data and randomized clinical trial (RCT) data.

Patients and methods: The best supervised machine learning methods were selected through literature review and combined in a novel way to align patients with the subgroups that best enable prediction of pregabalin responses. Data were derived from a German OS of pregabalin (N=2642) and nine international RCTs (N=1320). OS and RCT patients were matched using coarsened exact matching, and a hierarchical cluster analysis was implemented. We tested which machine learning methods would best align candidate patients with specific clusters that predict their pain scores over time. Assignment to a cluster triggers the corresponding cluster-specific time-series regression, with lagged variables as inputs, to simulate "virtual" patients and generate 1000 trajectory variations for a given novel patient.

Results: Instance-based machine learning methods (k-nearest neighbor, supervised fuzzy c-means) were selected for quantitative analyses. Used alone, the two methods correctly classified 56.7% and 39.1% of patients, respectively. An "ensemble method" combining both correctly classified 98.4% and 95.9% of patients in the training and testing datasets, respectively.

Conclusion: An ensemble combination of two instance-based machine learning techniques best accommodated different data types (dichotomous, categorical, continuous) and performed better than either technique alone in assigning novel patients to subgroups for predicting treatment outcomes using microsimulation. Assigning novel patients to a cluster of similar patients has the potential to improve prediction of patient outcomes for chronic conditions in which initial treatment response can be incorporated using microsimulation.

Clinical trial registries (www.clinicaltrials.gov): NCT00156078, NCT00159679, NCT00143156, NCT00553475.
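The pipeline described in the abstract (assign a novel patient to a cluster, then run that cluster's lagged regression many times) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and does not come from the study: the toy training points, the two features, the cluster labels "A" and "B", the lag-1 regression coefficients, the use of class centroids for the fuzzy c-means style membership, and the equal-weight averaging used to combine the two classifiers. The abstract does not state how the authors combined the methods, so this should not be read as their implementation.

```python
import math
import random
from collections import Counter

# Hypothetical toy data: (baseline pain score 0-10, age in decades) -> cluster.
# Invented for illustration; not taken from the study.
TRAINING = [
    ((7.5, 6.1), "A"), ((8.0, 6.5), "A"), ((7.0, 5.8), "A"),
    ((4.0, 4.5), "B"), ((4.5, 5.0), "B"), ((3.5, 4.2), "B"),
]

# Hypothetical cluster-specific lag-1 regressions:
# pain[t] = intercept + slope * pain[t-1] + Normal(0, sd) noise.
CLUSTER_MODELS = {
    "A": {"intercept": 0.5, "slope": 0.85, "sd": 0.6},
    "B": {"intercept": 0.3, "slope": 0.80, "sd": 0.4},
}


def knn_votes(x, k=3):
    """Fraction of the k nearest training patients falling in each cluster."""
    nearest = sorted(TRAINING, key=lambda item: math.dist(x, item[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return {c: counts.get(c, 0) / k for c in CLUSTER_MODELS}


def fuzzy_memberships(x, m=2.0):
    """Fuzzy c-means style membership of x relative to per-cluster centroids."""
    dists = {}
    for c in CLUSTER_MODELS:
        pts = [p for p, label in TRAINING if label == c]
        centroid = tuple(sum(v) / len(pts) for v in zip(*pts))
        dists[c] = max(math.dist(x, centroid), 1e-12)  # avoid division by zero
    power = 2.0 / (m - 1.0)
    return {
        c: 1.0 / sum((dists[c] / dj) ** power for dj in dists.values())
        for c in dists
    }


def assign_cluster(x):
    """Assumed ensemble rule: average the k-NN vote and the fuzzy membership."""
    votes = knn_votes(x)
    memberships = fuzzy_memberships(x)
    score = {c: 0.5 * votes[c] + 0.5 * memberships[c] for c in CLUSTER_MODELS}
    return max(score, key=score.get)


def simulate(baseline, cluster, weeks=6, n=1000, seed=0):
    """Generate n pain trajectories from the cluster's lag-1 regression."""
    rng = random.Random(seed)
    model = CLUSTER_MODELS[cluster]
    trajectories = []
    for _ in range(n):
        pain = [baseline]
        for _ in range(weeks):
            nxt = (model["intercept"] + model["slope"] * pain[-1]
                   + rng.gauss(0.0, model["sd"]))
            pain.append(min(10.0, max(0.0, nxt)))  # keep on the 0-10 scale
        trajectories.append(pain)
    return trajectories


novel_patient = (7.2, 6.0)  # hypothetical novel patient
cluster = assign_cluster(novel_patient)
trajectories = simulate(novel_patient[0], cluster)
mean_final = sum(t[-1] for t in trajectories) / len(trajectories)
print(cluster, len(trajectories), round(mean_final, 2))
```

The spread of the 1000 simulated trajectories, rather than a single point estimate, is what gives a range of plausible pain scores over time for the novel patient.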
About the journal:
Pragmatic and Observational Research is an international, peer-reviewed, open-access journal that publishes data from studies designed to closely reflect medical interventions in real-world clinical practice, providing insights beyond classical randomized controlled trials (RCTs). While RCTs maximize internal validity for cause-and-effect relationships, they often represent only specific patient groups. The journal aims to complement such studies with data that better mirror real-world patients and the use of medicines, thereby informing guidelines and making research findings more applicable to the diverse patient populations seen in everyday clinical practice.