Xiayuan Huang, Ross Kleiman, David Page, Scott Hebbring
We recently demonstrated that electronically constructed family pedigrees (e-pedigrees) have great value in epidemiologic research using electronic health record (EHR) data. Prior to this work, it was well accepted that family health history is a major predictor for a wide spectrum of diseases, reflecting shared effects of genetics, environment, and lifestyle. With the widespread digitalization of patient data via EHRs, there is an unprecedented opportunity to use machine learning algorithms to better predict disease risk. Although predictive models have previously been constructed for a few important diseases, we currently know very little about how accurately the risk for most diseases can be predicted. It is further unknown whether incorporating e-pedigrees in machine learning can improve the value of these models. In this study, we devised a family pedigree-driven high-throughput machine learning pipeline to simultaneously predict risks for thousands of diagnosis codes using thousands of input features. Models were built to predict future disease risk for three time windows using both logistic regression and XGBoost. For example, using XGBoost without e-pedigree features, we achieved average areas under the receiver operating characteristic curve (AUCs) of 0.82, 0.77, and 0.71 for 1, 6, and 24 months, respectively. When e-pedigree features were added to the XGBoost pipeline, AUCs increased to 0.83, 0.79, and 0.74 for the same three time periods. E-pedigrees similarly improved predictions when using logistic regression. These results emphasize the potential value of incorporating family health history via e-pedigrees into machine learning at no additional cost in human time.
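The study's core comparison — identical models trained with and without e-pedigree features, scored by AUC — can be sketched as follows. This is a minimal illustration on synthetic data, with scikit-learn's logistic regression standing in for the paper's pipeline; the single binary family-history feature and all coefficients are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic cohort: five personal EHR features plus one e-pedigree
# feature (a family-history flag) that carries real signal.
ehr = rng.normal(size=(n, 5))
family_history = rng.integers(0, 2, size=n)
logit = 0.8 * ehr[:, 0] + 1.5 * family_history - 1.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

def auc_for(features):
    """Train/test split, fit, and score one feature set by AUC."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, y, test_size=0.3, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

auc_base = auc_for(ehr)
auc_ped = auc_for(np.column_stack([ehr, family_history]))
print(f"AUC without e-pedigree feature: {auc_base:.3f}")
print(f"AUC with e-pedigree feature:    {auc_ped:.3f}")
```

On this synthetic cohort the pedigree feature raises AUC, mirroring the direction (though not the magnitude) of the gains reported above.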
Automated Family Histories Significantly Improve Risk Prediction in an EHR. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 221-229. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141855/pdf/
Arif Ahmed, Gondy Leroy, Stephen A Rains, Philip Harber, David Kauchak, Prosanta Barai
Health literacy is crucial to supporting good health and is a major national goal. Audio delivery is an increasingly popular way to obtain health information. In this study, we evaluate the effect of audio enhancements, in the form of added emphasis and pauses, on health texts of varying difficulty, measuring both comprehension and retention of the information. We produced audio snippets from difficult and easy texts and conducted the study on Amazon Mechanical Turk (AMT). Our findings suggest that emphasis matters for both comprehension and retention. When there is no added pause, emphasizing significant information can lower the perceived difficulty of both difficult and easy texts. Comprehension of difficult texts is higher with correctly placed emphasis (54%) than without it (50%). Adding a pause lowers perceived difficulty and can improve retention, but adversely affects comprehension.
Effects of Added Emphasis and Pause in Audio Delivery of Health Information. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 54-62. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141844/pdf/
Alberto Purpura, Natasha Mulligan, Uri Kartoun, Eileen Koski, Vibha Anand, Joao Bettencourt-Silva
This paper addresses the challenge of binary relation classification in biomedical Natural Language Processing (NLP), focusing on diverse domains including gene-disease associations, compound-protein interactions, and social determinants of health (SDOH). We evaluate different approaches, including fine-tuning Bidirectional Encoder Representations from Transformers (BERT) models and prompting generative Large Language Models (LLMs), and examine their performance in zero- and few-shot settings. We also introduce a novel dataset of biomedical text annotated with social and clinical entities to facilitate research into relation classification. Our results underscore the continued complexity of this task for both humans and models. BERT-based models trained on domain-specific data excelled in certain domains and achieved performance and generalization power comparable to generative LLMs in others. Despite these encouraging results, the models are still far from achieving human-level performance. We also highlight the impact of high-quality training data and domain-specific fine-tuning on the performance of all the models considered.
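As a minimal sketch of the binary relation classification task (does a sentence assert a relation between two entities, or merely co-mention them?), the snippet below trains a TF-IDF bag-of-words classifier on a handful of invented sentences. This is a deliberately lightweight stand-in for the BERT and LLM approaches evaluated in the paper; all examples and labels are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled sentences: 1 = the entities are asserted to be related,
# 0 = the entities are merely co-mentioned. All examples are invented.
train_texts = [
    "Mutations in BRCA1 are associated with breast cancer",
    "TP53 loss drives tumor progression",
    "Aspirin binds and inhibits COX-1",
    "Housing instability increases risk of depression",
    "The study enrolled patients with diabetes and asthma",
    "BRCA1 and diabetes were both recorded in the registry",
    "Samples were screened for TP53 and COX-1 expression",
    "Participants reported income, housing, and education",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_texts, train_labels)

train_acc = sum(int(p == t) for p, t in
                zip(clf.predict(train_texts), train_labels)) / len(train_labels)
preds = clf.predict(["Smoking is associated with lung cancer risk",
                     "The cohort included smokers and patients with cancer"])
print(f"train accuracy: {train_acc:.2f}, predictions: {list(preds)}")
```

A real evaluation in this setting would swap the TF-IDF pipeline for a fine-tuned BERT encoder or a prompted LLM and score held-out relations per domain.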
Investigating Cross-Domain Binary Relation Classification in Biomedical Natural Language Processing. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 384-390. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141837/pdf/
Carly Hudelson, Melissa A Gunderson, Debbie Pestka, Tori Christiaansen, Bret Stotka, Lynn Kissock, Rebecca Markowitz, Sameer Badlani, Genevieve B Melton
Electronic health record (EHR) documentation is a leading cause of clinician burnout. While technology-enabled solutions such as virtual and digital scribes aim to reduce this burden, there is limited evidence of their effectiveness and minimal guidance for healthcare systems on solution selection and implementation. A transdisciplinary approach, informed by clinician interviews and other considerations, was used to evaluate and select a virtual scribe solution to pilot in a rapid, iterative 12-week sprint. Surveys, interviews, and EHR metadata were analyzed over a staggered 30-day implementation of live and asynchronous virtual scribe solutions. Among 16 pilot clinicians, documentation burden metrics decreased for some but not all. Some clinicians commented highly positively, while others raised concerns about scribe training and quality. Our findings demonstrate that virtual scribes may reduce documentation burden for some clinicians, and they illustrate a collaborative, iterative process for selecting digital tools in practice.
Selection and Implementation of Virtual Scribe Solutions to Reduce Documentation Burden: A Mixed Methods Pilot. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 230-238. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141854/pdf/
Hang Yu, Michael Kotlyar, Paul Thuras, Sheena Dufresne, Serguei Vs Pakhomov
Consumer-grade heart rate (HR) sensors are widely used for tracking physical and mental health status. We explore the feasibility of using the Polar H10 electrocardiogram (ECG) sensor to detect and predict cigarette smoking events in naturalistic settings with several machine learning approaches. We collected and analyzed data from 28 participants observed over a two-week period. We found that a bidirectional long short-term memory (BiLSTM) model with ECG-derived and GPS location input features yielded the highest mean accuracy of 69% for smoking event detection. For predicting smoking events, the highest accuracy of 67% was achieved with a fine-tuned LSTM approach. We also found a significant correlation between accuracy and the number of smoking events available from each participant. Our findings indicate that both detection and prediction of smoking events are feasible but require an individualized approach to training the models, particularly for prediction.
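A sketch of the kind of ECG-derived features such models typically consume: sliding-window mean heart rate and RMSSD computed from successive RR intervals. The window sizes and the simulated RR series are illustrative assumptions, not the study's actual feature set.

```python
import numpy as np

def hrv_features(rr_ms, win=30, step=15):
    """Slide a window over successive RR intervals (milliseconds) and
    compute mean heart rate and RMSSD, two common ECG-derived features."""
    feats = []
    for start in range(0, len(rr_ms) - win + 1, step):
        w = np.asarray(rr_ms[start:start + win], dtype=float)
        mean_hr = 60000.0 / w.mean()               # beats per minute
        rmssd = np.sqrt(np.mean(np.diff(w) ** 2))  # short-term variability
        feats.append((mean_hr, rmssd))
    return feats

# Simulated RR series: steady ~800 ms beats, then faster ~600 ms beats
# (heart rate typically rises shortly after nicotine intake).
rr = [800] * 60 + [600] * 60
feats = hrv_features(rr)
print(f"first window: {feats[0]}, last window: {feats[-1]}")
```

A sequence model such as the study's BiLSTM would consume per-window feature vectors like these, optionally alongside GPS-derived location features.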
Towards Predicting Smoking Events for Just-in-time Interventions. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 468-477. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141818/pdf/
Chancellor R Woolsey, Prakash Bisht, Joshua Rothman, Gondy Leroy
An important problem impacting healthcare is the lack of available experts. Machine learning (ML) models may help by aiding in screening and diagnosing patients. However, creating large, representative datasets to train such models is expensive. We evaluated large language models (LLMs) for data creation. Focusing on Autism Spectrum Disorders (ASD), we prompted GPT-3.5 and GPT-4 to generate 4,200 synthetic examples of behaviors to augment existing medical observations. Our goal is to label behaviors corresponding to autism criteria and improve model accuracy with synthetic training data. We used a BERT classifier pretrained on biomedical literature to assess differences in performance between models. A random sample (N=140) of the LLM-generated data was also evaluated by a clinician and found to contain 83% correct behavioral example-label pairs. Augmenting the dataset increased recall by 13% but decreased precision by 16%. Future work will investigate how different characteristics of synthetic data affect ML outcomes.
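The recall/precision trade-off reported above can be made concrete with a small worked example. The predictions below are invented to illustrate the pattern (an augmented model that flags more positives), not the study's actual outputs.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented predictions on ten test examples: the augmented model flags
# more positives, catching every true case (higher recall) at the cost
# of more false alarms (lower precision).
y_true    = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
base_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
aug_pred  = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]

base_pr = precision_recall(y_true, base_pred)
aug_pr = precision_recall(y_true, aug_pred)
print(f"before augmentation: precision={base_pr[0]:.2f}, recall={base_pr[1]:.2f}")
print(f"after augmentation:  precision={aug_pr[0]:.2f}, recall={aug_pr[1]:.2f}")
```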
Utilizing Large Language Models to Generate Synthetic Data to Increase the Performance of BERT-Based Neural Networks. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 429-438. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141799/pdf/
Stephen P Ma, Ebru Hosgur, Conor K Corbin, Ivan Lopez, Amy Chang, Jonathan H Chen
This study explored the efficacy of electronic phenotyping for data labeling in machine learning, with a focus on urinary tract infections (UTIs). We contrasted labels from electronic phenotyping with previously published labels such as urine culture positivity. Electronic phenotyping showed the potential to enhance specificity in UTI labeling while maintaining similar sensitivity, and it scaled easily to a large dataset suitable for machine learning, which we used to train and validate a machine learning model. Electronic phenotyping offers a valuable method for generating machine learning labels in healthcare, with potential benefits for patient care and antimicrobial stewardship. Further research will expand its applications and optimize techniques for increased performance.
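A minimal sketch of what a stricter electronic phenotype might look like, with invented field names and rules: instead of labeling UTI from urine culture positivity alone, the rule requires culture positivity plus documented symptoms and an antibiotic order, which is how such a phenotype can gain specificity over a culture-only label.

```python
# Hypothetical silver-standard labeler with invented field names: the
# phenotype requires culture positivity AND documented symptoms AND an
# antibiotic order, rather than culture positivity alone.
def uti_label(encounter):
    return (encounter["urine_culture_positive"]
            and encounter["uti_symptoms_documented"]
            and encounter["antibiotic_ordered"])

encounters = [
    {"urine_culture_positive": True, "uti_symptoms_documented": True,
     "antibiotic_ordered": True},    # treated symptomatic UTI -> positive
    {"urine_culture_positive": True, "uti_symptoms_documented": False,
     "antibiotic_ordered": False},   # asymptomatic bacteriuria -> negative
    {"urine_culture_positive": False, "uti_symptoms_documented": True,
     "antibiotic_ordered": True},    # symptoms without growth -> negative
]
labels = [uti_label(e) for e in encounters]
print(labels)  # [True, False, False]
```

Because the rule runs on structured fields, it can label an arbitrarily large cohort at no marginal human cost — the property that makes silver-standard labels attractive for model training.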
Electronic Phenotyping of Urinary Tract Infections as a Silver Standard Label for Machine Learning. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 182-189. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141812/pdf/
Access to real-world data streams like electronic medical records (EMRs) has accelerated the development of supervised machine learning (ML) models for clinical applications. However, few studies investigate the differential impact of particular EMR features on model performance under temporal dataset shift. To explain how EMR features impact models over time, this study aggregates features into feature groups by their source (e.g., medication orders, diagnosis codes, and lab results) and feature categories based on whether they reflect patient pathophysiology or healthcare processes. We adapt Shapley values to explain feature groups' and feature categories' marginal contributions to initial and sustained model performance. We investigate three standard clinical prediction tasks and find that while feature contributions to initial performance differ across tasks, pathophysiological features help mitigate deterioration in discrimination over time. These results provide interpretable insights into how specific feature groups contribute to model performance and robustness to temporal dataset shift.
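Group-level Shapley attribution of the kind described here can be computed exactly when the number of feature groups is small, by averaging each group's marginal contribution over every ordering. The coalition "performance" values below are invented for illustration; only the Shapley computation itself is generic.

```python
from itertools import permutations

# Invented "performance" of a model trained on each coalition of three
# feature groups; only the Shapley computation below is generic.
perf = {
    frozenset(): 0.50,
    frozenset({"labs"}): 0.70,
    frozenset({"meds"}): 0.65,
    frozenset({"dx"}): 0.60,
    frozenset({"labs", "meds"}): 0.78,
    frozenset({"labs", "dx"}): 0.75,
    frozenset({"meds", "dx"}): 0.70,
    frozenset({"labs", "meds", "dx"}): 0.82,
}

def shapley(groups, value):
    """Exact Shapley values: average each group's marginal contribution
    over every order in which the groups can be added."""
    contrib = {g: 0.0 for g in groups}
    orders = list(permutations(groups))
    for order in orders:
        coalition = frozenset()
        for g in order:
            contrib[g] += value[coalition | {g}] - value[coalition]
            coalition = coalition | {g}
    return {g: c / len(orders) for g, c in contrib.items()}

phi = shapley(["labs", "meds", "dx"], perf)
print({g: round(v, 4) for g, v in phi.items()})
# Efficiency property: attributions sum to perf(all) - perf(empty).
print(round(sum(phi.values()), 4))  # 0.32
```

The same calculation applied to initial versus later-period performance would yield the per-group contributions to sustained performance that the paper analyzes.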
Raphael Brosula, Conor K Corbin, Jonathan H Chen. Pathophysiological Features in Electronic Medical Records Sustain Model Performance under Temporal Dataset Shift. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 95-104. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141811/pdf/
Andrew J King, Lisa Higgins, Carly Au, Salim Malakouti, Edvin Music, Kyle Kalchthaler, Gilles Clermont, William Garrard, David T Huang, Bryan J McVerry, Christopher W Seymour, Kelsey Linstrum, Amanda McNamara, Cameron Green, India Loar, Tracey Roberts, Oscar Marroquin, Derek C Angus, Christopher M Horvat
Objectives: To automatically populate the case report forms (CRFs) for an international, pragmatic, multifactorial, response-adaptive, Bayesian COVID-19 platform trial.
Methods: The locations of focus included 27 hospitals and 2 large electronic health record (EHR) instances (1 Cerner Millennium and 1 Epic) that are part of the same health system in the United States. This paper describes our efforts to use EHR data to automatically populate four of the trial's forms: baseline, daily, discharge, and response-adaptive randomization.
Results: Between April 2020 and May 2022, 417 patients from the UPMC health system were enrolled in the trial. A MySQL-based extract, transform, and load pipeline automatically populated 499 of 526 CRF variables. The populated forms were statistically and manually reviewed and then reported to the trial's international data coordinating center.
Conclusions: We accomplished automatic population of CRFs in a large platform trial and made recommendations for improving this process for future trials.
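The transform step of such an ETL pipeline can be sketched as a mapping from raw EHR fields to CRF variables. The field names, the CRF layout, and the review-flagging convention below are hypothetical, not the trial's actual schema (the real pipeline was MySQL-based).

```python
from datetime import date

# Sketch of the transform step in an EHR-to-CRF pipeline. Field names
# and the CRF layout are invented for illustration.
def to_baseline_crf(ehr_row, enroll_date):
    crf = {
        "age_years": enroll_date.year - ehr_row["birth_date"].year,
        "on_supplemental_o2": ehr_row["o2_flow_lpm"] > 0,
        "steroid_ordered": any("dexamethasone" in m.lower()
                               for m in ehr_row["med_orders"]),
    }
    # Variables the pipeline could not populate are flagged for manual
    # review (the paper reports 499 of 526 variables auto-populated).
    crf["needs_review"] = [k for k, v in crf.items() if v is None]
    return crf

row = {"birth_date": date(1960, 3, 14), "o2_flow_lpm": 4.0,
       "med_orders": ["Dexamethasone 6 mg IV", "Heparin 5000 units SC"]}
form = to_baseline_crf(row, date(2020, 7, 1))
print(form)
```

In the trial's setting, the same transform would run against extracts from both EHR instances, with the statistical and manual review applied to the populated forms before submission.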
Automatic Population of the Case Report Forms for an International Multifactorial Adaptive Platform Trial Amid the COVID-19 Pandemic. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 276-284. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141839/pdf/
Liver transplantation often faces fairness challenges across subgroups defined by sensitive attributes such as age group, gender, and race/ethnicity. Machine learning models for outcome prediction can introduce additional biases. We therefore introduce the Fairness through the Equitable Rate of Improvement in Multitask Learning (FERI) algorithm for fair prediction of graft failure risk in liver transplant patients. FERI constrains subgroup loss by balancing learning rates and preventing subgroup dominance during training. Our results show that FERI maintained high predictive accuracy, with AUROC and AUPRC comparable to baseline models. More importantly, FERI demonstrated an ability to improve fairness without sacrificing accuracy: for gender, it reduced the demographic parity disparity by 71.74%, and for age group, it decreased the equalized odds disparity by 40.46%. The FERI algorithm thus advances fairness-aware predictive modeling in healthcare and provides a valuable tool for equitable healthcare systems.
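The two disparity metrics reported here have standard definitions that are easy to sketch: demographic parity disparity compares positive-prediction rates across groups, and equalized odds disparity compares true- and false-positive rates. The toy predictions below are invented for illustration.

```python
import numpy as np

def demographic_parity_disparity(y_pred, group):
    """Gap between the highest and lowest positive-prediction rates
    across groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_disparity(y_true, y_pred, group):
    """Largest across-group gap in true-positive or false-positive rate."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        m = group == g
        tprs.append(y_pred[m & (y_true == 1)].mean())
        fprs.append(y_pred[m & (y_true == 0)].mean())
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

# Invented predictions for two groups of four patients each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
group  = ["a", "a", "a", "a", "b", "b", "b", "b"]
dp = demographic_parity_disparity(y_pred, group)
eo = equalized_odds_disparity(y_true, y_pred, group)
print(f"demographic parity disparity: {dp:.2f}")  # 0.50
print(f"equalized odds disparity: {eo:.2f}")      # 0.50
```

The percentage reductions reported for FERI are relative changes in exactly these quantities between the baseline and FERI-trained models.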
Can Li, Dejian Lai, Xiaoqian Jiang, Kai Zhang. FERI: A Multitask-based Fairness Achieving Algorithm with Applications to Fair Organ Transplantation. AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 593-602. Published 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141863/pdf/