Pub Date: 2024-08-07 DOI: 10.1101/2024.08.05.24311485
Ethan Ethan, Robert Gallo, Eric Strong, Yingjie Weng, Hannah Kerman, Jason Freed, Josephine A Cool, Zahir Kanjee, Kathleen Lane, Andrew S Parsons, Neera Ahuja, Eric Horvitz, Daniel Yang, Arnold Milstein, Andrew PJ Olson, Jason Hom, Jonathan H. Chen, Adam Rodman
Importance: Large language model (LLM) artificial intelligence (AI) systems have shown promise in diagnostic reasoning, but their utility in management reasoning with no clear right answers is unknown.
Objective: To determine whether LLM assistance improves physician performance on open-ended management reasoning tasks compared to conventional resources.
Design: Prospective, randomized controlled trial conducted from 30 November 2023 to 21 April 2024.
Setting: Multi-institutional study from Stanford University, Beth Israel Deaconess Medical Center, and the University of Virginia involving physicians from across the United States.
Participants: 92 practicing attending physicians and residents with training in internal medicine, family medicine, or emergency medicine.
Intervention: Five expert-developed clinical case vignettes were presented with multiple open-ended management questions and scoring rubrics created through a Delphi process. Physicians were randomized to use either GPT-4 via ChatGPT Plus in addition to conventional resources (e.g., UpToDate, Google), or conventional resources alone.
Main Outcomes and Measures: The primary outcome was difference in total score between groups on expert-developed scoring rubrics. Secondary outcomes included domain-specific scores and time spent per case.
Results: Physicians using the LLM scored higher compared to those using conventional resources (mean difference 6.5%, 95% CI 2.7-10.2, p<0.001). Significant improvements were seen in management decisions (6.1%, 95% CI 2.5-9.7, p=0.001), diagnostic decisions (12.1%, 95% CI 3.1-21.0, p=0.009), and case-specific (6.2%, 95% CI 2.4-9.9, p=0.002) domains. GPT-4 users spent more time per case (mean difference 119.3 seconds, 95% CI 17.4-221.2, p=0.02). There was no significant difference between GPT-4-augmented physicians and GPT-4 alone (-0.9%, 95% CI -9.0 to 7.2, p=0.8).
Conclusions and Relevance: LLM assistance improved physician management reasoning compared to conventional resources, with particular gains in contextual and patient-specific decision-making. These findings indicate that LLMs can augment management decision-making in complex cases.
Trial Registration: ClinicalTrials.gov Identifier: NCT06208423; https://classic.clinicaltrials.gov/ct2/show/NCT06208423
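For readers unfamiliar with how a result such as "mean difference 6.5%, 95% CI 2.7-10.2" is put together, the following is a minimal sketch of a difference in group means with a normal-approximation confidence interval. The abstract does not state which statistical model the trial used, so the unpooled standard error here is an assumption chosen for simplicity, the function name `mean_difference_ci` is hypothetical, and the scores are made-up toy values, not trial data.

```python
import math
from statistics import mean, variance

def mean_difference_ci(group_a, group_b, z=1.96):
    """Return (difference, lower, upper) for mean(group_a) - mean(group_b).

    Uses an unpooled standard error and a normal critical value (z = 1.96
    for a 95% interval). This is an illustrative assumption, not the
    analysis reported in the abstract.
    """
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return diff, diff - z * se, diff + z * se

# Toy percentage scores (hypothetical, for illustration only).
llm_scores = [50.0, 60.0, 70.0]
conventional_scores = [40.0, 50.0, 60.0]
diff, lower, upper = mean_difference_ci(llm_scores, conventional_scores)
print(f"difference={diff:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

An interval that excludes zero corresponds to a difference that is statistically significant at the matching level, which is how the reported intervals and p-values relate to each other.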
Title: Large Language Model Influence on Management Reasoning: A Randomized Controlled Trial (medRxiv - Health Informatics)
Pub Date: 2024-08-07 DOI: 10.1101/2024.08.07.24311604
Heli Julkunen, Juho Rousu
Understanding how risk factors interact to jointly influence disease risk can provide insights into disease development and improve risk prediction. We introduce survivalFM, a machine learning extension to the widely used Cox proportional hazards model that incorporates estimation of all potential pairwise interaction effects on time-to-event outcomes. The method relies on learning a low-rank factorized approximation of the interaction effects, hence overcoming the computational and statistical limitations of fitting these terms in models involving many predictor variables. The resulting model is fully interpretable, providing access to the estimates of both individual effects and the approximated interactions. Comprehensive evaluation of survivalFM using the UK Biobank dataset across ten disease examples and a variety of clinical risk factors and omics data modalities shows improved discrimination and reclassification performance (65% and 97.5% of the scenarios tested, respectively). Considering a clinical scenario of cardiovascular risk prediction using predictors from the established QRISK3 model, we further show that the comprehensive interaction modelling adds predictive value beyond the individual and age interaction effects currently included. These results demonstrate that comprehensive modelling of interactions can facilitate advanced insights into disease development and improve risk predictions.
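The low-rank idea described above can be sketched as follows. This is a hedged illustration of a factorization-machine style interaction term added to a linear Cox-type predictor, not the survivalFM implementation: the exact parametrization, the names `pairwise_naive`, `pairwise_factorized`, and `log_partial_hazard`, and the toy numbers are all assumptions. Each predictor i gets a short factor vector `P[i]`, the interaction weight for a pair (i, j) is approximated by the inner product of `P[i]` and `P[j]`, and a standard algebraic identity lets the sum over all pairs be computed without enumerating them.

```python
from itertools import combinations

def pairwise_naive(x, P):
    """Sum over i<j of <P[i], P[j]> * x[i] * x[j], one pair at a time."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        weight = sum(a * b for a, b in zip(P[i], P[j]))
        total += weight * x[i] * x[j]
    return total

def pairwise_factorized(x, P):
    """Same quantity in O(d * k) time using the identity
    sum_{i<j} <p_i, p_j> x_i x_j
        = 0.5 * sum_f [ (sum_i p_if x_i)^2 - sum_i (p_if x_i)^2 ].
    """
    total = 0.0
    for f in range(len(P[0])):
        terms = [P[i][f] * x[i] for i in range(len(x))]
        total += 0.5 * (sum(terms) ** 2 - sum(t * t for t in terms))
    return total

def log_partial_hazard(x, beta, P):
    """Linear effects plus approximated pairwise interactions."""
    linear = sum(b * xi for b, xi in zip(beta, x))
    return linear + pairwise_factorized(x, P)

# Toy example: 3 predictors, rank-2 factors (hypothetical values).
x = [1.0, 2.0, 0.5]
beta = [0.1, -0.2, 0.3]
P = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
print(pairwise_naive(x, P), pairwise_factorized(x, P))
print(log_partial_hazard(x, beta, P))
```

With d predictors and rank k, the factorized form stores d * k parameters instead of d * (d - 1) / 2 separate interaction coefficients, which is the kind of saving the abstract refers to when it mentions computational and statistical limitations.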
Title: Machine learning for comprehensive interaction modelling improves disease risk prediction in the UK Biobank (medRxiv - Health Informatics)
Pub Date: 2024-08-07 DOI: 10.1101/2024.08.07.24311596
Christopher James Duckworth, Dan K Burns, Carlos Lamas-Fernandez, Mark Wright, Rachael Leyland, Matthew Stammers, Michael George, Michael Boniface
Early identification of patients who require onward referral for social care can prevent delays to discharge from hospital. We introduce a machine learning (ML) model to identify potential social care needs at the first point of admission. The model's performance is comparable to clinicians' predictions of discharge care needs, despite working with only a subset of the information available to the clinician. We find that the ML model and clinicians each perform better at identifying different types of care needs, highlighting the added value of a potential system supporting decision making. We also demonstrate the ability of ML to provide automated initial discharge need assessments in cases where the initial clinical assessment is delayed. Finally, we demonstrate that combining clinician and machine predictions in a hybrid model provides even more accurate early predictions of onward social care requirements and demonstrates the potential for human-in-the-loop decision support systems in clinical practice.
Title: Predicting onward care needs at admission to reduce discharge delay using machine learning (medRxiv - Health Informatics)
Pub Date: 2024-08-07 DOI: 10.1101/2024.08.06.24311535
Richard Williams, Thomas Bolton, David Jenkins, Mehrdad A Mizani, Matthew Sperrin, Cathie Sudlow, Angela Wood, Adrian Heald, Niels Peek, CVD-COVID-UK/COVID-IMPACT Consortium
The ability to reproduce the work of others is an essential part of the scientific disciplines. However, in practice it is hard, with several authors describing a "replication crisis" in research. For observational studies using electronic health record (EHR) data, replication is also important. However, replicating observational studies using EHR data can be challenging for many reasons, including complexities in data access, variations in EHR systems across institutions, and the potential for confounding variables that may not be fully accounted for. Observational research is typically given less weight in systematic reviews and clinical guidelines, in favour of more conclusive research such as randomised controlled trials. Observational research that is replicable has more impact. In this study we aimed to replicate a previous study that had examined the risk of hospitalisation following a positive COVID-19 test in individuals with diabetes. Using EHR data from NHS England's Secure Data Environment covering the whole of England, UK (population 57m), we sought to replicate findings from the original study, which used data from Greater Manchester (a large urban region in the UK, population 2.9m). Both analyses were conducted in Trusted Research Environments (TREs) or Secure Data Environments (SDEs), containing linked primary and secondary care data. However, the small differences between the environments and the data sources led to several challenges in assessing reproducibility. In this paper we describe the differences between the environments, reflect on the challenges faced, and produce a list of recommendations for TREs and SDEs to assist future replication studies.
Title: The challenges of replication: a worked example of methods reproducibility using electronic health record data (medRxiv - Health Informatics)
Pub Date: 2024-08-03 DOI: 10.1101/2024.08.01.24311392
Haoxin Chen, Will Simmons, Malak Hashish, Jiancheng Ye
Objective: To evaluate the utilization patterns, effectiveness, and patient satisfaction of telehealth services among individuals with hypertension and/or diabetes, and to investigate the influence of social determinants of health (SDOH) on telehealth access and utilization in this population. Methods: We conducted a cross-sectional analysis using data from the 2022 Health Information National Trends Survey (HINTS 6) by the National Cancer Institute. The study sample included 3,009 respondents with self-reported diabetes, hypertension, or both conditions. Telehealth usage was assessed through 14 survey questions, and participant characteristics were analyzed using sociodemographic, baseline health, and SDOH data. Results: Of the 6,252 HINTS 6 survey respondents, 3,009 met the inclusion criteria. Significant sociodemographic differences were observed across the diabetes and/or hypertension groups. No significant differences were found in telehealth usage among the groups, with 43.9% of respondents utilizing telehealth in the past year. Common reasons for telehealth use included provider recommendation, convenience, and infection avoidance. Social determinants of health, such as food insecurity and transportation issues, were more prevalent among individuals with both conditions, though no significant differences in telehealth experiences were noted across groups. Conclusion: Telehealth shows potential for managing chronic conditions like hypertension and diabetes, demonstrating substantial adoption and universal accessibility. However, disparities influenced by SDOH highlight the need for targeted interventions to ensure equitable access. Addressing privacy concerns, leveraging healthcare providers' recommendations, and tackling SDOH barriers are crucial for fostering wider telehealth adoption and improving outcomes. 
Future research should focus on the long-term impacts of telehealth and further investigate SDOH factors to develop tailored interventions that enhance engagement and equitable access across diverse patient populations.
Title: Telehealth Utilization and Patient Experiences: The Role of Social Determinants of Health Among Individuals with Hypertension and Diabetes (medRxiv - Health Informatics)
Pub Date: 2024-08-02 DOI: 10.1101/2024.07.31.24311297
Ashley Lewis, Yash Samir Khandwala, Tina Hernandez-Boussard, James Brooks
This study investigates the potential of multimodal data for prostate cancer (PCa) risk prediction using the All of Us (AoU) research program dataset. By integrating polygenic risk scores (PRSs) with diverse clinical, survey, and genomic data, we developed a model that identifies established PCa risk factors, such as age and family history, and a novel factor: recent healthcare visits are linked to reduced risk. The model's performance, notably the false positive rate, is improved compared to traditional methods, despite the lack of Prostate-Specific Antigen (PSA) data. The findings demonstrate that incorporating comprehensive multimodal data from AoU can enhance PCa risk prediction and provide a robust framework for future clinical applications.
Title: SDoH-Aware Approach to Prostate Cancer Screening: Addressing Overdiagnosis of Prostate Cancer using PSA (medRxiv - Health Informatics)
Pub Date: 2024-08-02 DOI: 10.1101/2024.07.31.24311287
Amanda Momenzadeh, Caleb W Cranney, So Yung Choi, Catherine Bresee, Mourad Tighiouart, Roma Gianchandani, Joshua Pevnick, Jason Moore, Jesse Meyer
Objective: A multitude of factors affect a hospitalized individual's blood glucose (BG), making BG difficult to predict and manage. Beyond medications well established to alter BG, such as beta-blockers, there are likely many medications with undiscovered effects on BG variability. Identification of these medications and the strength and timing of these relationships has potential to improve glycemic management and patient safety. Materials and Methods: EHR data from 103,871 inpatient encounters over 8 years within a large, urban health system were used to extract over 500 medications, laboratory measurements, and clinical predictors of BG. Feature selection was performed using an optimized Lasso model with repeated 5-fold cross-validation on the 80% training set, followed by a linear mixed regression model to evaluate statistical significance. Significant medication predictors were then evaluated for novelty against a comprehensive adverse drug event database. Results: We found 29 statistically significant features associated with BG; 24 were medications, including 10 medications not previously documented to alter BG. The remaining five factors were Black/African American race, history of type 2 diabetes mellitus, prior BG (mean and last), and creatinine. Discussion: The unexpected medications found to affect BG, including several agents involved in gastrointestinal motility, were supported by available studies. This study may bring to light medications to use with caution in individuals with hyper- or hypoglycemia. Further investigation of these potential candidates is needed to enhance clinical utility of these findings. Conclusion: This study uniquely identifies medications involved in gastrointestinal transit as predictors of BG that may not be well established or recognized in clinical practice.
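The resampling scheme named in the methods (an 80% training set with repeated 5-fold cross-validation inside it) can be sketched as below. This shows only how the index splits are formed, not the Lasso fit or the mixed model. The helper names `train_test_split` and `repeated_kfold`, the sample size of 100, the seed, and the choice of 3 repeats are assumptions for illustration, since the abstract does not report the number of repeats.

```python
import random

def train_test_split(n_samples, train_frac=0.8, seed=0):
    """Shuffle indices and split them into training and held-out test sets."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = int(n_samples * train_frac)
    return idx[:cut], idx[cut:]

def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (train, validation) index lists for repeated k-fold CV.

    Each repeat reshuffles the indices, so the k folds differ between
    repeats. Indices refer to positions within the training set.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            val = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, val

# Hypothetical sizes: 100 encounters, 80% used for feature selection.
train_idx, test_idx = train_test_split(100)
splits = list(repeated_kfold(len(train_idx), k=5, repeats=3))
print(len(train_idx), len(test_idx), len(splits))
```

In a workflow of this shape, a penalty strength would typically be chosen by averaging validation error over all folds and repeats, and the held-out 20% would be left untouched until the final evaluation.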
Title: Medications that Regulate Gastrointestinal Transit Influence Inpatient Blood Glucose (medRxiv - Health Informatics)
Pub Date: 2024-08-01 DOI: 10.1101/2024.07.31.24311182
Amir Bahmani, Kexin Cha, Arash Alavi, Amit Dixit, Antony Ross, Ryan Park, Francesca Goncalves, Shirley Ma, Paul Saxman, Ramesh Nair, Ramin Akhavan Sarraf, Xin Zhou, Meng Wang, Kevin Contrepois, Jennifer Li Pook Than, Emma Monte, David Jose Florez Rodriguez, Jaslene Lai, Mohan Babu, Abtin Tondar, Sophia Miryam Schussler-Fiorenza Rose, Ilya Akbari, Xinyue Zhang, Kritika Yegnashankaran, Joseph Yracheta, Kali Dale, Alison Derbenwick Miller, Scott Edmiston, Eva M McGhee, Camille Nebeker, Joseph C Wu, Anshul Kundaje, Michael Snyder
Precision medicine promises significant health benefits but faces challenges such as the need for complex data management and analytics, interdisciplinary collaboration, and education of researchers, healthcare professionals, and participants. Addressing these needs requires the integration of computational experts, engineers, designers, and healthcare professionals to develop user-friendly systems and shared terminologies. The widespread adoption of large language models (LLMs) like GPT-4 and Claude 3 highlights the importance of making complex data accessible to non-specialists. The Stanford Data Ocean (SDO) strives to mitigate these challenges through a scalable, cloud-based platform that supports data management for various data types, advanced research, and personalized learning in precision medicine. SDO provides AI tutors and AI-powered data visualization tools to enhance educational and research outcomes and make data analysis accessible for users from diverse educational backgrounds. By extending engagement and cutting-edge research capabilities globally, SDO particularly benefits economically disadvantaged and historically marginalized communities, fostering interdisciplinary biomedical research and bridging the gap between education and practical application in the biomedical field.
{"title":"Achieving Inclusive Healthcare through Integrating Education and Research with AI and Personalized Curricula","doi":"10.1101/2024.07.31.24311182","DOIUrl":"https://doi.org/10.1101/2024.07.31.24311182","PeriodicalId":501454,"journal":{"name":"medRxiv - Health Informatics","volume":"86 1"},"publicationDate":"2024-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141867169"}
Pub Date : 2024-08-01 DOI: 10.1101/2024.07.30.24311178
Marirena Bafaloukou, Ann-Kathrin Schalkamp, Nan Victoria Fletcher-Lloyd, Alexander Capstick, Chloe Walsh, Cynthia Sandor, Samaneh Kouchaki, Ramin Nilforooshan, Payam Barnaghi
Background: Agitation affects around 30% of people living with dementia (PLwD), increasing carer burden and straining care services. Agitation screening typically relies on subjective clinical scales and direct patient observation, which are resource-intensive and challenging to incorporate into routine care. Clinical applicability of data-driven methods for agitation screening is limited by constraints such as short observational periods, data granularity, and lack of interpretability and generalisability. Current interventions for agitation are primarily medication-based, which may lead to severe side effects and lack personalisation. Understanding how real-world factors affect agitation within home settings offers a promising avenue towards identifying potential personalised non-pharmacological interventions. Methods: We used longitudinal data (32,896 person-days from n=63 PLwD) collected using in-home monitoring devices. Employing machine learning techniques, we developed a screening tool to determine the weekly risk of agitation. We incorporated a traffic-light system for risk stratification to aid clinical decision-making and employed the SHapley Additive exPlanations (SHAP) framework to increase interpretability. We designed an interactive tool that enables the exploration of personalised non-pharmacological interventions, such as modifying ambient light and temperature. Results: Light Gradient-boosting Machine (LightGBM) achieved the highest performance in identifying agitation with a sensitivity of 71.32±7.38% and specificity of 75.28±10.43%. Implementing the traffic-light system for risk stratification increased specificity by 15% and improved all metrics. Significant contributors to agitation included low nocturnal respiratory rate, heightened alertness during sleep, and increased indoor illuminance, as revealed by statistical and feature importance analysis. Using our interactive tool, we identified that adjusting indoor lighting levels and temperature were promising and feasible interventions within our cohort. Conclusions: Our interpretable framework for agitation screening, developed using data from a dementia care study, showcases significant clinical value. The accompanying interactive interface allows for the in-silico simulation of non-pharmacological interventions, facilitating the design of personalised interventions that can improve in-home dementia care.
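The abstract does not say how the traffic-light system is defined, so the following is only a minimal sketch of one plausible reading: two cut-offs split the predicted weekly risk into green, amber, and red bands, and sensitivity and specificity are then computed on the confident (green or red) weeks, leaving amber weeks for clinical review. The thresholds `GREEN_MAX` and `RED_MIN`, the function names, and the toy data are all hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Hypothetical cut-offs; the abstract does not report the values used.
GREEN_MAX = 0.3
RED_MIN = 0.7


def traffic_light(prob: float, green_max: float = GREEN_MAX,
                  red_min: float = RED_MIN) -> str:
    """Map a predicted weekly agitation risk in [0, 1] to a colour band."""
    if prob < green_max:
        return "green"
    if prob >= red_min:
        return "red"
    return "amber"


def sensitivity_specificity(y_true: List[int],
                            y_pred: List[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def stratified_metrics(y_true: List[int],
                       probs: List[float]) -> Tuple[float, float]:
    """Score only green/red weeks; amber weeks are set aside for review."""
    kept = [(t, 1 if traffic_light(p) == "red" else 0)
            for t, p in zip(y_true, probs)
            if traffic_light(p) != "amber"]
    if not kept:
        return float("nan"), float("nan")
    labels = [t for t, _ in kept]
    preds = [p for _, p in kept]
    return sensitivity_specificity(labels, preds)


# Toy data (made up): 1 = agitation week, 0 = no agitation.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.1, 0.6, 0.4, 0.2, 0.75, 0.95]

# Baseline: a single 0.5 threshold on the same scores.
baseline_preds = [1 if p >= 0.5 else 0 for p in toy_probs]
baseline = sensitivity_specificity(toy_labels, baseline_preds)
stratified = stratified_metrics(toy_labels, toy_probs)
```

On this toy data the single-threshold baseline and the stratified variant give different sensitivity and specificity because the two amber weeks are excluded from scoring. This only illustrates the mechanism; it says nothing about the size of the effect reported in the abstract.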
{"title":"An Interpretable Machine Learning Tool for In-Home Screening of Agitation Episodes in People Living with Dementia","doi":"10.1101/2024.07.30.24311178","DOIUrl":"https://doi.org/10.1101/2024.07.30.24311178","PeriodicalId":501454,"journal":{"name":"medRxiv - Health Informatics","volume":"208 1"},"publicationDate":"2024-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141867167"}
Pub Date : 2024-07-31 DOI: 10.1101/2024.07.29.24310315
Arash Alavi, Kexin Cha, Delara P Esfarjani, Bhavesh Patel, Jennifer Li Pook Than, Aaron Y Lee, Camille Nebeker, Michael Snyder, Amir Bahmani
Large Language Models (LLMs) have gained significant attention and are increasingly used by researchers. Concurrently, publicly accessible datasets containing individual-level health information are becoming more available. Some of these datasets, such as the recently released Artificial Intelligence Ready and Equitable Atlas for Diabetes Insights (AI-READI) dataset, include individual-level data from digital wearable technologies. The application of LLMs to gain insights about health from wearable sensor data specific to diabetes is underexplored. This study presents a comprehensive evaluation of multiple LLMs, including GPT-3.5, GPT-4, GPT-4o, Gemini, Gemini 1.5 Pro, and Claude 3 Sonnet, on various diabetes research tasks using diverse prompting methods to evaluate their performance and gain new insights into diabetes and glucose dysregulation. Notably, GPT-4o showed promising performance across tasks with a chain-of-thought prompt design (aggregate performance score of 95.5%). Moreover, using this model, we identified new insights from the dataset, such as the heightened sensitivity to stress among diabetic participants during glucose level fluctuations, which underscores the complex interplay between metabolic and psychological factors. These results demonstrate that LLMs can enhance the pace of discovery and also enable automated interpretation of data for users of wearable devices, including both the research team and the individual wearing the device. Meanwhile, we also emphasize the critical limitations, such as privacy and ethical risks and dataset biases, that must be resolved for real-world application in diabetes health settings. This study highlights the potential and challenges of integrating LLMs into diabetes research and, more broadly, wearables, paving the way for future healthcare advancements, particularly in disadvantaged communities.
{"title":"Perspective on Harnessing Large Language Models to Uncover Insights in Diabetes Wearable Data","doi":"10.1101/2024.07.29.24310315","DOIUrl":"https://doi.org/10.1101/2024.07.29.24310315","PeriodicalId":501454,"journal":{"name":"medRxiv - Health Informatics","volume":"149 1"},"publicationDate":"2024-07-31","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141867171"}