Linking Cancer Clinical Trials to their Result Publications.
Evan Pan, Kirk Roberts
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141816/pdf/

The results of clinical trials are a valuable source of evidence for researchers, policy makers, and healthcare professionals. However, online trial registries do not always contain links to the publications that report their results, instead requiring a time-consuming manual search. Here, we explored the application of pre-trained transformer-based language models to automatically identify result-reporting publications of cancer clinical trials by computing dense vectors and performing semantic search. Models were fine-tuned on text data from trial registry fields and article metadata using a contrastive learning approach. The best-performing model was PubMedBERT, which achieved a mean average precision of 0.592 and ranked 70.3% of a trial's publications in the top 5 results when evaluated on the held-out test trials. Our results suggest that semantic search using embeddings from transformer models may be an effective approach to linking trials to their publications.
Comparison of Prompt Engineering and Fine-Tuning Strategies in Large Language Models in the Classification of Clinical Notes.
Xiaodan Zhang, Nabasmita Talukdar, Sandeep Vemulapalli, Sumyeong Ahn, Jiankun Wang, Han Meng, Sardar Mehtab Bin Murtaza, Dmitry Leshchiner, Aakash Ajay Dave, Dimitri F Joseph, Martin Witteveen-Lane, Dave Chesla, Jiayu Zhou, Bin Chen
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141826/pdf/

Emerging large language models (LLMs) are being actively evaluated in various fields, including healthcare. Most studies have focused on established benchmarks and standard parameters; however, the variation and impact of prompt engineering and fine-tuning strategies have not been fully explored. This study benchmarks GPT-3.5 Turbo, GPT-4, and Llama-7B against BERT models and medical fellows' annotations in identifying patients with metastatic cancer from discharge summaries. Results revealed that clear, concise prompts incorporating reasoning steps significantly enhanced performance. GPT-4 exhibited superior performance among all models. Notably, one-shot learning and fine-tuning provided no incremental benefit. The model's accuracy was sustained even when keywords for metastatic cancer were removed or when half of the input tokens were randomly discarded. These findings underscore GPT-4's potential to substitute for specialized models such as PubMedBERT through strategic prompt engineering, and suggest opportunities to improve open-source models, which are better suited to use in clinical settings.
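The finding that "clear, concise prompts incorporating reasoning steps" help can be illustrated with a hypothetical prompt template in that spirit; the wording and steps below are illustrative, not the study's actual prompts.

```python
# Hypothetical prompt template: concise, with explicit reasoning steps,
# for classifying metastatic cancer from a discharge summary.
PROMPT_TEMPLATE = """You are reviewing a hospital discharge summary.
Task: decide whether the patient has metastatic cancer.
Steps:
1. Identify any mention of a primary malignancy.
2. Look for evidence of spread to distant organs or lymph nodes.
3. Answer with exactly one word: Yes or No.

Discharge summary:
{summary}
"""

def build_prompt(summary: str) -> str:
    """Fill the template with one note's text before sending it to the LLM."""
    return PROMPT_TEMPLATE.format(summary=summary)

print(build_prompt("Adenocarcinoma of the lung with hepatic metastases."))
```

A zero-shot prompt like this, sent per discharge summary, matches the study's observation that added one-shot examples and fine-tuning gave no further benefit.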
A Comparison of Google and ChatGPT for Automatic Generation of Health-related Multiple-choice Questions.
Vivien Song, David Kauchak, John Hamre, Nick Morgenstein, Gondy Leroy
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141817/pdf/

Producing accessible content requires an understanding of which text characteristics affect readability and comprehension. To answer this question, we are producing a large corpus of health-related texts with associated questions that study participants can read or listen to, allowing us to measure the difficulty of the underlying content; this corpus can later be used to better understand text difficulty and user comprehension. In this paper, we examine methods for automatically generating multiple-choice questions using Google's related questions and ChatGPT. Overall, we find that both approaches generate reasonable questions that are complementary: ChatGPT questions are more similar to the source snippet, while Google related-search questions have more lexical variation.
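The contrast drawn above (ChatGPT questions resemble the snippet; Google related-search questions vary lexically) can be quantified with a simple word-set overlap. The snippet and example questions below are hypothetical.

```python
def lexical_overlap(question: str, snippet: str) -> float:
    """Jaccard overlap between a question's word set and the source snippet."""
    q = set(question.lower().split())
    s = set(snippet.lower().split())
    return len(q & s) / len(q | s) if q | s else 0.0

# Hypothetical snippet and generated questions.
snippet = "regular exercise lowers blood pressure and improves heart health"
chatgpt_q = "does regular exercise improve heart health and blood pressure"
google_q = "what workouts are good for your heart"

# A ChatGPT-style question reuses the snippet's wording; a related-search
# question introduces new vocabulary (lower overlap).
print(lexical_overlap(chatgpt_q, snippet))  # higher
print(lexical_overlap(google_q, snippet))   # lower
```

A corpus-level comparison would average these overlaps across all snippet-question pairs for each generator.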
Cluster Analysis of Cortical Amyloid Burden for Identifying Imaging-driven Subtypes in Mild Cognitive Impairment.
Ruiming Wu, Bing He, Bojian Hou, Andrew J Saykin, Jingwen Yan, Li Shen
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141862/pdf/

Over the past decade, Alzheimer's disease (AD) has become an increasingly severe public health burden and has gained greater attention. Mild cognitive impairment (MCI) is an important prodromal stage of AD, highlighting the urgency of early diagnosis for timely treatment and control of the condition. Identifying subtypes of MCI patients is important for dissecting the heterogeneity of this complex disorder and facilitating more effective target discovery and therapeutic development. The conventional approach uses clinical measurements, such as cognitive scores and neuropsychological assessments, to stratify MCI patients into two groups, early MCI (EMCI) and late MCI (LMCI), reflecting their progressive stages. However, this clinical stratification is not designed to deconvolute the heterogeneity of the disorder. This study uses a data-driven approach to divide MCI patients into a novel grouping of two subtypes based on an amyloid dataset of 68 cortical features from positron emission tomography (PET), in which each subtype has a homogeneous cortical amyloid burden pattern. Experimental evaluation, including two-dimensional visualisation of the cluster distribution, Kaplan-Meier plots, genetic association studies, and biomarker distribution analysis, demonstrates that the identified subtypes perform better across all metrics than the conventional EMCI/LMCI grouping.
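The data-driven grouping described above is a clustering of patients by their cortical amyloid profiles. A minimal k-means sketch on toy two-feature data (a stand-in for the 68 cortical PET features; the paper does not specify its clustering algorithm, so k-means is an assumption) looks like this:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=2, iters=20, seed=0):
    """Minimal k-means; points are tuples of per-region amyloid burden values."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each patient to the nearest cluster center
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # recompute centers as the mean of each cluster's members
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Toy 2-feature stand-in for the 68 cortical features: two burden patterns.
low = [(0.1, 0.2), (0.15, 0.25), (0.2, 0.15)]
high = [(1.1, 1.0), (1.0, 1.2), (0.9, 1.1)]
labels = kmeans(low + high, k=2)
print(labels)  # low-burden and high-burden patients receive distinct labels
```

Each resulting label defines one amyloid-burden subtype, which can then be compared against the EMCI/LMCI split via survival and genetic analyses.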
Compulsory Indications in Hospital Prescribing Software Tested with Antibacterial Prescriptions.
Lorna Pairman, Paul Chin, Sharon J Gardiner, Matthew Doogue
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141823/pdf/

The aim of this study was to assess how making the indication field compulsory in our electronic prescribing system influenced free-text documentation, and to visualise prescriber behaviour. The indication field was made compulsory for seven antibacterial medicines. Text recorded in the indication field was manually classified as 'indication present', 'other text', 'rubbish text', or 'blank'. The proportion of prescriptions with an indication was compared for the four weeks before and after the intervention. Indication provision increased from 10.6% to 72.4% (p<0.01) post-intervention, 'other text' increased from 7.6% to 25.1% (p<0.01), and 'rubbish text' from 0.0% to 0.6% (p<0.01). Introducing the compulsory indication field therefore increased indication documentation substantially, with only a small increase in 'rubbish text'. An interactive report was developed using a live data extract to illustrate indication provision for all medicines prescribed at our tertiary hospital. The interactive report was validated and published locally to support audit and quality-improvement projects.
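The study classified indication-field text manually. A rule-based heuristic in the same spirit (the category names come from the abstract; the keyword list and rules are hypothetical) could look like:

```python
# Hypothetical keyword list; the study's classification was manual.
KNOWN_INDICATIONS = {"cellulitis", "pneumonia", "uti", "sepsis", "prophylaxis"}

def classify_indication(text: str) -> str:
    """Assign free text to one of the four categories used in the study."""
    cleaned = text.strip().lower()
    if not cleaned:
        return "blank"
    if any(term in cleaned for term in KNOWN_INDICATIONS):
        return "indication present"
    if any(ch.isalpha() for ch in cleaned):
        return "other text"
    return "rubbish text"

print(classify_indication("Cellulitis left leg"))  # indication present
print(classify_indication("as per team"))          # other text
print(classify_indication("..."))                  # rubbish text
print(classify_indication(""))                     # blank
```

Automating the classification this way would let the before/after proportions be recomputed continuously from the live data extract feeding the interactive report.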
VisualSphere: a Web-based Interactive Visualization System for Clinical Research Data.
Shiwei Lin, Shiqiang Tao, Wei-Chun Chou, Guo-Qiang Zhang, Xiaojin Li
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141841/pdf/

Clinical research data visualization is integral to making sense of biomedical research and healthcare data. The complexity and diversity of data, along with the need for solid programming skills, can hinder advances in clinical research data visualization. To overcome these challenges, we introduce VisualSphere, a web-based interactive visualization system that directly interfaces with clinical research data repositories, streamlining and simplifying the visualization workflow. VisualSphere is founded on three primary component modules: Connection, Configuration, and Visualization. An end-user can set up connections to the data repositories, create charts by selecting the desired tables and variables, and render visualization dashboards generated by Plotly and R/Shiny. We performed a preliminary evaluation of VisualSphere, which achieved high user satisfaction. VisualSphere has the potential to serve as a versatile tool for various clinical research data repositories, enabling researchers to explore and interact with clinical research data efficiently and effectively.
Assessing the Barriers and Facilitators to Pulmonary Rehabilitation Referrals Using the Consolidated Framework for Implementation Research (CFIR).
Aileen S Gabriel, Joseph Finkelstein
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141829/pdf/

Chronic obstructive pulmonary disease (COPD) is a global health issue causing significant illness and death. Pulmonary rehabilitation (PR) offers non-pharmacological treatment, including education, exercise, and psychological support, which have been shown to improve clinical outcomes. In both stable COPD and after an acute exacerbation, PR has been demonstrated to increase exercise capacity, decrease dyspnea, and enhance quality of life. Despite these benefits, referrals to PR for COPD treatment remain low. This study aims to evaluate healthcare providers' perceptions of referring COPD patients to PR. Semi-structured qualitative interviews were conducted with pulmonary specialists, hospitalists, and emergency department physicians. Domains and constructs from the Consolidated Framework for Implementation Research (CFIR) were applied to the qualitative data to organize and analyze it and to identify the barriers and facilitators to referring COPD patients. The findings from this study will help guide strategies to improve the referral process for PR.
Voice-Enabled Response Analysis Agent (VERAA): Leveraging Large Language Models to Map Voice Responses in SDoH Survey.
Rishivardhan Krishnamoorthy, Vishal Nagarajan, Hayden Pour, Supreeth P Shashikumar, Aaron Boussina, Emilia Farcas, Shamim Nemati, Christopher S Josef
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141834/pdf/

Social determinants of health (SDoH) have been shown to have profound impacts on health-related outcomes, yet these data suffer from high rates of missingness in electronic health records (EHRs). Moreover, limited English proficiency in the United States can be a barrier to communication with health care providers. In this study, we designed a multilingual conversational agent capable of conducting SDoH surveys for use in healthcare environments. The agent asks questions in the patient's native language, translates responses into English, and subsequently maps these responses via a large language model (LLM) to structured options in an SDoH survey. This tool can be extended to a variety of survey instruments in either hospital or home settings, enabling the extraction of structured insights from free-text answers. The proposed approach heralds a shift towards more inclusive and insightful data collection, marking a significant stride in SDoH data enrichment for optimizing health outcome predictions and interventions.
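The core mapping step (free-text answer to a structured survey option) is done by an LLM in the system described above; a minimal stand-in using fuzzy string matching sketches the same interface. The survey item and options below are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical structured options for one SDoH housing item; the paper uses
# an LLM for this mapping, approximated here with fuzzy string matching.
HOUSING_OPTIONS = [
    "I have stable housing",
    "I am worried about losing my housing",
    "I do not have housing",
]

def map_response(free_text: str, options=HOUSING_OPTIONS):
    """Map a translated free-text answer to the closest structured option."""
    match = get_close_matches(free_text, options, n=1, cutoff=0.3)
    return match[0] if match else None

print(map_response("I don't have housing right now"))
```

An LLM replaces the fuzzy matcher when answers are paraphrased rather than near-verbatim, but the input/output contract (free text in, one structured option or `None` out) stays the same.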
Comparison of Three Deep Learning Models in Accurate Classification of 770 Dermoscopy Skin Lesion Images.
Abdulmateen Adebiyi, Praveen Rao, Jesse Hirner, Anya Anokhin, Emily Hoffman Smith, Eduardo J Simoes, Mirna Becevic
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141796/pdf/

Accurately determining and classifying different types of skin cancers is critical for early diagnosis. In this work, we propose a novel use of deep learning for the classification of benign and malignant skin lesions using dermoscopy images. We obtained 770 de-identified dermoscopy images from University of Missouri (MU) Healthcare. We created three unique image datasets that contained the original images and images obtained after applying a hair-removal algorithm. We trained three popular deep learning models, namely ResNet50, DenseNet121, and Inception-V3. We evaluated the accuracy and the area under the receiver operating characteristic curve (AUC-ROC) for each model and dataset. DenseNet121 achieved the best accuracy (80.52%) and AUC-ROC score (0.81) on the third dataset. For this dataset, the sensitivity and specificity were 0.80 and 0.81, respectively. We also present SHAP (SHapley Additive exPlanations) values for the predictions made by the different models to aid their interpretability.
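The AUC-ROC reported above has a direct probabilistic reading that a short sketch makes explicit; the labels and model scores below are hypothetical.

```python
def auc_roc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation: the probability that
    a randomly chosen malignant case scores higher than a benign one."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auc_roc(labels, scores))  # 8/9 ≈ 0.889
```

Under this reading, DenseNet121's AUC-ROC of 0.81 means a malignant lesion outscores a benign one about 81% of the time.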
Counterfactual Sepsis Outcome Prediction Under Dynamic and Time-Varying Treatment Regimes.
Megan Su, Stephanie Hu, Hong Xiong, Elias Baedorf Kassis, Li-Wei H Lehman
AMIA Joint Summits on Translational Science Proceedings, 2024-05-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141800/pdf/

Sepsis is a life-threatening condition that occurs when the body's normal response to an infection is out of balance. A key part of managing sepsis involves the administration of intravenous fluids and vasopressors. In this work, we explore the application of G-Net, a deep sequential modeling framework for g-computation, to predict outcomes under counterfactual fluid treatment strategies in a real-world cohort of sepsis patients. Using observational data collected from the intensive care unit (ICU), we evaluate the performance of multiple deep learning implementations of G-Net and compare their predictive performance with linear models in forecasting patient outcomes and trajectories over time under the observational treatment regime. We then demonstrate that G-Net can generate counterfactual predictions of covariate trajectories that align with clinical expectations across various fluid-limiting regimes. Our study demonstrates the potential clinical utility of G-Net in predicting counterfactual treatment outcomes, aiding clinicians in informed decision-making for sepsis patients in the ICU.
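The g-computation idea behind G-Net is to roll a learned transition model forward under a hypothetical treatment strategy. A toy sketch with a hand-specified transition model (G-Net instead learns it from ICU data with sequential deep models; all dynamics and strategies below are invented for illustration):

```python
import random

def step(state, fluid_given, rng):
    """Hand-specified transition: fluids raise a blood-pressure-like covariate."""
    return state + (2.0 if fluid_given else -1.0) + rng.gauss(0, 0.5)

def rollout(strategy, horizon=24, init=65.0, seed=1):
    """Monte Carlo rollout of one covariate trajectory under a strategy."""
    rng = random.Random(seed)
    state, total_fluids = init, 0
    traj = [state]
    for _ in range(horizon):
        give = strategy(state, total_fluids)  # treatment decision at this step
        total_fluids += give
        state = step(state, give, rng)
        traj.append(state)
    return traj

def capped(limit):
    """Fluid-limiting regime: bolus when below target, up to a total cap."""
    return lambda state, total: state < 65 and total < limit

unrestricted = rollout(capped(limit=24))
restricted = rollout(capped(limit=4))
print(unrestricted[-1], restricted[-1])  # limiting fluids lowers the trajectory
```

Comparing trajectories across the capped regimes mirrors the paper's check that counterfactual predictions under fluid-limiting strategies align with clinical expectations.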