Zainab A Balogun, Pronob K Barman, Bianka K Onwumbiko, Tera L Reynolds
Mobile personal health records (mPHRs) are smartphone apps that grant patients portable and continuous access to their medical records, thereby increasing the potential for patients to play an active role in managing their health. An extensive body of literature has focused on understanding user experiences with web-based tethered PHRs (i.e., patient portals) offered by healthcare organizations (HCOs). However, patients' opinions of smartphone-based PHRs have received less attention. To address this gap, we used a computationally guided qualitative analysis approach to analyze user reviews of six tethered and four interconnected mPHR apps available on both the Google Play and Apple app stores. This approach identified dimensions of user experience related to usability, usefulness, and the features that matter to users. Our findings reveal many similarities in user experiences between HCO-tethered and HCO-independent interconnected PHRs. However, there are some differences in user experiences between the types of PHRs and between the different devices and platforms.
Title: A Computationally-guided Qualitative Analysis to Understand User Experiences with Different Types of Mobile Personal Health Records. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 162-171. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099338/pdf/
Aisha Urooj, Theo Dapamede, Bhavika Patel, Chadi Ayoub, Reza Arsanjani, William Charles O'Neill, Hari Trivedi, Imon Banerjee
Screening mammography is a standard and cost-efficient imaging procedure for assessing breast cancer risk among women aged 45 and older. Quantifying breast arterial calcification (BAC) from screening mammograms is a non-invasive and cost-efficient approach to assessing the future risk of adverse cardiovascular events among women, such as heart attack and stroke. However, segmentation of BAC is an involved task that poses several technical challenges: BAC findings are extremely small, the ratio of breast artery area to breast area in mammogram images is low, and tissue features such as breast folds and heterogeneous density have a very similar imaging appearance. In this work, we aim to address the shortcomings of existing state-of-the-art (SOTA) methods, e.g., SCUNet, and analyze their comparative performance. Because simply resizing a mammogram would not preserve the microscopic BAC details, we adopted a patch-based segmentation methodology at the original resolution, which may hinder the model's understanding of the whole mammogram. We therefore propose a multi-task learning approach for patch-based BAC segmentation that adds an auxiliary task of patch position prediction, which forces the model to learn breast anatomy and to recognize locations where BAC will not occur, such as the breast boundary. The proposed method achieves state-of-the-art performance compared to the baselines. To demonstrate its utility, we also validate our method on external data, provide a survival analysis for adverse cardiac events based on differences in BAC score, and compare it with the coronary calcium score (CAC).
Title: A Multi-Task Learning Approach for Segmentation of Breast Arterial Calcifications in Screening Mammograms. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 1139-1148. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099439/pdf/
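The patch-based, multi-task setup described in the abstract by Urooj et al. can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the coarse grid used as the position label, and the weighted-sum loss are hypothetical. The sketch only shows how each original-resolution patch could be paired with a position label for an auxiliary task.

```python
from typing import List, Tuple

Image = List[List[float]]  # a 2-D image as nested lists (hypothetical stand-in for a mammogram)


def extract_patches(image: Image, patch: int, stride: int, grid: int = 3) -> List[Tuple[Image, int]]:
    """Cut an image into square patches and pair each with a coarse position label.

    The label is the index of the grid cell (row-major, grid x grid) that contains
    the patch centre. This labelling scheme is an assumption, not the paper's.
    """
    height, width = len(image), len(image[0])
    samples = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            pixels = [row[left:left + patch] for row in image[top:top + patch]]
            centre_y = top + patch / 2
            centre_x = left + patch / 2
            row_bin = min(int(centre_y * grid / height), grid - 1)
            col_bin = min(int(centre_x * grid / width), grid - 1)
            samples.append((pixels, row_bin * grid + col_bin))
    return samples


def multitask_loss(seg_loss: float, pos_loss: float, aux_weight: float = 0.1) -> float:
    """Combine the main segmentation loss with the auxiliary position loss.

    A weighted sum is one common choice; the weight is a hypothetical hyperparameter.
    """
    return seg_loss + aux_weight * pos_loss


# Usage: a 4x4 toy image split into four 2x2 patches on a 2x2 position grid.
toy = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
for pixels, label in extract_patches(toy, patch=2, stride=2, grid=2):
    print(label, pixels)
```

In a real pipeline the position label would feed a second prediction head that shares the encoder with the segmentation head, so that the shared features carry some information about where in the breast a patch was taken.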
Ting He, Kory Kreimeyer, Mimi Najjar, Jonathan Spiker, Maria Fatteh, Valsamo Anagnostou, Taxiarchis Botsis
The delivery of effective targeted therapies requires comprehensive analysis of tumor molecular profiles and their matching with clinical phenotypes in the context of existing knowledge described in the biomedical literature, registries, and knowledge bases. We evaluated the performance of natural language processing (NLP) approaches in supporting knowledge retrieval and synthesis from the biomedical literature. We tested PubTator 3.0, Bidirectional Encoder Representations from Transformers (BERT), and Large Language Models (LLMs) and evaluated their ability to support named entity recognition (NER) and relation extraction (RE) from biomedical texts. PubTator 3.0 and the BioBERT model performed best in the NER task (best F1-scores 0.93 and 0.89, respectively), while BioBERT outperformed all other solutions in the RE task (best F1-score 0.79) and in a specific use case, where it recognized nearly all entity mentions and most of the relations. Our findings support the use of AI-assisted approaches in facilitating precision oncology decision-making.
Title: Artificial Intelligence-assisted Biomedical Literature Knowledge Synthesis to Support Decision-making in Precision Oncology. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 513-522. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099343/pdf/
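The F1-scores quoted in the abstract by He et al. combine precision and recall. As a minimal sketch of how such a score is computed for NER, the function below compares gold and predicted entity mentions by exact match. The span representation (start, end, type) and the example annotations are hypothetical, and the paper may have used a different matching criterion.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity type); hypothetical format


def precision_recall_f1(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity mentions."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Usage: three of four predicted mentions match the four gold mentions exactly.
gold = [(0, 4, "gene"), (10, 15, "variant"), (20, 31, "drug"), (40, 46, "disease")]
pred = [(0, 4, "gene"), (10, 15, "variant"), (20, 31, "drug"), (50, 55, "gene")]
print(precision_recall_f1(gold, pred))
```

Because F1 is the harmonic mean of precision and recall, a system cannot reach a high score by maximizing only one of the two.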
Lourdes A Valdez, Edgar Javier Hernandez, O'Connor Matthews, Matthew Mulvey, Hillary Crandall, Karen Eilbeck
Electronic health records (EHRs) are information systems designed to collect and manage clinical data in order to support various clinical activities. They have emerged as valuable sources of data for outcomes research, offering vast repositories of patient information for analysis. Definitions for pediatric sepsis diagnosis are ambiguous, resulting in delayed diagnosis and treatment and highlighting the need for precise and efficient patient categorization techniques. Nevertheless, the use of EHRs in research poses challenges. Although EHRs were originally created to document patient encounters, the medical coding was designed to satisfy billing requirements. As a result, EHR data may lack granularity, potentially leading to misclassification and incomplete representation of patient conditions. We compared data-driven ICD code categories to chart review using probabilistic graphical models (PGMs), chosen for their ability to handle uncertainty and incorporate prior knowledge. Overall, this paper demonstrates the potential of using PGMs to address these challenges and improve the analysis of ICD codes for sepsis outcomes research.
Title: Probabilistic Graphical Models for Evaluating the Utility of Data-Driven ICD Code Categories in Pediatric Sepsis. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 1149-1158. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099341/pdf/
Inko Bovenzi, Adi Carmel, Michael Hu, Rebecca Hurwitz, Fiona McBride, Leo Benac, José Roberto Tello Ayala, Finale Doshi-Velez
Aiming to uncover insights into medical decision-making embedded within observational data from clinical settings, we present a novel application of Inverse Reinforcement Learning (IRL) that identifies suboptimal clinician actions based on the actions of their peers. This approach centers on two stages of IRL with an intermediate step that prunes trajectories displaying behavior that deviates significantly from the consensus. This enables us to effectively identify clinical priorities and values from ICU data containing both optimal and suboptimal clinician decisions. We observe that the benefits of removing suboptimal actions vary by disease and differentially impact certain demographic groups.
Title: Pruning the Path to Optimal Care: Identifying Systematically Suboptimal Medical Decision-Making with Inverse Reinforcement Learning. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 202-211. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099440/pdf/
Simone Marini, Alexander Barquero, Anisha Ashok Wadhwani, Jiang Bian, Jaime Ruiz, Christina Boucher, Mattia Prosperi
Portable genomic sequencers such as Oxford Nanopore's MinION enable real-time applications in clinical and environmental health. However, there is a bottleneck in the downstream analytics when bioinformatics pipelines are unavailable, e.g., when cloud processing is unreachable due to the absence of an Internet connection, or when only low-end computing devices can be carried on site. Here we present platform-friendly software for portable metagenomic analysis of Nanopore data, the Oligomer-based Classifier of Taxonomic Operational and Pan-genome Units via Singletons (OCTOPUS). OCTOPUS is written in Java and reimplements several features of the popular Kraken2 and KrakenUniq software, adding original components that improve metagenomics classification on incomplete/sampled reference databases, which makes it ideal for running on smartphones or tablets. OCTOPUS obtains sensitivity and precision comparable to Kraken2, while dramatically decreasing (4- to 16-fold) the false positive rate and yielding high correlation on real-world data. OCTOPUS is available along with customized databases at https://github.com/DataIntellSystLab/OCTOPUS and https://github.com/Ruiz-HCI-Lab/OctopusMobile.
Title: OCTOPUS: Disk-based, Multiplatform, Mobile-friendly Metagenomics Classifier. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 798-807. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099329/pdf/
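To make the idea of oligomer (k-mer) based read classification in the OCTOPUS abstract concrete, here is a minimal sketch. It is written in Python for brevity (OCTOPUS itself is in Java) and is not the algorithm used by OCTOPUS, Kraken2, or KrakenUniq. The voting rule, the restriction to k-mers found in exactly one reference, and all function names and toy sequences are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def kmers(sequence: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_index(references: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of reference taxa whose sequence contains it."""
    index: Dict[str, Set[str]] = {}
    for taxon, sequence in references.items():
        for kmer in set(kmers(sequence, k)):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read: str, index: Dict[str, Set[str]], k: int, min_hits: int = 2) -> Optional[str]:
    """Assign a read to the taxon with the most taxon-specific k-mer hits.

    Only k-mers that occur in exactly one reference cast a vote (a simplifying
    assumption). Reads with fewer than min_hits votes stay unclassified.
    """
    votes: Counter = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer)
        if taxa is not None and len(taxa) == 1:
            votes[next(iter(taxa))] += 1
    if not votes:
        return None
    taxon, hits = votes.most_common(1)[0]
    return taxon if hits >= min_hits else None


# Usage with two toy reference sequences (hypothetical data).
refs = {"taxon_A": "ACGTACGTTT", "taxon_B": "GGGCCCGGGAAA"}
idx = build_index(refs, k=4)
print(classify("ACGTACG", idx, k=4))
print(classify("GGGAAA", idx, k=4))
print(classify("TTTTTTT", idx, k=4))
```

Requiring a minimum number of hits before assigning a taxon is one simple way to trade sensitivity for fewer false positives, which is the kind of trade-off the abstract reports on.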
Large electronic health record (EHR) databases have been widely implemented and are available for research activities. The magnitude of such databases often requires storage and computing infrastructure that is distributed across different sites. Restrictions on data sharing due to privacy concerns have been another driving force behind the development of a large class of distributed and/or federated machine learning methods. Although the missing data problem is also present in distributed EHRs, where it is potentially more complex, distributed multiple imputation (MI) methods have not received as much attention. An important advantage of distributed MI, as well as distributed analysis, is that it allows researchers to borrow information across data sites, mitigating potential fairness issues for minority groups that do not have enough volume at certain sites. In this paper, we propose communication-efficient and privacy-preserving distributed MI algorithms for variables that are missing not at random.
Title: Federated Multiple Imputation for Variables that Are Missing Not At Random in Distributed Electronic Health Records. Authors: Yi Lian, Xiaoqian Jiang, Qi Long. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 703-712. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099382/pdf/
Hejie Cui, Zhuocheng Shen, Jieyu Zhang, Hui Shao, Lianhui Qin, Joyce C Ho, Carl Yang
Electronic health records (EHRs) contain valuable patient data for health-related prediction tasks, such as disease prediction. Traditional approaches rely on supervised learning methods that require large labeled datasets, which can be expensive and challenging to obtain. In this study, we investigate the feasibility of applying Large Language Models (LLMs) to convert structured patient visit data (e.g., diagnoses, labs, prescriptions) into natural language narratives. We evaluate the zero-shot and few-shot performance of LLMs using various EHR-prediction-oriented prompting strategies. Furthermore, we propose a novel approach that utilizes LLM agents with different roles: a predictor agent that makes predictions and generates reasoning processes and a critic agent that analyzes incorrect predictions and provides guidance for improving the reasoning of the predictor agent. Our results demonstrate that with the proposed approach, LLMs can achieve decent few-shot performance compared to traditional supervised learning methods in EHR-based disease predictions, suggesting its potential for health-oriented applications.
Title: LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 319-328. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099430/pdf/
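The predictor and critic roles described in the abstract by Cui et al. can be sketched as a simple loop. This is a hedged illustration only: the two agents are passed in as plain callables, and the toy stand-ins below are rule-based placeholders rather than LLM calls. The function names, signatures, and the way instructions are fed back to the predictor are assumptions, not the paper's implementation.

```python
from typing import Callable, List, Tuple

# (narrative, instructions so far) -> (prediction, reasoning)
Predictor = Callable[[str, List[str]], Tuple[str, str]]
# (narrative, prediction, reasoning, true label) -> instruction for the predictor
Critic = Callable[[str, str, str, str], str]


def collect_instructions(examples: List[Tuple[str, str]], predictor: Predictor, critic: Critic) -> List[str]:
    """Run the predictor on a few labeled examples and gather critic feedback.

    Whenever a prediction is wrong, the critic produces an instruction that is
    added to the list handed to the predictor on later examples.
    """
    instructions: List[str] = []
    for narrative, label in examples:
        prediction, reasoning = predictor(narrative, instructions)
        if prediction != label:
            instructions.append(critic(narrative, prediction, reasoning, label))
    return instructions


def toy_predictor(narrative: str, instructions: List[str]) -> Tuple[str, str]:
    """Placeholder for an LLM predictor: says 'yes' if any hint occurs in the text."""
    hit = any(hint in narrative for hint in instructions)
    return ("yes", "matched a hint") if hit else ("no", "no hint matched")


def toy_critic(narrative: str, prediction: str, reasoning: str, label: str) -> str:
    """Placeholder for an LLM critic: returns the first word of the narrative as a hint."""
    return narrative.split()[0]


# Usage with three hypothetical patient narratives and labels.
few_shot = [
    ("hba1c elevated on two visits", "yes"),
    ("hba1c elevated again", "yes"),
    ("normal labs", "no"),
]
print(collect_instructions(few_shot, toy_predictor, toy_critic))
```

Passing the agents in as callables keeps the loop independent of any particular model or API, so the same structure could wrap whichever LLM client is actually in use.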
Lei Gong, Jaren Bresnick, Aidong Zhang, Cathy Wu, Kishlay Jha
Social determinants of health (SDoH) significantly impact health outcomes and contribute to perpetuating health disparities across healthcare applications. However, automatic extraction of SDoH information from Electronic Health Records (EHRs) is challenging due to the unstructured nature of the clinical narratives that contain SDoH-related information. Recent advances in Large Language Models (LLMs) have shown great promise for automated SDoH extraction. However, their performance suffers on imbalanced SDoH categories because of data scarcity. To address this, we propose an innovative approach that augments LLMs with semantic knowledge obtained from the Unified Medical Language System (UMLS). This strategy enriches the feature representations of imbalanced SDoH classes, leading to accurate SDoH extraction. More specifically, our proposed data augmentation strategy generates semantically enriched clinical narratives at the LLM pre-finetuning stage. This approach enables the LLM to better adapt to the target data and provides a good initialization for the finetuning stage. In extensive experiments on the publicly available MIMIC-SDoH data, the proposed approach demonstrates significant improvement in SDoH extraction results, especially for the imbalanced classes.
Title: Boosting Social Determinants of Health Extraction with Semantic Knowledge Augmented Large Language Model. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 453-462. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099417/pdf/
Hongyi Wu, Christian J Tejeda, Joanne Roman Jones, Allison B McCoy, Pamela M Garabedian, Lipika Samal, Patricia C Dykes
The transition from hospital to home can be a vulnerable and challenging period for patients, especially those living with multiple chronic conditions (MCC), as evidenced by their disproportionately high rates of readmission [1]. Low health literacy, the complexity of a new medication schedule, and "post-hospital syndrome" can all contribute to suboptimal adherence to discharge instructions [2]. Timely and adequate support during transitional care has the potential to prevent adverse events and avoidable hospital readmissions. The use of mobile technology has been shown to improve health outcomes among those living with chronic illness by promoting self-management and adherence behavior [3]. However, current digital interventions focus on the long-term management of a single chronic illness, failing to target the pivotal transition from hospital to home and to address the complex care needs of those living with MCC. In this study, we describe the stakeholder requirement-gathering process used to inform the design of an EHR-integrated electronic tool to effectively address common care transition challenges for patients with MCC.
Title: Identifying Stakeholder Requirements for the Development of an Electronic Care Transitions Tool to Improve Health Outcomes for Patients with Multiple Chronic Conditions. Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, volume 2024, pages 1255-1264. Published: 2025-05-22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099407/pdf/