Amanda M Lam, Mariana C Singletary, Theresa Cullen
Objective: This communication presents the results of defining a tribal health jurisdiction by a combination of tribal affiliation (TA) and case address.
Materials and methods: Through a county-tribal partnership, Geographic Information System (GIS) software and custom code were used to extract tribal data from county data by identifying reservation addresses in county extracts of COVID-19 case records from December 30, 2019, to December 31, 2022 (n = 374 653) and COVID-19 vaccination records from December 1, 2020, to April 18, 2023 (n = 2 355 058).
Results: The tool identified 1.91 times as many case records and 3.76 times as many vaccination records as filtering by TA alone.
Discussion and conclusion: This method of identifying communities by patient address, in combination with TA and enrollment, can help tribal health jurisdictions attain equitable access to public health data when carried out through a partnership governed by a data-sharing agreement. This methodology has potential applications for other populations underrepresented in public health and clinical research.
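The core extraction step, testing whether a geocoded case address falls inside a jurisdiction boundary, can be illustrated with a minimal point-in-polygon check. This is a hand-rolled sketch only: the study used GIS software and custom code, and the boundary vertices, case IDs, and coordinates below are invented.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: does (x, y) fall inside the polygon (list of vertices)?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edge crossings of a horizontal ray extending left from the point
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical reservation boundary (lon, lat vertices) and geocoded case addresses
boundary = [(-111.0, 32.0), (-110.5, 32.0), (-110.5, 32.5), (-111.0, 32.5)]
cases = {"case_1": (-110.7, 32.2), "case_2": (-110.2, 32.2)}
tribal_cases = [cid for cid, pt in cases.items() if point_in_polygon(pt, boundary)]
```

In production, a GIS library handles geocoding quality, projections, and multi-polygon reservation boundaries; the logic above only shows why address matching captures records that tribal-affiliation fields miss.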
"A GIS software-based method to identify public health data belonging to address-defined communities." Journal of the American Medical Informatics Association, pp. 2716-2721, November 2024. doi:10.1093/jamia/ocae235.
Objective: Unplanned readmissions following a hospitalization remain common despite significant efforts to curtail them. Wearable devices may help identify patients at high risk for an unplanned readmission.
Materials and methods: We conducted a multi-center retrospective cohort study using data from the All of Us data repository. We included subjects with wearable data and developed a baseline Feedforward Neural Network (FNN) model and a Long Short-Term Memory (LSTM) time-series deep learning model to predict daily, unplanned rehospitalizations up to 90 days from discharge. In addition to demographic and laboratory data from subjects, post-discharge data input features include wearable data and multiscale entropy features based on intraday wearable time series. The most significant features in the LSTM model were determined by permutation feature importance testing.
Results: In total, 612 patients met inclusion criteria. The complete LSTM model had a higher area under the receiver operating characteristic curve than the FNN model (0.83 vs 0.795). The 5 most important input features included variables from multiscale entropy (steps) and the number of active steps per day.
Discussion: Data available from wearable devices can improve ability to predict readmissions. Prior work has focused on predictors available up to discharge or on additional data abstracted from wearable devices. Our results from 35 institutions highlight how multiscale entropy can improve readmission prediction and may impact future work in this domain.
Conclusion: Wearable data and multiscale entropy can improve a deep-learning model's prediction of unplanned 90-day readmissions. Prospective studies are needed to validate these findings.
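Multiscale entropy is conventionally computed by coarse-graining the intraday time series at successive scales and taking the sample entropy of each coarse-grained series. The following is a simplified, illustrative implementation of that standard procedure (not the authors' code); `m` and the `0.2 × SD` tolerance are the conventional defaults.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn: -ln(A/B), where B counts pairs of matching m-length templates
    and A counts pairs of matching (m+1)-length templates (Chebyshev distance <= r)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()

    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm + 1)])
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if np.max(np.abs(templates[i] - templates[j])) <= r:
                    count += 1
        return count

    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else float("inf")

def multiscale_entropy(x, scales=(1, 2, 3), m=2):
    """Coarse-grain by non-overlapping means at each scale, then take SampEn."""
    out = []
    for s in scales:
        n = len(x) // s
        coarse = np.asarray(x[: n * s], dtype=float).reshape(n, s).mean(axis=1)
        out.append(sample_entropy(coarse, m=m))
    return out
```

Applied to minute-level step counts, lower entropy at a given scale indicates more regular activity patterns; the study fed such features, alongside raw wearable and laboratory data, into its LSTM.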
Vishal Nagarajan, Supreeth Prajwal Shashikumar, Atul Malhotra, Shamim Nemati, Gabriel Wardi. "Impact of wearable device data and multi-scale entropy analysis on improving hospital readmission prediction." Journal of the American Medical Informatics Association, pp. 2679-2688, November 2024. doi:10.1093/jamia/ocae242.
Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran
Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and to dynamically generate queries for heterogeneous data models.
Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.
Results and discussion: 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.
Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.
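The translation step amounts to mapping the same concept-based cohort query onto model-specific SQL. A minimal sketch of that idea follows; the table and column names are illustrative stand-ins, not the actual ENACT/Leaf mappings or the authors' script.

```python
# Hypothetical SQL templates for two data models: the i2b2 star schema and an
# OMOP-style schema. Real deployments also translate concept code systems.
TEMPLATES = {
    "i2b2": ("SELECT COUNT(DISTINCT patient_num) FROM observation_fact "
             "WHERE concept_cd IN ({codes})"),
    "omop": ("SELECT COUNT(DISTINCT person_id) FROM condition_occurrence "
             "WHERE condition_source_value IN ({codes})"),
}

def translate(concept_codes, model):
    """Render one concept-list cohort query as SQL for the chosen data model."""
    codes = ", ".join(f"'{c}'" for c in concept_codes)
    return TEMPLATES[model].format(codes=codes)

sql = translate(["ICD10:E11.9", "ICD10:I10"], "omop")
```

Evaluating such a translator the way the study did means running both renderings against the same source population and comparing the returned patient counts.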
"Towards cross-application model-agnostic federated cohort discovery." Journal of the American Medical Informatics Association, pp. 2202-2209, October 2024. doi:10.1093/jamia/ocae211.
Anant Vasudevan, Savanna Plombon, Nicholas Piniella, Alison Garber, Maria Malik, Erin O'Fallon, Abhishek Goyal, Esteban Gershanik, Vivek Kumar, Julie Fiskio, Cathy Yoon, Stuart R Lipsitz, Jeffrey L Schnipper, Anuj K Dalal
Objectives: Post-discharge adverse events (AEs) are common and heralded by new and worsening symptoms (NWS). We evaluated the effect of electronic health record (EHR)-integrated digital tools designed to promote quality and safety in hospitalized patients on NWS and AEs after discharge.
Materials and methods: Adult general medicine patients at a community hospital were enrolled. We implemented a dashboard that clinicians used to assess safety risks during interdisciplinary rounds. Post-implementation, patients were randomized to complete a discharge checklist whose responses were incorporated into the dashboard. Outcomes were assessed using EHR review and 30-day call data adjudicated by 2 clinicians and analyzed using Poisson regression. We conducted comparisons of each exposure on post-discharge outcomes and used selected variables and NWS as independent predictors to model post-discharge AEs using multivariable logistic regression.
Results: A total of 260 patients (122 pre, 71 post [dashboard], 67 post [dashboard plus discharge checklist]) enrolled. The adjusted incidence rate ratios (aIRR) for NWS and AEs were unchanged in the post- compared to pre-implementation period. For patient-reported NWS, aIRR was non-significantly higher for dashboard plus discharge checklist compared to dashboard participants (1.23 [0.97,1.56], P = .08). For post-implementation patients with an AE, aIRR for duration of injury (>1 week) was significantly lower for dashboard plus discharge checklist compared to dashboard participants (0 [0,0.53], P < .01). In multivariable models, certain patient-reported NWS were associated with AEs (3.76 [1.89,7.82], P < .01).
Discussion: While significant reductions in post-discharge AEs were not observed, checklist participants experiencing a post-discharge AE were more likely to report NWS and had a shorter duration of injury.
Conclusion: Interventions designed to prompt patients to report NWS may facilitate earlier detection of AEs after discharge.
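The aIRRs above come from Poisson regression with covariate adjustment. For intuition, a crude (unadjusted) incidence rate ratio with a Wald confidence interval can be computed directly from counts; the event counts below are hypothetical, not the study's data.

```python
import math

def irr_with_ci(events_a, time_a, events_b, time_b, z=1.96):
    """Crude incidence rate ratio with a 95% Wald CI on the log scale.
    The study reports *adjusted* IRRs from Poisson regression; this
    sketch shows only the unadjusted analogue."""
    irr = (events_a / time_a) / (events_b / time_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)  # SE of ln(IRR)
    lo = math.exp(math.log(irr) - z * se_log)
    hi = math.exp(math.log(irr) + z * se_log)
    return irr, lo, hi

# Hypothetical: 30 NWS reports among 67 checklist patients vs 25 among 71 controls
irr, lo, hi = irr_with_ci(30, 67, 25, 71)
```

A CI that spans 1.0 corresponds to the non-significant differences the study observed for NWS in the post- vs pre-implementation comparison.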
"Effect of digital tools to promote hospital quality and safety on adverse events after discharge." Journal of the American Medical Informatics Association, pp. 2304-2314, October 2024. doi:10.1093/jamia/ocae176. ClinicalTrials.gov: NCT05232656.
Importance: Reinforcement learning (RL) represents a pivotal avenue within natural language processing (NLP), offering a potent mechanism for acquiring optimal strategies in task completion. This literature review studies various NLP applications where RL has demonstrated efficacy, with notable applications in healthcare settings.
Objectives: To systematically explore the applications of RL in NLP, focusing on its effectiveness in acquiring optimal strategies, particularly in healthcare settings, and provide a comprehensive understanding of RL's potential in NLP tasks.
Materials and methods: Adhering to the PRISMA guidelines, an exhaustive literature review was conducted to identify instances where RL has exhibited success in NLP applications, encompassing dialogue systems, machine translation, question-answering, text summarization, and information extraction. Our methodological approach involves closely examining the technical aspects of RL methodologies employed in these applications, analyzing algorithms, states, rewards, actions, datasets, and encoder-decoder architectures.
Results: The review of 93 papers yields insights into RL algorithms, prevalent techniques, emergent trends, and the fusion of RL methods in NLP healthcare applications. It clarifies the strategic approaches employed, datasets utilized, and the dynamic terrain of RL-NLP systems, thereby offering a roadmap for research and development in RL and machine learning techniques in healthcare. The review also addresses ethical concerns to ensure equity, transparency, and accountability in the evolution and application of RL-based NLP technologies, particularly within sensitive domains such as healthcare.
Discussion: The findings underscore the promising role of RL in advancing NLP applications, particularly in healthcare, where its potential to optimize decision-making and enhance patient outcomes is significant. However, the ethical challenges and technical complexities associated with RL demand careful consideration and ongoing research to ensure responsible and effective implementation.
Conclusions: By systematically exploring RL's applications in NLP and providing insights into technical analysis, ethical implications, and potential advancements, this review contributes to a deeper understanding of RL's role in language processing.
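The review's framing of RL systems in terms of states, actions, and rewards can be made concrete with a toy tabular Q-learning loop. This is a generic illustration, not an RL-NLP system: the applications surveyed use text-derived states, token- or sequence-level actions, and learned or task-based rewards rather than a 4-state chain.

```python
import random

# Toy environment: a 4-state chain where taking "right" in the last state
# yields reward 1. Q-learning is off-policy, so we can learn from purely
# random behavior and then read off the greedy policy.
random.seed(0)
n_states, actions = 4, ["left", "right"]
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma = 0.5, 0.9  # learning rate, discount factor

for _ in range(500):                        # episodes
    s = 0
    for _ in range(10):                     # steps per episode
        a = random.choice(actions)          # random exploration policy
        s2 = min(s + 1, n_states - 1) if a == "right" else max(s - 1, 0)
        r = 1.0 if (s == n_states - 1 and a == "right") else 0.0
        # Temporal-difference update toward r + gamma * max_a' Q(s', a')
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states)}
```

Every RL-NLP system the review analyzes instantiates the same ingredients, states, actions, rewards, and an update rule, just over far richer spaces (e.g., a dialogue history as the state and a generated utterance as the action).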
Ying Liu, Haozhu Wang, Huixue Zhou, Mingchen Li, Yu Hou, Sicheng Zhou, Fang Wang, Rama Hoetzlein, Rui Zhang. "A review of reinforcement learning for natural language processing and applications in healthcare." Journal of the American Medical Informatics Association, pp. 2379-2393, October 2024. doi:10.1093/jamia/ocae215.
Lidan Liu, Lu Liu, Hatem A Wafa, Florence Tydeman, Wanqing Xie, Yanzhong Wang
Objective: This study aims to conduct a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) using speech samples in depression.
Materials and methods: This review included studies reporting diagnostic results of DL algorithms in depression using speech data, published from inception to January 31, 2024, in the PubMed, Medline, Embase, PsycINFO, Scopus, IEEE, and Web of Science databases. Pooled accuracy, sensitivity, and specificity were obtained by random-effect models. The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) was used to assess the risk of bias.
Results: A total of 25 studies met the inclusion criteria and 8 of them were used in the meta-analysis. The pooled estimates of accuracy, specificity, and sensitivity for depression detection models were 0.87 (95% CI, 0.81-0.93), 0.85 (95% CI, 0.78-0.91), and 0.82 (95% CI, 0.71-0.94), respectively. When stratified by model structure, the highest pooled diagnostic accuracy was 0.89 (95% CI, 0.81-0.97) in the handcrafted group.
Discussion: To our knowledge, our study is the first meta-analysis of the diagnostic performance of DL for depression detection from speech samples. All studies included in the meta-analysis used convolutional neural network (CNN) models, making it difficult to assess the performance of other DL algorithms. The handcrafted model performed better than the end-to-end model in speech depression detection.
Conclusions: The application of DL in speech provided a useful tool for depression detection. CNN models with handcrafted acoustic features could help to improve the diagnostic performance.
Protocol registration: The study protocol was registered on PROSPERO (CRD42023423603).
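Pooled estimates like the 0.87 accuracy above are typically produced with DerSimonian-Laird random-effects pooling. The sketch below is illustrative, not the authors' code; the inputs would be study-level estimates and their variances (often on a transformed scale such as the logit).

```python
import math

def dersimonian_laird(estimates, variances, z=1.96):
    """Random-effects pooling (DerSimonian-Laird).
    Returns the pooled estimate and a 95% CI on the same scale as the inputs."""
    w = [1 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    Q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    C = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (Q - df) / C)                        # between-study variance
    w_re = [1 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - z * se, pooled + z * se
```

When between-study heterogeneity is zero (Q ≤ df), the method reduces to an inverse-variance fixed-effect pool; otherwise τ² widens the interval, which is why random-effects CIs like those reported here tend to be conservative.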
"Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis." Journal of the American Medical Informatics Association, pp. 2394-2404, October 2024. doi:10.1093/jamia/ocae189.
Matúš Falis, Aryo Pradipta Gema, Hang Dong, Luke Daines, Siddharth Basetti, Michael Holder, Rose S Penfold, Alexandra Birch, Beatrice Alex
Objectives: The aim of this study was to investigate GPT-3.5 in generating and coding medical documents with International Classification of Diseases (ICD)-10 codes for data augmentation on low-resource labels.
Materials and methods: Employing GPT-3.5, we generated and coded 9606 discharge summaries based on lists of ICD-10 code descriptions of patients with infrequent (or generation) codes within the MIMIC-IV dataset. Combined with the baseline training set, this formed an augmented training set. Neural coding models were trained on baseline and augmented data and evaluated on a MIMIC-IV test set. We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. Weak Hierarchical Confusion Matrices determined within-family and outside-of-family coding errors in the latter codesets. The coding performance of GPT-3.5 was evaluated on prompt-guided self-generated data and real MIMIC-IV data. Clinicians evaluated the clinical acceptability of the generated documents.
Results: Data augmentation resulted in slightly lower overall model performance but improved performance for the generation candidate codes and their families, including 1 absent from the baseline training data. Augmented models displayed lower out-of-family error rates. GPT-3.5 identified ICD-10 codes by their prompted descriptions but underperformed on real data. Evaluators highlighted the correctness of the generated concepts but noted shortcomings in variety, supporting information, and narrative.
Discussion and conclusion: While GPT-3.5 alone, given our prompt setting, is unsuitable for ICD-10 coding, it supports data augmentation for training neural models. Augmentation positively affects generation code families but mainly benefits codes with existing examples. Augmentation reduces out-of-family errors. Documents generated by GPT-3.5 state prompted concepts correctly but lack variety and authenticity in their narratives.
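The reported micro- and macro-F1 differ in how they weight rare codes: micro-F1 pools true/false positives across all labels, while macro-F1 averages per-label F1, so low-resource codes dominate it, which is why augmentation helps macro metrics on generation codes even as overall micro performance dips slightly. A self-contained sketch of both metrics (the code sets below are invented examples):

```python
def f1_scores(y_true, y_pred, labels):
    """Micro- and macro-averaged F1 for multi-label code assignment.
    y_true/y_pred are lists of code sets, one per document."""
    tp_sum = fp_sum = fn_sum = 0
    per_label_f1 = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if lab in t and lab in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if lab not in t and lab in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if lab in t and lab not in p)
        denom = 2 * tp + fp + fn
        per_label_f1.append(2 * tp / denom if denom else 0.0)
        tp_sum += tp
        fp_sum += fp
        fn_sum += fn
    micro = 2 * tp_sum / (2 * tp_sum + fp_sum + fn_sum)  # pooled counts
    macro = sum(per_label_f1) / len(per_label_f1)        # unweighted label mean
    return micro, macro

# Invented gold/predicted ICD-10 code sets for 2 documents
gold = [{"E11.9", "I10"}, {"J45.909"}]
pred = [{"E11.9"}, {"J45.909", "I10"}]
micro, macro = f1_scores(gold, pred, ["E11.9", "I10", "J45.909"])
```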
"Can GPT-3.5 generate and code discharge summaries?" Journal of the American Medical Informatics Association, September 2024. doi:10.1093/jamia/ocae132.
Magdalena Z Raban, Erin Fitzpatrick, Alison Merchant, Bayzidur Rahman, Tim Badgery-Parker, Ling Li, Melissa T Baysari, Peter Barclay, Michael Dickinson, Virginia Mumford, Johanna I Westbrook
Objectives To examine changes in technology-related errors (TREs), their manifestations and underlying mechanisms at 3 time points after the implementation of computerized provider order entry (CPOE) in an electronic health record; and evaluate the clinical decision support (CDS) available to mitigate the TREs at 5-years post-CPOE. Materials and Methods Prescribing errors (n = 1315) of moderate, major, or serious potential harm identified through review of 35 322 orders at 3 time points (immediately, 1-year, and 4-years post-CPOE) were assessed to identify TREs at a tertiary pediatric hospital. TREs were coded using the Technology-Related Error Mechanism classification. TRE rates, percentage of prescribing errors that were TREs, and mechanism rates were compared over time. Each TRE was tested in the CPOE 5-years post-implementation to assess the availability of CDS to mitigate the error. Results TREs accounted for 32.5% (n = 428) of prescribing errors; an adjusted rate of 1.49 TREs/100 orders (95% confidence interval [CI]: 1.06, 1.92). At 1-year post-CPOE, the rate of TREs was 40% lower than immediately post-implementation (incidence rate ratio [IRR]: 0.60; 95% CI: 0.41, 0.89). However, at 4-years post-CPOE, the TRE rate was not significantly different from baseline (IRR: 0.80; 95% CI: 0.59, 1.08). “New workflows required by the CPOE” was the most frequent TRE mechanism at all time points. CDS was available to mitigate 32.7% of TREs. Discussion In a pediatric setting, TREs persisted 4-years post-CPOE with no difference in the rate compared to immediately post-CPOE. Conclusion Greater attention is required to address TREs to enhance the safety benefits of systems.
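The incidence rate ratio comparing TRE rates between time points follows standard rate arithmetic: the ratio of two rates, with a Wald confidence interval computed on the log scale. The study's published estimates come from adjusted regression models; the sketch below shows only the unadjusted calculation, with illustrative counts rather than the study's data:

```python
import math

def irr_with_ci(events1, orders1, events0, orders0, z=1.96):
    """Unadjusted incidence rate ratio of events per order, with a Wald
    95% CI on the log scale: exp(log(IRR) +/- z * sqrt(1/a + 1/b)).

    All counts here are illustrative, not the study's data.
    """
    rate1 = events1 / orders1            # e.g. TREs per order at follow-up
    rate0 = events0 / orders0            # TREs per order at baseline
    irr = rate1 / rate0
    se = math.sqrt(1 / events1 + 1 / events0)   # SE of log(IRR)
    lo = math.exp(math.log(irr) - z * se)
    hi = math.exp(math.log(irr) + z * se)
    return irr, (lo, hi)

# Toy example: 60 TREs in 10 000 orders vs 100 TREs in 10 000 orders,
# mirroring the "40% lower at 1-year" pattern (IRR = 0.60).
irr, (lo, hi) = irr_with_ci(events1=60, orders1=10_000,
                            events0=100, orders0=10_000)
```

If the resulting CI crosses 1.0, the two rates are not significantly different, which is how the 4-year comparison (IRR 0.80; CI 0.59, 1.08) is read.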
{"title":"Longitudinal study of the manifestations and mechanisms of technology-related prescribing errors in pediatrics","authors":"Magdalena Z Raban, Erin Fitzpatrick, Alison Merchant, Bayzidur Rahman, Tim Badgery-Parker, Ling Li, Melissa T Baysari, Peter Barclay, Michael Dickinson, Virginia Mumford, Johanna I Westbrook","doi":"10.1093/jamia/ocae218","DOIUrl":"https://doi.org/10.1093/jamia/ocae218","url":null,"abstract":"Objectives To examine changes in technology-related errors (TREs), their manifestations and underlying mechanisms at 3 time points after the implementation of computerized provider order entry (CPOE) in an electronic health record; and evaluate the clinical decision support (CDS) available to mitigate the TREs at 5-years post-CPOE. Materials and Methods Prescribing errors (n = 1315) of moderate, major, or serious potential harm identified through review of 35 322 orders at 3 time points (immediately, 1-year, and 4-years post-CPOE) were assessed to identify TREs at a tertiary pediatric hospital. TREs were coded using the Technology-Related Error Mechanism classification. TRE rates, percentage of prescribing errors that were TREs, and mechanism rates were compared over time. Each TRE was tested in the CPOE 5-years post-implementation to assess the availability of CDS to mitigate the error. Results TREs accounted for 32.5% (n = 428) of prescribing errors; an adjusted rate of 1.49 TREs/100 orders (95% confidence interval [CI]: 1.06, 1.92). At 1-year post-CPOE, the rate of TREs was 40% lower than immediately post (incident rate ratio [IRR]: 0.60; 95% CI: 0.41, 0.89). However, at 4-years post, the TRE rate was not significantly different to baseline (IRR: 0.80; 95% CI: 0.59, 1.08). “New workflows required by the CPOE” was the most frequent TRE mechanism at all time points. CDS was available to mitigate 32.7% of TREs. Discussion In a pediatric setting, TREs persisted 4-years post-CPOE with no difference in the rate compared to immediately post-CPOE. 
Conclusion Greater attention is required to address TREs to enhance the safety benefits of systems.","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":"6 1","pages":""},"PeriodicalIF":6.4,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hwayoung Cho, Oliver T Nguyen, Michael Weaver, Jennifer Pruitt, Cassie Marcelle, Ramzi G Salloum, Gail Keenan
Objectives Examine electronic health record (EHR) use and factors contributing to documentation burden in acute and critical care nurses. Materials and Methods A mixed-methods design was used, guided by the Unified Theory of Acceptance and Use of Technology. Key EHR components included Flowsheets, Medication Administration Records (MAR), Care Plan, Notes, and Navigators. We first identified 5 units with the highest documentation burden in 1 university hospital through EHR log file analyses. Four nurses per unit were recruited and engaged in interviews and surveys designed to examine their perceptions of ease of use and usefulness of the 5 EHR components. A combination of inductive/deductive coding was used for qualitative data analysis. Results Nurses acknowledged the importance of documentation for patient care, yet perceived the required documentation as burdensome, with levels varying across the 5 components. Factors contributing to burden included non-EHR issues (patient-to-nurse staffing ratios; patient acuity; suboptimal time management) and EHR usability issues related to design/features. Flowsheets, Care Plan, and Navigators were found to be below acceptable usability and contributed to more burden compared to MAR and Notes. The most troublesome EHR usability issues were data redundancy, poor workflow navigation, and cumbersome data entry, varying by unit type. Discussion Overall, we used quantitative and qualitative data to highlight challenges with current nursing documentation features in the EHR that contribute to documentation burden. Differences in perceived usability across the EHR documentation components were driven by multiple factors, such as non-alignment with workflows and amount of duplication of prior data entries. 
Nurses offered several recommendations for improving the EHR, including minimizing redundant or excessive data entry requirements, providing visual cues (eg, clear error messages, highlighting areas with missing or incorrect information), and integrating decision support. Conclusion Our study generated evidence for nurse EHR use and specific documentation usability issues contributing to burden. Findings can inform the development of solutions for enhancing multi-component EHR usability that accommodate the unique workflow of nurses. Documentation strategies designed to improve nurse working conditions should include non-EHR factors as they also contribute to documentation burden.
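The first analytic step, identifying the highest-burden units from EHR log files, amounts to aggregating documentation events per unit and normalizing by the number of nurses. A minimal sketch under assumed field names ("unit", "nurse_id", "action"); real EHR audit-log schemas vary by vendor, and the study's actual log metrics are not specified here:

```python
from collections import defaultdict

def rank_units_by_burden(log_events):
    """Rank units by documentation actions per unique nurse.

    log_events: iterable of dicts with assumed keys
    "unit", "nurse_id", and "action".
    """
    actions = defaultdict(int)
    nurses = defaultdict(set)
    for e in log_events:
        if e["action"] == "document":      # count only documentation events
            actions[e["unit"]] += 1
        nurses[e["unit"]].add(e["nurse_id"])
    burden = {u: actions[u] / len(nurses[u]) for u in nurses}
    return sorted(burden, key=burden.get, reverse=True)

# Toy log: ICU has 2 documentation events by 1 nurse; Med has 1 by 2 nurses.
events = [
    {"unit": "ICU", "nurse_id": "n1", "action": "document"},
    {"unit": "ICU", "nurse_id": "n1", "action": "document"},
    {"unit": "Med", "nurse_id": "n2", "action": "document"},
    {"unit": "Med", "nurse_id": "n3", "action": "view"},
]
top = rank_units_by_burden(events)
```

The top-ranked units from such an analysis would then be the ones sampled for interviews and surveys, as in the study's recruitment of 4 nurses per unit.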
{"title":"Electronic health record system use and documentation burden of acute and critical care nurse clinicians: a mixed-methods study","authors":"Hwayoung Cho, Oliver T Nguyen, Michael Weaver, Jennifer Pruitt, Cassie Marcelle, Ramzi G Salloum, Gail Keenan","doi":"10.1093/jamia/ocae239","DOIUrl":"https://doi.org/10.1093/jamia/ocae239","url":null,"abstract":"Objectives Examine electronic health record (EHR) use and factors contributing to documentation burden in acute and critical care nurses. Materials and Methods A mixed-methods design was used guided by Unified Theory of Acceptance and Use of Technology. Key EHR components included, Flowsheets, Medication Administration Records (MAR), Care Plan, Notes, and Navigators. We first identified 5 units with the highest documentation burden in 1 university hospital through EHR log file analyses. Four nurses per unit were recruited and engaged in interviews and surveys designed to examine their perceptions of ease of use and usefulness of the 5 EHR components. A combination of inductive/deductive coding was used for qualitative data analysis. Results Nurses acknowledged the importance of documentation for patient care, yet perceived the required documentation as burdensome with levels varying across the 5 components. Factors contributing to burden included non-EHR issues (patient-to-nurse staffing ratios; patient acuity; suboptimal time management) and EHR usability issues related to design/features. Flowsheets, Care Plan, and Navigators were found to be below acceptable usability and contributed to more burden compared to MAR and Notes. The most troublesome EHR usability issues were data redundancy, poor workflow navigation, and cumbersome data entry based on unit type. Discussion Overall, we used quantitative and qualitative data to highlight challenges with current nursing documentation features in the EHR that contribute to documentation burden. 
Differences in perceived usability across the EHR documentation components were driven by multiple factors, such as non-alignment with workflows and amount of duplication of prior data entries. Nurses offered several recommendations for improving the EHR, including minimizing redundant or excessive data entry requirements, providing visual cues (eg, clear error messages, highlighting areas where missing or incorrect information are), and integrating decision support. Conclusion Our study generated evidence for nurse EHR use and specific documentation usability issues contributing to burden. Findings can inform the development of solutions for enhancing multi-component EHR usability that accommodates the unique workflow of nurses. Documentation strategies designed to improve nurse working conditions should include non-EHR factors as they also contribute to documentation burden.","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":"64 1","pages":""},"PeriodicalIF":6.4,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Objectives Clinical Data Warehouses (CDWs) are the designated infrastructures to enable access and analysis of large quantities of electronic health record data. Building and managing such systems implies extensive "data work" and coordination between multiple stakeholders. Our study focuses on the challenges these stakeholders face when designing, operating, and ensuring the durability of CDWs for research. Materials and Methods We conducted semistructured interviews with 21 professionals working with CDWs from France and Belgium. All interviews were recorded, transcribed verbatim, and coded inductively. Results Prompted by the AI boom, healthcare institutions launched initiatives to repurpose data they were generating for care without a clear vision of how to generate value. Difficulties in operating CDWs arose quickly, compounded by the multiplicity and diversity of stakeholders involved and by grand discourses on the possibilities of CDWs that were disjointed from their actual capabilities. Without proper management of the information flows, stakeholders struggled to build a shared vision. This was evident in our interviewees' contrasting appreciations of what mattered most to ensure data quality. Participants explained that they struggled to manage knowledge inside and across institutions, leading to knowledge loss and repeated mistakes and impeding progress locally and nationally. Discussion and Conclusion Management issues strongly affect the deployment and operation of CDWs. This may stem from a simplistic linear vision of how this type of infrastructure operates. CDWs remain promising for research, and their design, implementation, and operation require careful management if they are to be successful. Building on knowledge from innovation management, complex systems, and organizational learning will help.
{"title":"\"Goldmine\" or \"big mess\"? An interview study on the challenges of designing, operating, and ensuring the durability of Clinical Data Warehouses in France and Belgium.","authors":"Sonia Priou,Emmanuelle Kempf,Marija Jankovic,Guillaume Lamé","doi":"10.1093/jamia/ocae244","DOIUrl":"https://doi.org/10.1093/jamia/ocae244","url":null,"abstract":"OBJECTIVESClinical Data Warehouses (CDW) are the designated infrastructures to enable access and analysis of large quantities of electronic health record data. Building and managing such systems implies extensive \"data work\" and coordination between multiple stakeholders. Our study focuses on the challenges these stakeholders face when designing, operating, and ensuring the durability of CDWs for research.MATERIALS AND METHODSWe conducted semistructured interviews with 21 professionals working with CDWs from France and Belgium. All interviews were recorded, transcribed verbatim, and coded inductively.RESULTSPrompted by the AI boom, healthcare institutions launched initiatives to repurpose data they were generating for care without a clear vision of how to generate value. Difficulties in operating CDWs arose quickly, strengthened by the multiplicity and diversity of stakeholders involved and grand discourses on the possibilities of CDWs, disjointed from their actual capabilities. Without proper management of the information flows, stakeholders struggled to build a shared vision. This was evident in our interviewees' contrasting appreciations of what mattered most to ensure data quality. Participants explained they struggled to manage knowledge inside and across institutions, generating knowledge loss, repeated mistakes, and impeding progress locally and nationally.DISCUSSION AND CONCLUSIONManagement issues strongly affect the deployment and operation of CDWs. This may stem from a simplistic linear vision of how this type of infrastructure operates. 
CDWs remain promising for research, and their design, implementation, and operation require careful management if they are to be successful. Building on innovation management, complex systems, and organizational learning knowledge will help.","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":"45 11 1","pages":""},"PeriodicalIF":6.4,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}