Christine A Sinsky, Lisa Rotenstein, A Jay Holmgren, Nate C Apathy
Objective: To quantify how many patient scheduled hours would result in a 40-h work week (PSH40) for ambulatory physicians and to determine how PSH40 varies by specialty and practice type.
Methods: We calculated PSH40 for 186 188 ambulatory physicians across 395 organizations from November 2021 through April 2022 stratified by specialty.
Results: Median PSH40 for the sample was 33.2 h (IQR: 28.7-36.5). PSH40 was lowest in infectious disease (26.2, IQR: 21.6-31.1), geriatrics (27.2, IQR: 21.5-32.0) and hematology (28.6, IQR: 23.6-32.6) and highest in plastic surgery (35.7, IQR: 32.8-37.7), pain medicine (35.8, IQR: 32.6-37.9) and sports medicine (36.0, IQR: 33.3-38.1).
Discussion: Health system leaders and physicians will benefit from data-driven and transparent discussions about work hour expectations. The PSH40 measure can also be used to quantify the impact of variations in the clinical care environment on the in-person ambulatory patient care time available to physicians.
Conclusions: PSH40 is a novel measure that can be generated from vendor-derived metrics and used by operational leaders to inform work expectations. It can also support research into the impact of changes in the care environment on physicians' workload and capacity.
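The abstract does not state how PSH40 is computed. As a minimal sketch, assuming PSH40 is obtained by proportionally scaling a physician's observed patient scheduled hours so that total weekly work time equals 40 h, the arithmetic could look like the following. The function name, its parameters, and the scaling rule itself are illustrative assumptions, not the authors' published definition.

```python
def psh40(scheduled_hours: float, total_work_hours: float,
          target_week: float = 40.0) -> float:
    """Scale observed patient scheduled hours to a target work week.

    ASSUMPTION: proportional scaling. If a physician with `scheduled_hours`
    of patient scheduled time works `total_work_hours` overall, the scheduled
    hours that would correspond to a `target_week` are
    target_week * scheduled_hours / total_work_hours.
    """
    if scheduled_hours <= 0 or total_work_hours <= 0:
        raise ValueError("hours must be positive")
    return target_week * scheduled_hours / total_work_hours


if __name__ == "__main__":
    # Hypothetical physician: 30 scheduled hours, 45 total work hours per week.
    print(round(psh40(30.0, 45.0), 1))
```

Under this assumption, a lower PSH40 means that more non-scheduled work accompanies each scheduled hour.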
{"title":"The number of patient scheduled hours resulting in a 40-hour work week by physician specialty and setting: a cross-sectional study using electronic health record event log data.","authors":"Christine A Sinsky, Lisa Rotenstein, A Jay Holmgren, Nate C Apathy","doi":"10.1093/jamia/ocae266","DOIUrl":"https://doi.org/10.1093/jamia/ocae266","url":null,"abstract":"<p><strong>Objective: </strong>To quantify how many patient scheduled hours would result in a 40-h work week (PSH40) for ambulatory physicians and to determine how PSH40 varies by specialty and practice type.</p><p><strong>Methods: </strong>We calculated PSH40 for 186 188 ambulatory physicians across 395 organizations from November 2021 through April 2022 stratified by specialty.</p><p><strong>Results: </strong>Median PSH40 for the sample was 33.2 h (IQR: 28.7-36.5). PSH40 was lowest in infectious disease (26.2, IQR: 21.6-31.1), geriatrics (27.2, IQR: 21.5-32.0) and hematology (28.6, IQR: 23.6-32.6) and highest in plastic surgery (35.7, IQR: 32.8-37.7), pain medicine (35.8, IQR: 32.6-37.9) and sports medicine (36.0, IQR: 33.3-38.1).</p><p><strong>Discussion: </strong>Health system leaders and physicians will benefit from data driven and transparent discussions about work hour expectations. The PSH40 measure can also be used to quantify the impact of variations in the clinical care environment on the in-person ambulatory patient care time available to physicians.</p><p><strong>Conclusions: </strong>PSH40 is a novel measure that can be generated from vendor-derived metrics and used by operational leaders to inform work expectations. 
It can also support research into the impact of changes in the care environment on physicians' workload and capacity.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: Measuring interpersonal firearm violence: natural language processing methods to address limitations in criminal charge data.","authors":"","doi":"10.1093/jamia/ocae268","DOIUrl":"https://doi.org/10.1093/jamia/ocae268","url":null,"abstract":"","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiomara T Gonzalez, Karen Steger-May, Joanna Abraham
Objectives: Successful implementation of machine learning-augmented clinical decision support systems (ML-CDSS) in perioperative care requires the prioritization of patient-centric approaches to ensure alignment with societal expectations. We assessed general public and surgical patient attitudes and perspectives on ML-CDSS use in perioperative care.
Materials and methods: A sequential explanatory study was conducted. Stage 1 collected public opinions through a survey. Stage 2 ascertained surgical patients' experiences and attitudes via focus groups and interviews.
Results: For Stage 1, data from 281 respondents (140 males [49.8%]) were considered. Among participants without ML awareness, males were almost three times more likely than females to report more acceptance (OR = 2.97; 95% CI, 1.36-6.49) and embrace (OR = 2.74; 95% CI, 1.23-6.09) of ML-CDSS use by perioperative teams. Males were almost twice as likely as females to report more acceptance across all perioperative phases, with ORs ranging from 1.71 to 2.07. In Stage 2, insights from 10 surgical patients revealed unanimous agreement that ML-CDSS should primarily serve a supportive function. The pre- and post-operative phases were identified explicitly as forums where ML-CDSS can enhance care delivery. Patients requested that education on ML-CDSS's role in their care be disseminated by surgeons across multiple platforms.
Discussion and conclusion: The general public and surgical patients are receptive to ML-CDSS use throughout their perioperative care provided its role is auxiliary to perioperative teams. However, the integration of ML-CDSS into perioperative workflows presents unique challenges for healthcare settings. Insights from this study can inform strategies to support large-scale implementation and adoption of ML-CDSS by patients in all perioperative phases. Key strategies to promote the feasibility and acceptability of ML-CDSS include clinician-led discussions about ML-CDSS's role in perioperative care, established metrics to evaluate the clinical utility of ML-CDSS, and patient education.
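For readers less familiar with the odds ratios and 95% confidence intervals reported in the Results, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. The counts are hypothetical, and the study's own estimates may well come from regression models, so this shows only how the statistic is read, not how the published values were produced.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: group 1 with outcome      b: group 1 without outcome
    c: group 2 with outcome      d: group 2 without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1, as in the reported results, indicates an association that is statistically distinguishable from no difference at the chosen level.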
{"title":"Just another tool in their repertoire: uncovering insights into public and patient perspectives on clinicians' use of machine learning in perioperative care.","authors":"Xiomara T Gonzalez, Karen Steger-May, Joanna Abraham","doi":"10.1093/jamia/ocae257","DOIUrl":"https://doi.org/10.1093/jamia/ocae257","url":null,"abstract":"<p><strong>Objectives: </strong>Successful implementation of machine learning-augmented clinical decision support systems (ML-CDSS) in perioperative care requires the prioritization of patient-centric approaches to ensure alignment with societal expectations. We assessed general public and surgical patient attitudes and perspectives on ML-CDSS use in perioperative care.</p><p><strong>Materials and methods: </strong>A sequential explanatory study was conducted. Stage 1 collected public opinions through a survey. Stage 2 ascertained surgical patients' experiences and attitudes via focus groups and interviews.</p><p><strong>Results: </strong>For Stage 1, a total of 281 respondents' (140 males [49.8%]) data were considered. Among participants without ML awareness, males were almost three times more likely than females to report more acceptance (OR = 2.97; 95% CI, 1.36-6.49) and embrace (OR = 2.74; 95% CI, 1.23-6.09) of ML-CDSS use by perioperative teams. Males were almost twice as likely as females to report more acceptance across all perioperative phases with ORs ranging from 1.71 to 2.07. In Stage 2, insights from 10 surgical patients revealed unanimous agreement that ML-CDSS should primarily serve a supportive function. The pre- and post-operative phases were identified explicitly as forums where ML-CDSS can enhance care delivery. 
Patients requested for education on ML-CDSS's role in their care to be disseminated by surgeons across multiple platforms.</p><p><strong>Discussion and conclusion: </strong>The general public and surgical patients are receptive to ML-CDSS use throughout their perioperative care provided its role is auxiliary to perioperative teams. However, the integration of ML-CDSS into perioperative workflows presents unique challenges for healthcare settings. Insights from this study can inform strategies to support large-scale implementation and adoption of ML-CDSS by patients in all perioperative phases. Key strategies to promote the feasibility and acceptability of ML-CDSS include clinician-led discussions about ML-CDSS's role in perioperative care, established metrics to evaluate the clinical utility of ML-CDSS, and patient education.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jordan Everson, Chelsea Richwine
Objective: To understand barriers to obtaining and using interoperable information at US hospitals.
Materials and methods: Using 2023 nationally representative survey data on US hospitals (N = 2420), we examined major and minor barriers to exchanging information with other organizations, and how barriers vary by hospital characteristics and methods used to obtain information. Using a series of regression models, we examined how hospital experiences with barriers relate to routine use of information at responding hospitals.
Results: In 2023, most hospitals experienced at least one minor (81%) or major (62%) barrier to exchange, with the most common major barriers relating to different vendors and exchange partners' capabilities. Higher-resourced hospitals and those often using network-based exchange tended to experience more minor barriers, whereas lower-resourced hospitals and those often using mail/fax or direct access to outside electronic health records experienced more major barriers. In multivariate regression, reporting "Patient matching" or "Costs to exchange" as a major or minor barrier had the strongest independent negative association with the likelihood that a hospital reported its providers frequently use information from outside organizations.
Discussion: Despite progress in interoperable exchange, various barriers remain. The prevalence of barriers varied by hospital type and methods used, with barriers more often preventing exchange for lower-resourced hospitals and those using outdated exchange methods.
Conclusion: While several technical and policy efforts are underway to address prevalent barriers, it will be important to monitor whether efforts are successful in ensuring information from outside organizations can be seamlessly exchanged and used to inform patient care.
{"title":"Barriers to obtaining and using interoperable information among non-federal acute care hospitals.","authors":"Jordan Everson, Chelsea Richwine","doi":"10.1093/jamia/ocae263","DOIUrl":"https://doi.org/10.1093/jamia/ocae263","url":null,"abstract":"<p><strong>Objective: </strong>To understand barriers to obtaining and using interoperable information at US hospitals.</p><p><strong>Materials and methods: </strong>Using 2023 nationally representative survey data on US hospitals (N = 2420), we examined major and minor barriers to exchanging information with other organizations, and how barriers vary by hospital characteristics and methods used to obtain information. Using a series of regression models, we examined how hospital experiences with barriers relate to routine use of information at responding hospitals.</p><p><strong>Results: </strong>In 2023, most hospitals experienced at least one minor (81%) or major (62%) barrier to exchange, with the most common major barriers relating to different vendors and exchange partners' capabilities. Higher-resourced hospitals and those often using network-based exchange tended to experience more minor barriers whereas lower-resourced hospitals and those often using mail/fax or direct access to outside electronic health records experienced more major barriers. In multivariate regression, hospitals indicating \"Patient matching\" and \"Costs to exchange\" were a major or minor barrier had the strongest independent negative association with the likelihood of reporting providers at their hospital frequently use information from outside organizations.</p><p><strong>Discussion: </strong>Despite progress in interoperable exchange, various barriers remain. 
The prevalence of barriers varied by hospital type and methods used, with barriers more often preventing exchange for lower-resourced hospitals and those using outdated exchange methods.</p><p><strong>Conclusion: </strong>While several technical and policy efforts are underway to address prevalent barriers, it will be important to monitor whether efforts are successful in ensuring information from outside organizations can be seamlessly exchanged and used to inform patient care.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Allison E Gatz, Chenxi Xiong, Yao Chen, Shihui Jiang, Chi Mai Nguyen, Qianqian Song, Xiaochun Li, Pengyue Zhang, Michael T Eadon, Jing Su
Objective: To assess the health disparities across social determinants of health (SDoH) domains for the risk of severe acidosis independent of demographic and clinical factors.
Materials and methods: A retrospective case-control study (n = 13 310, 1:4 matching) is performed using electronic health records (EHRs), SDoH surveys, and genomics data from the All of Us participants. Propensity score matching controls for confounding effects due to EHR data availability. Conditional logistic regressions are used to estimate odds ratios describing associations between SDoHs and the risk of acidosis events, adjusted for demographic features and clinical conditions.
Results: Those with employer-provided insurance and those with Medicaid plans show dramatically different risks [adjusted odds ratio (AOR): 0.761 vs 1.41]. Low-income groups demonstrate higher risk (household income less than $25k, AOR: 1.3-1.57) than high-income groups ($100-$200k, AOR: 0.597-0.867). Other high-risk factors include impaired mobility (AOR: 1.32), unemployment (AOR: 1.32), renting (AOR: 1.41), other non-homeowner status (AOR: 1.7), and housing instability (AOR: 1.25). Education was negatively associated with acidosis risk.
Discussion: Our work provides real-world evidence of the comprehensive health disparities due to socioeconomic and behavioral contributors in a cohort enriched in minority groups or underrepresented populations.
Conclusions: SDoHs are strongly associated with systematic health disparities in the risk of severe metabolic acidosis. Types of health insurance, household income levels, housing status and stability, employment status, educational level, and mobility disability play significant roles after being adjusted for demographic features and clinical conditions. Comprehensive solutions are needed to improve equity in healthcare and reduce the risk of severe acidosis.
{"title":"Health disparities in the risk of severe acidosis: real-world evidence from the All of Us cohort.","authors":"Allison E Gatz, Chenxi Xiong, Yao Chen, Shihui Jiang, Chi Mai Nguyen, Qianqian Song, Xiaochun Li, Pengyue Zhang, Michael T Eadon, Jing Su","doi":"10.1093/jamia/ocae256","DOIUrl":"https://doi.org/10.1093/jamia/ocae256","url":null,"abstract":"<p><strong>Objective: </strong>To assess the health disparities across social determinants of health (SDoH) domains for the risk of severe acidosis independent of demographical and clinical factors.</p><p><strong>Materials and methods: </strong>A retrospective case-control study (n = 13 310, 1:4 matching) is performed using electronic health records (EHRs), SDoH surveys, and genomics data from the All of Us participants. The propensity score matching controls confounding effects due to EHR data availability. Conditional logistic regressions are used to estimate odds ratios describing associations between SDoHs and the risk of acidosis events, adjusted for demographic features, and clinical conditions.</p><p><strong>Results: </strong>Those with employer-provided insurance and those with Medicaid plans show dramatically different risks [adjusted odds ratio (AOR): 0.761 vs 1.41]. Low-income groups demonstrate higher risk (household income less than $25k, AOR: 1.3-1.57) than high-income groups ($100-$200k, AOR: 0.597-0.867). Other high-risk factors include impaired mobility (AOR: 1.32), unemployment (AOR: 1.32), renters (AOR: 1.41), other non-house-owners (AOR: 1.7), and house instability (AOR: 1.25). 
Education was negatively associated with acidosis risk.</p><p><strong>Discussion: </strong>Our work provides real-world evidence of the comprehensive health disparities due to socioeconomic and behavioral contributors in a cohort enriched in minority groups or underrepresented populations.</p><p><strong>Conclusions: </strong>SDoHs are strongly associated with systematic health disparities in the risk of severe metabolic acidosis. Types of health insurance, household income levels, housing status and stability, employment status, educational level, and mobility disability play significant roles after being adjusted for demographic features and clinical conditions. Comprehensive solutions are needed to improve equity in healthcare and reduce the risk of severe acidosis.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mary S Kim, Beomseok Park, Genevieve J Sippel, Aaron H Mun, Wanzhao Yang, Kathleen H McCarthy, Emely Fernandez, Marius George Linguraru, Aleksandra Sarcevic, Ivan Marsic, Randall S Burd
Objectives: Human monitoring of personal protective equipment (PPE) adherence among healthcare providers has several limitations, including the need for additional personnel during staff shortages and decreased vigilance during prolonged tasks. To address these challenges, we developed an automated computer vision system for monitoring PPE adherence in healthcare settings. We assessed the system performance against human observers detecting nonadherence in a video surveillance experiment.
Materials and methods: The automated system was trained to detect 15 classes of eyewear, masks, gloves, and gowns using an object detector and tracker. To assess how the system performs compared to human observers in detecting nonadherence, we designed a video surveillance experiment under 2 conditions: variations in video durations (20, 40, and 60 seconds) and the number of individuals in the videos (3 versus 6). Twelve nurses participated as human observers. Performance was assessed based on the number of detections of nonadherence.
Results: Human observers detected fewer instances of nonadherence than the system (parameter estimate -0.3, 95% CI -0.4 to -0.2, P < .001). Human observers detected more nonadherence during longer video durations (parameter estimate 0.7, 95% CI 0.4-1.0, P < .001). The system achieved a sensitivity of 0.86, specificity of 1, and Matthews correlation coefficient of 0.82 for detecting PPE nonadherence.
Discussion: An automated system simultaneously tracks multiple objects and individuals. The system performance is also independent of observation duration, an improvement over human monitoring.
Conclusion: The automated system presents a potential solution for scalable monitoring of hospital-wide infection control practices and improving PPE usage in healthcare settings.
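The three performance figures reported in the Results (sensitivity, specificity, and the Matthews correlation coefficient) all derive from a binary confusion matrix. The sketch below shows the standard formulas. The counts in the example are hypothetical and are not the study's data, so the output is not expected to reproduce the published values.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true nonadherence that was detected
    specificity = tn / (tn + fp)   # share of adherent cases correctly left unflagged
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the formulas.
    sens, spec, mcc = binary_metrics(tp=86, fp=0, tn=100, fn=14)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Unlike sensitivity and specificity taken separately, MCC uses all four cells, so it stays informative when adherent and nonadherent cases are imbalanced.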
{"title":"Comparative analysis of personal protective equipment nonadherence detection: computer vision versus human observers.","authors":"Mary S Kim, Beomseok Park, Genevieve J Sippel, Aaron H Mun, Wanzhao Yang, Kathleen H McCarthy, Emely Fernandez, Marius George Linguraru, Aleksandra Sarcevic, Ivan Marsic, Randall S Burd","doi":"10.1093/jamia/ocae262","DOIUrl":"https://doi.org/10.1093/jamia/ocae262","url":null,"abstract":"<p><strong>Objectives: </strong>Human monitoring of personal protective equipment (PPE) adherence among healthcare providers has several limitations, including the need for additional personnel during staff shortages and decreased vigilance during prolonged tasks. To address these challenges, we developed an automated computer vision system for monitoring PPE adherence in healthcare settings. We assessed the system performance against human observers detecting nonadherence in a video surveillance experiment.</p><p><strong>Materials and methods: </strong>The automated system was trained to detect 15 classes of eyewear, masks, gloves, and gowns using an object detector and tracker. To assess how the system performs compared to human observers in detecting nonadherence, we designed a video surveillance experiment under 2 conditions: variations in video durations (20, 40, and 60 seconds) and the number of individuals in the videos (3 versus 6). Twelve nurses participated as human observers. Performance was assessed based on the number of detections of nonadherence.</p><p><strong>Results: </strong>Human observers detected fewer instances of nonadherence than the system (parameter estimate -0.3, 95% CI -0.4 to -0.2, P < .001). Human observers detected more nonadherence during longer video durations (parameter estimate 0.7, 95% CI 0.4-1.0, P < .001). 
The system achieved a sensitivity of 0.86, specificity of 1, and Matthew's correlation coefficient of 0.82 for detecting PPE nonadherence.</p><p><strong>Discussion: </strong>An automated system simultaneously tracks multiple objects and individuals. The system performance is also independent of observation duration, an improvement over human monitoring.</p><p><strong>Conclusion: </strong>The automated system presents a potential solution for scalable monitoring of hospital-wide infection control practices and improving PPE usage in healthcare settings.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thomas Savage, John Wang, Robert Gallo, Abdessalem Boukil, Vishwesh Patel, Seyed Amir Ahmad Safavi-Naini, Ali Soroush, Jonathan H Chen
Introduction: The inability of large language models (LLMs) to communicate uncertainty is a significant barrier to their use in medicine. Before LLMs can be integrated into patient care, the field must assess methods to estimate uncertainty in ways that are useful to physician-users.
Objective: Evaluate the ability of uncertainty proxies to quantify LLM confidence when performing diagnosis and treatment selection tasks by assessing the properties of discrimination and calibration.
Methods: We examined confidence elicitation (CE), token-level probability (TLP), and sample consistency (SC) proxies across GPT3.5, GPT4, Llama2, and Llama3. Uncertainty proxies were evaluated against 3 datasets of open-ended patient scenarios.
Results: SC discrimination outperformed TLP and CE methods. SC by sentence embedding achieved the highest discriminative performance (ROC AUC 0.68-0.79), yet with poor calibration. SC by GPT annotation achieved the second-best discrimination (ROC AUC 0.66-0.74) with accurate calibration. Verbalized confidence (CE) was found to consistently overestimate model confidence.
Discussion and conclusions: SC is the most effective method for estimating LLM uncertainty of the proxies evaluated. SC by sentence embedding can effectively estimate uncertainty if the user has a set of reference cases with which to re-calibrate their results, while SC by GPT annotation is the more effective method if the user does not have reference cases and requires accurate raw calibration. Our results confirm LLMs are consistently over-confident when verbalizing their confidence (CE).
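Discrimination and calibration, the two properties assessed above, measure different things: discrimination asks whether correct answers receive higher confidence than incorrect ones, while calibration asks whether a stated confidence of, say, 0.8 is right about 80% of the time. A minimal sketch of one common metric for each (ROC AUC and expected calibration error) follows. The function names, the equal-width binning scheme, and the toy data are assumptions made for illustration, and the abstract does not say which calibration metric the study used.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a correct answer (label 1)
    receives a higher confidence score than an incorrect one (label 0).
    Ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_calibration_error(scores, labels, n_bins=10):
    """Weighted mean gap between average confidence and observed accuracy
    across equal-width confidence bins on [0, 1]."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((s, y))
    ece = 0.0
    for b in bins:
        if b:
            mean_conf = sum(s for s, _ in b) / len(b)
            accuracy = sum(y for _, y in b) / len(b)
            ece += len(b) / len(scores) * abs(mean_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Toy confidence scores and correctness labels (hypothetical).
    conf = [0.9, 0.8, 0.3, 0.2]
    correct = [1, 1, 0, 0]
    print(roc_auc(conf, correct), expected_calibration_error(conf, correct))
```

A proxy can rank answers well (high AUC) while being poorly calibrated, which is the pattern the abstract reports for SC by sentence embedding.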
{"title":"Large language model uncertainty proxies: discrimination and calibration for medical diagnosis and treatment.","authors":"Thomas Savage, John Wang, Robert Gallo, Abdessalem Boukil, Vishwesh Patel, Seyed Amir Ahmad Safavi-Naini, Ali Soroush, Jonathan H Chen","doi":"10.1093/jamia/ocae254","DOIUrl":"10.1093/jamia/ocae254","url":null,"abstract":"<p><strong>Introduction: </strong>The inability of large language models (LLMs) to communicate uncertainty is a significant barrier to their use in medicine. Before LLMs can be integrated into patient care, the field must assess methods to estimate uncertainty in ways that are useful to physician-users.</p><p><strong>Objective: </strong>Evaluate the ability for uncertainty proxies to quantify LLM confidence when performing diagnosis and treatment selection tasks by assessing the properties of discrimination and calibration.</p><p><strong>Methods: </strong>We examined confidence elicitation (CE), token-level probability (TLP), and sample consistency (SC) proxies across GPT3.5, GPT4, Llama2, and Llama3. Uncertainty proxies were evaluated against 3 datasets of open-ended patient scenarios.</p><p><strong>Results: </strong>SC discrimination outperformed TLP and CE methods. SC by sentence embedding achieved the highest discriminative performance (ROC AUC 0.68-0.79), yet with poor calibration. SC by GPT annotation achieved the second-best discrimination (ROC AUC 0.66-0.74) with accurate calibration. Verbalized confidence (CE) was found to consistently overestimate model confidence.</p><p><strong>Discussion and conclusions: </strong>SC is the most effective method for estimating LLM uncertainty of the proxies evaluated. SC by sentence embedding can effectively estimate uncertainty if the user has a set of reference cases with which to re-calibrate their results, while SC by GPT annotation is the more effective method if the user does not have reference cases and requires accurate raw calibration. 
Our results confirm LLMs are consistently over-confident when verbalizing their confidence (CE).</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrew J Zimolzak, Sundas P Khan, Hardeep Singh, Jessica A Davila
Objectives: Missed and delayed cancer diagnoses are common, harmful, and often preventable. We previously validated a digital quality measure (dQM) of emergency presentation (EP) of lung cancer in 2 US health systems. This study aimed to apply the dQM to a new national electronic health record (EHR) database and examine demographic associations.
Materials and methods: We applied the dQM (emergency encounter followed by new lung cancer diagnosis within 30 days) to Epic Cosmos, a deidentified database covering 184 million US patients. We examined dQM associations with sociodemographic factors.
Results: The overall EP rate was 19.6%. EP rate was higher in Black vs White patients (24% vs 19%, P < .001) and patients with younger age, higher social vulnerability, lower-income ZIP code, and self-reported transport difficulties.
Discussion: We successfully applied a dQM based on cancer EP to the largest US EHR database.
Conclusion: This dQM could be a marker for sociodemographic vulnerabilities in cancer diagnosis.
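The dQM rule described in the methods (an emergency encounter followed by a new lung cancer diagnosis within 30 days) can be sketched as a simple date comparison. The function names and the input layout are hypothetical. Treating the window as inclusive and counting a same-day encounter are assumptions, since the abstract does not specify the boundary handling, and the sketch does not attempt to model how a "new" diagnosis is established.

```python
from datetime import date, timedelta


def is_emergency_presentation(emergency_dates, diagnosis_date, window_days=30):
    """True if any emergency encounter falls on or within `window_days`
    before the diagnosis date (inclusive window: an assumption)."""
    window = timedelta(days=window_days)
    return any(timedelta(0) <= diagnosis_date - d <= window
               for d in emergency_dates)


def ep_rate(patients, window_days=30):
    """Share of newly diagnosed patients flagged as emergency presentations.

    `patients` is an iterable of (emergency_dates, diagnosis_date) pairs.
    """
    patients = list(patients)
    if not patients:
        raise ValueError("no patients supplied")
    flagged = sum(is_emergency_presentation(e, dx, window_days)
                  for e, dx in patients)
    return flagged / len(patients)


if __name__ == "__main__":
    # Hypothetical cohort of three patients.
    cohort = [
        ([date(2024, 1, 1)], date(2024, 1, 20)),  # encounter 19 days before diagnosis
        ([date(2023, 6, 1)], date(2024, 1, 20)),  # encounter too early to count
        ([], date(2024, 3, 5)),                   # no emergency encounter
    ]
    print(ep_rate(cohort))
```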
{"title":"Application of a digital quality measure for cancer diagnosis in Epic Cosmos.","authors":"Andrew J Zimolzak, Sundas P Khan, Hardeep Singh, Jessica A Davila","doi":"10.1093/jamia/ocae253","DOIUrl":"https://doi.org/10.1093/jamia/ocae253","url":null,"abstract":"<p><strong>Objectives: </strong>Missed and delayed cancer diagnoses are common, harmful, and often preventable. We previously validated a digital quality measure (dQM) of emergency presentation (EP) of lung cancer in 2 US health systems. This study aimed to apply the dQM to a new national electronic health record (EHR) database and examine demographic associations.</p><p><strong>Materials and methods: </strong>We applied the dQM (emergency encounter followed by new lung cancer diagnosis within 30 days) to Epic Cosmos, a deidentified database covering 184 million US patients. We examined dQM associations with sociodemographic factors.</p><p><strong>Results: </strong>The overall EP rate was 19.6%. EP rate was higher in Black vs White patients (24% vs 19%, P < .001) and patients with younger age, higher social vulnerability, lower-income ZIP code, and self-reported transport difficulties.</p><p><strong>Discussion: </strong>We successfully applied a dQM based on cancer EP to the largest US EHR database.</p><p><strong>Conclusion: </strong>This dQM could be a marker for sociodemographic vulnerabilities in cancer diagnosis.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-11","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tiffani J Bright, Oliver J Bear Don't Walk IV, Carl Erwin Johnson, Carolyn Petersen, Patricia C Dykes, Krista G Martin, Kevin B Johnson, Lois Walters-Threat, Catherine K Craven, Robert J Lucero, Gretchen P Jackson, Rubina F Rizvi
Objective: The American Medical Informatics Association (AMIA) Task Force on Diversity, Equity, and Inclusion (DEI) was established to address systemic racism and health disparities in biomedical and health informatics, aligning with AMIA's mission to transform healthcare. AMIA's DEI initiatives were spurred by member voices responding to police brutality and COVID-19's impact on Black/African American communities.
Materials and methods: The Task Force, consisting of 20 members across 3 groups aligned with AMIA's 2020-2025 Strategic Plan, met biweekly to develop DEI recommendations with the help of 16 additional volunteers. These recommendations were reviewed, prioritized, and presented to the AMIA Board of Directors for approval.
Results: In 9 months, the Task Force (1) created a logic model to support workforce diversity and raise AMIA's DEI awareness, (2) conducted an environmental scan of other associations' DEI activities, (3) developed a DEI framework for AMIA meetings, (4) gathered member feedback, (5) cultivated DEI educational resources, (6) created a Board nominations and diversity session, (7) reviewed the Board's Strategic Planning for DEI alignment, (8) led a program to increase diversity at the 2020 AMIA Virtual Annual Symposium, and (9) standardized socially-assigned race and ethnicity data collection.
Discussion: The Task Force proposed actionable recommendations that focused on AMIA's role in addressing systemic racism and health equity, helping the organization understand its member diversity.
Conclusion: This work supported marginalized groups, broadened the research agenda, and positioned AMIA as a DEI leader while reinforcing the need for ongoing transformation within informatics.
{"title":"The journey to building a diverse, equitable, and inclusive American Medical Informatics Association.","authors":"Tiffani J Bright, Oliver J Bear Don't Walk IV, Carl Erwin Johnson, Carolyn Petersen, Patricia C Dykes, Krista G Martin, Kevin B Johnson, Lois Walters-Threat, Catherine K Craven, Robert J Lucero, Gretchen P Jackson, Rubina F Rizvi","doi":"10.1093/jamia/ocae258","DOIUrl":"https://doi.org/10.1093/jamia/ocae258","url":null,"abstract":"<p><strong>Objective: </strong>The American Medical Informatics Association (AMIA) Task Force on Diversity, Equity, and Inclusion (DEI) was established to address systemic racism and health disparities in biomedical and health informatics, aligning with AMIA's mission to transform healthcare. AMIA's DEI initiatives were spurred by member voices responding to police brutality and COVID-19's impact on Black/African American communities.</p><p><strong>Materials and methods: </strong>The Task Force, consisting of 20 members across 3 groups aligned with AMIA's 2020-2025 Strategic Plan, met biweekly to develop DEI recommendations with the help of 16 additional volunteers. These recommendations were reviewed, prioritized, and presented to the AMIA Board of Directors for approval.</p><p><strong>Results: </strong>In 9 months, the Task Force (1) created a logic model to support workforce diversity and raise AMIA's DEI awareness, (2) conducted an environmental scan of other associations' DEI activities, (3) developed a DEI framework for AMIA meetings, (4) gathered member feedback, (5) cultivated DEI educational resources, (6) created a Board nominations and diversity session, (7) reviewed the Board's Strategic Planning for DEI alignment, (8) led a program to increase diversity at the 2020 AMIA Virtual Annual Symposium, and (9) standardized socially-assigned race and ethnicity data collection.</p><p><strong>Discussion: </strong>The Task Force proposed actionable recommendations that focused on AMIA's role in addressing systemic racism and health equity, helping the organization understand its member diversity.</p><p><strong>Conclusion: </strong>This work supported marginalized groups, broadened the research agenda, and positioned AMIA as a DEI leader while reinforcing the need for ongoing transformation within informatics.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran
Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.
Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.
Results and discussion: 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.
Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.
{"title":"Towards cross-application model-agnostic federated cohort discovery.","authors":"Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran","doi":"10.1093/jamia/ocae211","DOIUrl":"10.1093/jamia/ocae211","url":null,"abstract":"<p><strong>Objectives: </strong>To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.</p><p><strong>Materials and methods: </strong>SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.</p><p><strong>Results and discussion: </strong>91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.</p><p><strong>Conclusion: </strong>Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"2202-2209"},"PeriodicalIF":4.7,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413448/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141903419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
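The agreement criterion reported in the Leaf/SHRINE record above (counts within 5% of i2b2, or within ±5 patients for counts under 100) can be written out as a short Python sketch. This is an illustration only, not the authors' evaluation script. The names `counts_agree` and `agreement_rate` are hypothetical, and two details are assumptions the abstract does not settle: that the i2b2 count is the reference for both the 5% margin and the under-100 cutoff, and that the boundaries are inclusive.

```python
def counts_agree(i2b2_count, other_count, rel_tol=0.05, small_cutoff=100, abs_tol=5):
    """Return True if a non-i2b2 count matches the i2b2 count under the stated margin.

    Assumption: the i2b2 count is the reference. Counts under `small_cutoff`
    use the absolute margin (±abs_tol patients); larger counts use the
    relative margin (rel_tol of the i2b2 count). Boundaries are inclusive.
    """
    diff = abs(other_count - i2b2_count)
    if i2b2_count < small_cutoff:
        return diff <= abs_tol
    return diff <= rel_tol * i2b2_count


def agreement_rate(pairs):
    """Fraction of (i2b2_count, other_count) pairs that agree (hypothetical helper)."""
    results = [counts_agree(ref, other) for ref, other in pairs]
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    # Made-up counts for illustration; these are not data from the study.
    example_pairs = [(1000, 1049), (1000, 1100), (40, 45), (40, 50)]
    print(f"Agreement: {agreement_rate(example_pairs):.1%}")
```

The absolute margin matters for small cohorts: 5% of a 40-patient count is only 2 patients, so a purely relative rule would fail queries that differ by a handful of records.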