Journal of the American Medical Informatics Association: Latest Articles

Applying natural language processing to patient messages to identify depression concerns in cancer patients.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-17 | DOI: 10.1093/jamia/ocae188
Marieke M van Buchem, Anne A H de Hond, Claudio Fanconi, Vaibhavi Shah, Max Schuessler, Ilse M J Kant, Ewout W Steyerberg, Tina Hernandez-Boussard

Objective: This study aims to explore and develop tools for early identification of depression concerns among cancer patients by leveraging the novel data source of messages sent through a secure patient portal.

Materials and methods: We developed classifiers based on logistic regression (LR), support vector machines (SVMs), and 2 Bidirectional Encoder Representations from Transformers (BERT) models (original and Reddit-pretrained) on 6600 patient messages from a cancer center (2009-2022), annotated by a panel of healthcare professionals. Performance was compared using AUROC scores, and model fairness and explainability were examined. We also examined correlations between model predictions and depression diagnosis and treatment.

Results: BERT and RedditBERT attained AUROC scores of 0.88 and 0.86, respectively, compared to 0.79 for LR and 0.83 for SVM. BERT showed bigger differences in performance across sex, race, and ethnicity than RedditBERT. Patients who sent messages classified as concerning had a higher chance of receiving a depression diagnosis, a prescription for antidepressants, or a referral to the psycho-oncologist. Explanations from BERT and RedditBERT differed, with no clear preference from annotators.

Discussion: We show the potential of BERT and RedditBERT in identifying depression concerns in messages from cancer patients. Performance disparities across demographic groups highlight the need for careful consideration of potential biases. Further research is needed to address biases, evaluate real-world impacts, and ensure responsible integration into clinical settings.

Conclusion: This work represents a significant methodological advancement in the early identification of depression concerns among cancer patients. Our work contributes to a route to reduce clinical burden while enhancing overall patient care, leveraging BERT-based models.
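
The AUROC scores used above to compare the classifiers can be illustrated with a minimal sketch. It uses the rank-based definition of AUROC: the probability that a randomly chosen positive message receives a higher score than a randomly chosen negative one. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = message flagged as a depression concern.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The pairwise loop is quadratic in the number of messages; for a dataset of thousands of messages a sort-based implementation or a library routine would likely be preferable.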

Accounting for taste: preferences mediate the relationship between documentation time and ambulatory physician burnout.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-17 | DOI: 10.1093/jamia/ocae193
Nate C Apathy, Heather Hartman-Hall, Alberta Tran, Dae Hyun Kim, Raj M Ratwani, Daniel Marchalik

Objectives: Physician burnout in the US has reached crisis levels, with one source identified as extensive after-hours documentation work in the electronic health record (EHR). Evidence has illustrated that physician preferences for after-hours work vary, such that after-hours work may not be universally burdensome. Our objectives were to analyze variation in preferences for after-hours documentation and assess if preferences mediate the relationship between after-hours documentation time and burnout.

Materials and methods: We combined EHR active use data capturing physicians' hourly documentation work with survey data capturing documentation preferences and burnout. Our sample included 318 ambulatory physicians at MedStar Health. We conducted a mediation analysis to estimate if and how preferences mediated the relationship between after-hours documentation time and burnout. Our primary outcome was physician-reported burnout. We measured preferences for after-hours documentation work via a novel survey instrument (Burden Scenarios Assessment). We measured after-hours documentation time in the EHR as the total active time respondents spent documenting between 7 pm and 3 am.

Results: Physician preferences varied. Completing clinical documentation at home after clinic hours was the scenario rated most burdensome (52.8% of physicians), followed by dealing with prior authorization (49.5% of physicians). In mediation analyses, preferences partially mediated the relationship between after-hours documentation time and burnout.

Discussion: Physician preferences regarding EHR-based work play an important role in the relationship between after-hours documentation time and burnout.

Conclusion: Studies of EHR work and burnout should incorporate preferences, and operational leaders should assess preferences to better target interventions aimed at EHR-based contributors to burnout.
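
The idea of partial mediation described above can be sketched with the classic product-of-coefficients decomposition for linear models. This is only an illustration: the abstract does not state which mediation estimator the authors used, burnout is often a binary outcome for which a linear model is a simplification, and the function name `mediation_effects` and all data values are hypothetical.

```python
def _cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def mediation_effects(x, m, y):
    """Linear mediation: x -> m -> y, with a direct path x -> y.

    a        slope of m regressed on x
    b        slope of y on m, adjusting for x
    direct   slope of y on x, adjusting for m
    indirect a * b
    total    slope of y regressed on x alone (= direct + indirect)
    """
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ValueError("x must vary and must not be collinear with m")
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / det
    direct = (sxy * smm - sxm * smy) / det
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": sxy / sxx}


# Hypothetical toy data: after-hours minutes, burden rating, burnout score.
minutes = [10, 20, 30, 40, 50, 60]
burden = [2, 1, 4, 3, 6, 5]
burnout = [1, 3, 2, 5, 4, 6]
effects = mediation_effects(minutes, burden, burnout)
print(effects["indirect"] / effects["total"])  # share of the total effect that is mediated
```

Mediation is "partial" in this framing when both the indirect effect and the direct effect are non-zero.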

Effect of digital tools to promote hospital quality and safety on adverse events after discharge.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-16 | DOI: 10.1093/jamia/ocae176
Anant Vasudevan, Savanna Plombon, Nicholas Piniella, Alison Garber, Maria Malik, Erin O'Fallon, Abhishek Goyal, Esteban Gershanik, Vivek Kumar, Julie Fiskio, Cathy Yoon, Stuart R Lipsitz, Jeffrey L Schnipper, Anuj K Dalal

Objectives: Post-discharge adverse events (AEs) are common and heralded by new and worsening symptoms (NWS). We evaluated the effect of electronic health record (EHR)-integrated digital tools designed to promote quality and safety in hospitalized patients on NWS and AEs after discharge.

Materials and methods: Adult general medicine patients at a community hospital were enrolled. We implemented a dashboard which clinicians used to assess safety risks during interdisciplinary rounds. Post-implementation patients were randomized to complete a discharge checklist whose responses were incorporated into the dashboard. Outcomes were assessed using EHR review and 30-day call data adjudicated by 2 clinicians and analyzed using Poisson regression. We conducted comparisons of each exposure on post-discharge outcomes and used selected variables and NWS as independent predictors to model post-discharge AEs using multivariable logistic regression.

Results: A total of 260 patients (122 pre, 71 post [dashboard], 67 post [dashboard plus discharge checklist]) enrolled. The adjusted incidence rate ratios (aIRR) for NWS and AEs were unchanged in the post- compared to pre-implementation period. For patient-reported NWS, aIRR was non-significantly higher for dashboard plus discharge checklist compared to dashboard participants (1.23 [0.97,1.56], P = .08). For post-implementation patients with an AE, aIRR for duration of injury (>1 week) was significantly lower for dashboard plus discharge checklist compared to dashboard participants (0 [0,0.53], P < .01). In multivariable models, certain patient-reported NWS were associated with AEs (3.76 [1.89,7.82], P < .01).

Discussion: While significant reductions in post-discharge AEs were not observed, checklist participants experiencing a post-discharge AE were more likely to report NWS and had a shorter duration of injury.

Conclusion: Interventions designed to prompt patients to report NWS may facilitate earlier detection of AEs after discharge.

Clinicaltrials.gov: NCT05232656.
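
For readers unfamiliar with incidence rate ratios, the sketch below computes a crude (unadjusted) ratio with a Wald confidence interval on the log scale. The study reports adjusted ratios from Poisson regression, which this does not reproduce; the function name `incidence_rate_ratio` and the counts are hypothetical.

```python
import math


def incidence_rate_ratio(events_a, exposure_a, events_b, exposure_b, z=1.96):
    """Crude incidence rate ratio (group A vs group B) with a Wald interval.

    exposure_* is the denominator of each rate (for example, number of
    patients or person-days). The interval is built on the log scale, with
    standard error sqrt(1/events_a + 1/events_b).
    """
    if min(events_a, events_b) <= 0 or min(exposure_a, exposure_b) <= 0:
        raise ValueError("Wald interval needs positive counts and exposures")
    irr = (events_a / exposure_a) / (events_b / exposure_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    return irr, irr * math.exp(-z * se_log), irr * math.exp(z * se_log)


# Hypothetical counts: 20 symptom reports among 100 patients vs 10 among 100.
irr, low, high = incidence_rate_ratio(20, 100, 10, 100)
print(f"IRR {irr:.2f} [{low:.2f}, {high:.2f}]")
```

The Wald form is undefined when a group has zero events, so an estimate such as the 0 [0, 0.53] reported above would require a different interval method.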

Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-16 | DOI: 10.1093/jamia/ocae189
Lidan Liu, Lu Liu, Hatem A Wafa, Florence Tydeman, Wanqing Xie, Yanzhong Wang

Objective: This study aims to conduct a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) using speech samples in depression.

Materials and methods: This review included studies reporting diagnostic results of DL algorithms in depression using speech data, published from inception to January 31, 2024, on PubMed, Medline, Embase, PsycINFO, Scopus, IEEE, and Web of Science databases. Pooled accuracy, sensitivity, and specificity were obtained by random-effects models. The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) was used to assess the risk of bias.

Results: A total of 25 studies met the inclusion criteria and 8 of them were used in the meta-analysis. The pooled estimates of accuracy, specificity, and sensitivity for depression detection models were 0.87 (95% CI, 0.81-0.93), 0.85 (95% CI, 0.78-0.91), and 0.82 (95% CI, 0.71-0.94), respectively. When stratified by model structure, the highest pooled diagnostic accuracy was 0.89 (95% CI, 0.81-0.97) in the handcrafted group.

Discussion: To our knowledge, our study is the first meta-analysis on the diagnostic performance of DL for depression detection from speech samples. All studies included in the meta-analysis used convolutional neural network (CNN) models, posing problems in deciphering the performance of other DL algorithms. The handcrafted model performed better than the end-to-end model in speech depression detection.

Conclusions: The application of DL in speech provided a useful tool for depression detection. CNN models with handcrafted acoustic features could help to improve the diagnostic performance.

Protocol registration: The study protocol was registered on PROSPERO (CRD42023423603).
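
Random-effects pooling of the kind mentioned in the methods can be sketched with the DerSimonian-Laird estimator, one common choice. The abstract does not say which estimator or scale the authors used, so this is an assumption; the function name `pool_random_effects` and the per-study values are hypothetical.

```python
import math


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling.

    estimates: per-study effect estimates (for example, accuracies)
    variances: per-study sampling variances of those estimates
    Returns (pooled estimate, its standard error, between-study variance tau^2).
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two studies")
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c) if c > 0 else 0.0
    # Re-weight each study by the inverse of (within + between) variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical accuracies and variances from three studies.
pooled, se, tau2 = pool_random_effects([0.82, 0.90, 0.88], [0.0009, 0.0004, 0.0016])
print(f"{pooled:.3f} (95% CI {pooled - 1.96 * se:.3f}-{pooled + 1.96 * se:.3f})")
```

When the studies agree closely, tau^2 is estimated as zero and the result reduces to the fixed-effect (inverse-variance) average.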

Leveraging artificial intelligence to summarize abstracts in lay language for increasing research accessibility and transparency.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-15 | DOI: 10.1093/jamia/ocae186
Cathy Shyr, Randall W Grout, Nan Kennedy, Yasemin Akdas, Maeve Tischbein, Joshua Milford, Jason Tan, Kaysi Quarles, Terri L Edwards, Laurie L Novak, Jules White, Consuelo H Wilkins, Paul A Harris

Objective: Returning aggregate study results is an important ethical responsibility to promote trust and inform decision making, but the practice of providing results to a lay audience is not widely adopted. Barriers include significant cost and time required to develop lay summaries and scarce infrastructure necessary for returning them to the public. Our study aims to generate, evaluate, and implement ChatGPT 4 lay summaries of scientific abstracts on a national clinical study recruitment platform, ResearchMatch, to facilitate timely and cost-effective return of study results at scale.

Materials and methods: We engineered prompts to summarize abstracts at a literacy level accessible to the public, prioritizing succinctness, clarity, and practical relevance. Researchers and volunteers assessed ChatGPT-generated lay summaries across five dimensions: accuracy, relevance, accessibility, transparency, and harmfulness. We used precision analysis and adaptive random sampling to determine the optimal number of summaries for evaluation, ensuring high statistical precision.

Results: ChatGPT achieved 95.9% (95% CI, 92.1-97.9) accuracy and 96.2% (92.4-98.1) relevance across 192 summary sentences from 33 abstracts based on researcher review. 85.3% (69.9-93.6) of 34 volunteers perceived ChatGPT-generated summaries as more accessible and 73.5% (56.9-85.4) more transparent than the original abstract. None of the summaries were deemed harmful. We expanded ResearchMatch's technical infrastructure to automatically generate and display lay summaries for over 750 published studies that resulted from the platform's recruitment mechanism.

Discussion and conclusion: Implementing AI-generated lay summaries on ResearchMatch demonstrates the potential of a scalable framework generalizable to broader platforms for enhancing research accessibility and transparency.
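
The confidence intervals around the volunteer percentages can be illustrated with a Wilson score interval. The abstract does not name the interval method, and the counts 29 of 34 and 25 of 34 are inferred here from the reported 85.3% and 73.5%; both are assumptions. Under those assumptions the Wilson interval is consistent with the reported ranges. The function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Inferred counts (assumption): 29/34 = 85.3%, 25/34 = 73.5%.
for successes in (29, 25):
    low, high = wilson_interval(successes, 34)
    print(f"{successes / 34:.1%} ({low * 100:.1f}-{high * 100:.1f})")
# 85.3% (69.9-93.6)
# 73.5% (56.9-85.4)
```

With only 34 respondents the intervals are wide, which is worth keeping in mind when reading the point estimates.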

Augmenting clinicians' analytical workflow through task-based integration of data visualizations and algorithmic insights: a user-centered design study.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-14 | DOI: 10.1093/jamia/ocae183
Till Scholich, Shriti Raj, Joyce Lee, Mark W Newman

Objectives: To understand healthcare providers' experiences of using GlucoGuide, a mockup tool that integrates visual data analysis with algorithmic insights to support clinicians' use of patient-generated data from Type 1 diabetes devices.

Materials and methods: This qualitative study was conducted in three phases. In Phase 1, 11 clinicians reviewed data using commercial diabetes platforms in a think-aloud data walkthrough activity followed by semistructured interviews. In Phase 2, GlucoGuide was developed. In Phase 3, the same clinicians reviewed data using GlucoGuide in a think-aloud activity followed by semistructured interviews. Inductive thematic analysis was used to analyze transcripts of the Phase 1 and Phase 3 think-aloud activities and interviews.

Results: Three high-level tasks, 8 sub-tasks, and 4 challenges were identified in Phase 1. In Phase 2, 3 requirements for GlucoGuide were identified. Phase 3 results suggested that clinicians found GlucoGuide easier to use and experienced a lower cognitive burden as compared to the commercial diabetes data reports that were used in Phase 1. Additionally, GlucoGuide addressed the challenges experienced in Phase 1.

Discussion: The study suggests that the knowledge of analytical tasks and task-specific visualization strategies in implementing features of data interfaces can result in tools that lower the perceived burden of engaging with data. Additionally, supporting clinicians in contextualizing algorithmic insights by visual analysis of relevant data can positively influence clinicians' willingness to leverage algorithmic support.

Conclusion: Task-aligned tools that combine multiple data-driven approaches, such as visualization strategies and algorithmic insights, can improve clinicians' experience in reviewing device data.

Citations: 0
Model-based estimation of individual-level social determinants of health and its applications in All of Us.
IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-07-14 DOI: 10.1093/jamia/ocae168
Bo Young Kim, Rebecca Anthopolos, Hyungrok Do, Judy Zhong

Objectives: We introduce a widely applicable model-based approach for estimating individual-level Social Determinants of Health (SDoH) and evaluate its effectiveness using the All of Us Research Program.

Materials and methods: Our approach utilizes aggregated SDoH datasets to estimate individual-level SDoH, demonstrated with examples of no high school diploma (NOHSDP) and no health insurance (UNINSUR) variables. Models are estimated using American Community Survey data and applied to derive individual-level estimates for All of Us participants. We assess concordance between model-based SDoH estimates and self-reported SDoHs in All of Us and examine associations with undiagnosed hypertension and diabetes.

Results: Compared to self-reported SDoHs, the area under the curve for NOHSDP is 0.727 (95% CI, 0.724-0.730) and for UNINSUR is 0.730 (95% CI, 0.727-0.733) among the 329 074 All of Us participants, both significantly higher than aggregated SDoHs. The association between model-based NOHSDP and undiagnosed hypertension is concordant with those estimated using self-reported NOHSDP, with a correlation coefficient of 0.649. Similarly, the association between model-based NOHSDP and undiagnosed diabetes is concordant with those estimated using self-reported NOHSDP, with a correlation coefficient of 0.900.
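
The area under the curve reported above can be read as the probability that a randomly chosen participant who self-reports a condition receives a higher model-based estimate than a randomly chosen participant who does not. The following is a minimal Python sketch of that rank-based calculation. The function name `auc` and the toy scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): model-based estimates vs. self-reported status.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise loop is quadratic in the number of participants, so a cohort of this size would in practice call for a sort-based implementation or a statistics library. The sketch only illustrates what the reported number measures.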

Discussion and conclusion: The model-based SDoH estimation method offers a scalable and easily standardized approach for estimating individual-level SDoHs. Using the All of Us dataset, we demonstrate reasonable concordance between model-based SDoH estimates and self-reported SDoHs, along with consistent associations with health outcomes. Our findings also underscore the critical role of geographic contexts in SDoH estimation and in evaluating the association between SDoHs and health outcomes.

Citations: 0
Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.
IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-07-13 DOI: 10.1093/jamia/ocae177
Sonish Sivarajkumar, Thomas Yu Chow Tam, Haneef Ahamed Mohammad, Samuel Viggiano, David Oniani, Shyam Visweswaran, Yanshan Wang

Objectives: Alzheimer's disease (AD) is the most common form of dementia in the United States. Sleep is one of the lifestyle-related factors that has been shown to be critical for optimal cognitive function in old age. However, there is a lack of research studying the association between sleep and AD incidence. A major bottleneck for conducting such research is that the traditional way to acquire sleep information is time-consuming, inefficient, non-scalable, and limited to patients' subjective experience. We aim to automate the extraction of specific sleep-related patterns, such as snoring, napping, poor sleep quality, daytime sleepiness, night wakings, other sleep problems, and sleep duration, from clinical notes of AD patients. These sleep patterns are hypothesized to play a role in the incidence of AD, providing insight into the relationship between sleep and AD onset and progression.

Materials and methods: A gold standard dataset was created by manually annotating 570 randomly sampled clinical note documents from adSLEEP, a corpus of 192 000 de-identified clinical notes of 7266 AD patients retrieved from the University of Pittsburgh Medical Center (UPMC). We developed a rule-based natural language processing (NLP) algorithm, machine learning models, and large language model (LLM)-based NLP algorithms to automate the extraction of sleep-related concepts, including snoring, napping, sleep problem, bad sleep quality, daytime sleepiness, night wakings, and sleep duration, from the gold standard dataset.
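
To illustrate what a rule-based extractor of this kind might look like, here is a minimal Python sketch that flags a few of the sleep concepts with regular expressions. The patterns, the `SLEEP_PATTERNS` table, and the function name are all hypothetical. The abstract does not describe the study's actual rules, and a realistic system would also need to handle negation and context.

```python
import re
from typing import Dict, List

# Hypothetical patterns for illustration only; not the rules used in the study.
SLEEP_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "snoring": re.compile(r"\bsnor(?:e|es|ed|ing)\b", re.IGNORECASE),
    "napping": re.compile(r"\bnap(?:s|ped|ping)?\b", re.IGNORECASE),
    "daytime_sleepiness": re.compile(
        r"\bdaytime (?:sleepiness|somnolence)\b", re.IGNORECASE
    ),
    "sleep_duration": re.compile(
        r"\bsleeps? (?:about |around )?\d+(?:\.\d+)? (?:hours?|hrs?)\b",
        re.IGNORECASE,
    ),
}


def extract_sleep_concepts(note: str) -> List[str]:
    """Return the names of all concepts whose pattern occurs in the note."""
    return [name for name, pattern in SLEEP_PATTERNS.items() if pattern.search(note)]


print(extract_sleep_concepts("Patient snores loudly and sleeps about 6 hours per night."))
# ['snoring', 'sleep_duration']
```

Because this sketch matches surface strings only, a note such as "denies snoring" would still be flagged, which is one reason production rule sets are considerably more elaborate.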

Results: The annotated dataset of 482 patients comprised a predominantly White (89.2%), older adult population with an average age of 84.7 years, where females represented 64.1%, and a vast majority were non-Hispanic or Latino (94.6%). The rule-based NLP algorithm achieved the best F1 score across all sleep-related concepts. In terms of positive predictive value (PPV), the rule-based NLP algorithm achieved the highest PPV scores for daytime sleepiness (1.00) and sleep duration (1.00), while the machine learning models had the highest PPV for napping (0.95) and bad sleep quality (0.86), and LLAMA2 with finetuning had the highest PPV for night wakings (0.93) and sleep problem (0.89).
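
For readers less familiar with the metrics above: PPV is the share of extracted mentions that are correct, and F1 is the harmonic mean of PPV and sensitivity (recall). A minimal Python sketch of the standard definitions follows. The function name and the counts in the usage line are hypothetical and are not results from the study.

```python
from typing import Tuple


def ppv_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute PPV (precision), recall (sensitivity), and F1 from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Each metric is defined as 0.0 when its denominator is zero.
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return ppv, recall, f1


# Hypothetical counts: 19 correct extractions, 1 spurious, 5 missed.
ppv, recall, f1 = ppv_recall_f1(tp=19, fp=1, fn=5)
print(f"PPV={ppv:.2f} recall={recall:.2f} F1={f1:.2f}")
```

A method can reach a PPV of 1.00 while still missing many mentions, which is why the abstract reports F1 alongside PPV.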

Discussion: Although sleep information is infrequently documented in the clinical notes, the proposed rule-based NLP algorithm and LLM-based NLP algorithms still achieved promising results. In comparison, the machine learning-based approaches did not achieve good results, which is due to the small amount of sleep information in the training data.

Conclusion: The results show that the rule-based NLP algorithm consistently achieved the best performance for all sleep concepts. This study focused on the clinical notes of patients with AD but could be extended to general sleep information extraction for other diseases.

Citations: 0
Measuring cognitive effort using tabular transformer-based language models of electronic health record-based audit log action sequences.
IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-07-13 DOI: 10.1093/jamia/ocae171
Seunghwan Kim, Benjamin C Warner, Daphne Lew, Sunny S Lou, Thomas Kannampallil

Objectives: To develop and validate a novel measure, action entropy, for assessing the cognitive effort associated with electronic health record (EHR)-based work activities.

Materials and methods: EHR-based audit logs of attending physicians and advanced practice providers (APPs) from four surgical intensive care units in 2019 were included. Neural language models (LMs) were trained and validated separately for attendings' and APPs' action sequences. Action entropy was calculated as the cross-entropy associated with the predicted probability of the next action, based on prior actions. To validate the measure, a matched pairs study was conducted to assess the difference in action entropy during known high cognitive effort scenarios, namely, attention switching between patients and to or from the EHR inbox.
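
The cross-entropy idea behind action entropy can be sketched in a few lines of Python: each observed action contributes the negative log of the probability the language model assigned to it, so unexpected actions score higher. Averaging over a sequence, the use of natural logarithms, and the function names are assumptions made here for illustration. The abstract does not specify these details.

```python
import math
from typing import Sequence


def per_action_entropy(prob_of_observed_action: float) -> float:
    """Cross-entropy contribution (in nats) of a single observed action."""
    if not 0.0 < prob_of_observed_action <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log(prob_of_observed_action)


def mean_action_entropy(probs: Sequence[float]) -> float:
    """Average cross-entropy over a sequence of observed actions.

    probs[i] is the probability the model assigned to the action that
    actually occurred at step i, given the preceding actions.
    """
    if not probs:
        raise ValueError("need at least one action")
    return sum(per_action_entropy(p) for p in probs) / len(probs)


# Hypothetical probabilities: a predictable routine vs. an abrupt switch.
routine = mean_action_entropy([0.9, 0.8, 0.95])
switch = mean_action_entropy([0.9, 0.05, 0.1])
print(routine < switch)  # True
```

Under this reading, a sequence the model predicts well yields values near zero, whereas an attention switch that the model did not anticipate drives the value up.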

Results: Sixty-five clinicians performing 5 904 429 EHR-based audit log actions on 8956 unique patients were included. All attention switching scenarios were associated with a higher action entropy compared to non-switching scenarios (P < .001), except for the from-inbox switching scenario among APPs. The highest difference among attendings was for the from-inbox attention switching: Action entropy was 1.288 (95% CI, 1.256-1.320) standard deviations (SDs) higher for switching compared to non-switching scenarios. For APPs, the highest difference was for the to-inbox switching, where action entropy was 2.354 (95% CI, 2.311-2.397) SDs higher for switching compared to non-switching scenarios.

Discussion: We developed an LM-based metric, action entropy, for assessing cognitive burden associated with EHR-based actions. The metric showed discriminant validity and statistical significance when evaluated against known situations of high cognitive effort (ie, attention switching). With additional validation, this metric can potentially be used as a screening tool for assessing behavioral action phenotypes that are associated with higher cognitive burden.

Conclusion: An LM-based action entropy metric, relying on sequences of EHR actions, offers opportunities for assessing cognitive effort in EHR-based workflows.

Citations: 0
Social media use and mental health among older adults with multimorbidity: the role of self-care efficacy.
IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-07-11 DOI: 10.1093/jamia/ocae179
Zuoting Nie, Shiying Gao, Long Chen, Rumei Yang, Linda S Edelman, Katherine A Sward, Yun Jiang, George Demiris

Objectives: To describe the prevalence and trends in the use of social media over time and explore whether social media use is related to better self-care efficacy and thus related to better mental health among United States older adults with multimorbidity.

Materials and methods: Respondents aged 65 years or older with 2 or more chronic conditions from the 2017-2020 Health Information National Trends Survey were analyzed (N = 3341) using weighted descriptive and logistic regression analyses.

Results: Overall, 48% (n = 1674) of older adults with multimorbidity used social media and there was a linear trend in use over time, increasing from 41.1% in 2017 to 46.5% in 2018, and then further up to 51.7% in 2019, and 54.0% in 2020. Users were often younger, married/partnered, and non-Hispanic White with high education and income. Social media use was associated with better self-care efficacy that was further related to better mental health, indicating a significant mediation effect of self-care efficacy in the relationship between social media use and mental health.

Discussion: Although older adults with multimorbidity are a fast-growing population using social media for health, significant demographic disparities exist. While social media use is promising in improving self-care efficacy and thus mental health, relying on social media for the management of multimorbidity might be potentially harmful to those who are not only affected by multimorbidity but also socially disadvantaged (eg, non-White with lower education).

Conclusion: Great effort is needed to address the demographic disparity and ensure health equity when using social media for patient care.

Citations: 0