NPJ Digital Medicine: latest articles

An indirect treatment comparison meta-analysis of digital versus face-to-face cognitive behavior therapy for headache
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-29 | DOI: 10.1038/s41746-024-01264-9
Yan-Bing Huang, Li Lin, Xin-Yu Li, Bo-Zhu Chen, Lu Yuan, Hui Zheng
Cognitive behavioral therapy (CBT) is effective for headache disorders. However, it is unclear whether the emerging digital CBT is noninferior to face-to-face CBT. An indirect treatment comparison (ITC) meta-analysis was conducted to assess the relative effects between them using standardized mean differences (SMDs). Effective sample size (ESS) and required sample size (RSS) were calculated to demonstrate the robustness of the results. Our study found that digital CBT had a similar effect on headache frequency reduction (SMD, 0.12; 95% CI, −2.45 to 2.63) compared with face-to-face CBT. The ESS was 84 participants, whereas the RSS needed to achieve the same power as a noninferiority head-to-head trial was 466 participants. Digital CBT is as effective as face-to-face CBT in preventing headache disorders. Due to the heterogeneity (I² = 94.5%, τ² = 1.83) and the fact that most of the included studies were on migraine prevention, further head-to-head trials are warranted.
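For readers unfamiliar with indirect treatment comparisons, one common approach is the Bucher adjusted indirect comparison, in which two treatments are compared through a shared comparator. The abstract does not say which ITC method the authors used, so the sketch below is only an illustration of the general idea. The function name and the input numbers are hypothetical and are not taken from the paper.

```python
import math


def bucher_indirect(d_ac, se_ac, d_bc, se_bc, z=1.96):
    """Indirect estimate of A vs B through a common comparator C.

    d_ac, d_bc: pooled effects (e.g. SMDs) of A vs C and B vs C.
    se_ac, se_bc: their standard errors.
    Returns (estimate, standard error, (ci_low, ci_high)).
    """
    d_ab = d_ac - d_bc
    # The two pooled effects come from separate sets of trials,
    # so their variances add.
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab, (d_ab - z * se_ab, d_ab + z * se_ab)


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical, not the paper's data).
    est, se, (low, high) = bucher_indirect(-0.50, 0.30, -0.62, 0.40)
    print(f"indirect SMD = {est:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Because the variances add, an indirect estimate is always less precise than either direct comparison it is built from, which is one reason indirect comparisons tend to produce wide confidence intervals.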
Citations: 0
Blending space and time to talk about cancer in extended reality
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-29 | DOI: 10.1038/s41746-024-01262-x
Tamsin J. Robb, Yinan Liu, Braden Woodhouse, Charlotta Windahl, Daniel Hurley, Grant McArthur, Stephen B. Fox, Lisa Brown, Parry Guilford, Alice Minhinnick, Christopher Jackson, Cherie Blenkiron, Kate Parker, Kimiora Henare, Rose McColl, Bianca Haux, Nick Young, Veronica Boyle, Laird Cameron, Sanjeev Deva, Jane Reeve, Cristin G. Print, Michael Davis, Uwe Rieger, Ben Lawrence
We introduce a proof-of-concept extended reality (XR) environment for discussing cancer, presenting genomic information from multiple tumour sites in the context of 3D tumour models generated from CT scans. This tool enhances multidisciplinary discussions. Clinicians and cancer researchers explored its use in oncology, sharing perspectives on XR’s potential for use in molecular tumour boards, clinician-patient communication, and education. XR serves as a universal language, fostering collaborative decision-making in oncology.
Citations: 0
Comparison of NLP machine learning models with human physicians for ASA Physical Status classification
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-28 | DOI: 10.1038/s41746-024-01259-6
Soo Bin Yoon, Jipyeong Lee, Hyung-Chul Lee, Chul-Woo Jung, Hyeonhoon Lee
The American Society of Anesthesiologists Physical Status (ASA-PS) classification system assesses comorbidities before sedation and analgesia, but inconsistencies among raters have hindered its objective use. This study aimed to develop natural language processing (NLP) models to classify ASA-PS using pre-anesthesia evaluation summaries, comparing their performance to human physicians. Data from 717,389 surgical cases in a tertiary hospital (October 2004–May 2023) were split into training, tuning, and test datasets. Board-certified anesthesiologists created reference labels for tuning and test datasets. The NLP models, including ClinicalBigBird, BioClinicalBERT, and Generative Pretrained Transformer 4, were validated against anesthesiologists. The ClinicalBigBird model achieved an area under the receiver operating characteristic curve of 0.915. It outperformed board-certified anesthesiologists with a specificity of 0.901 vs. 0.897, precision of 0.732 vs. 0.715, and F1-score of 0.716 vs. 0.713 (all p < 0.01). This approach will facilitate automatic and objective ASA-PS classification, thereby streamlining the clinical workflow.
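The specificity, precision, and F1-score quoted above all derive from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated one-vs-rest. How the authors aggregated across ASA-PS classes is not stated in the abstract, so the function name and the counts are purely illustrative.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from one-vs-rest confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # recall: true positives found
    specificity = _ratio(tn, tn + fp)   # true negatives correctly rejected
    precision = _ratio(tp, tp + fp)     # positive predictive value
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(binary_metrics(tp=70, fp=30, tn=270, fn=30))
```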
Citations: 0
Deep learning for identifying personal and family history of suicidal thoughts and behaviors from EHRs
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-28 | DOI: 10.1038/s41746-024-01266-7
Prakash Adekkanattu, Al’ona Furmanchuk, Yonghui Wu, Aman Pathak, Braja Gopal Patra, Sarah Bost, Destinee Morrow, Grace Hsin-Min Wang, Yuyang Yang, Noah James Forrest, Yuan Luo, Theresa L. Walunas, Weihsuan Lo-Ciganic, Walid Gelad, Jiang Bian, Yuhua Bao, Mark Weiner, David Oslin, Jyotishman Pathak
Personal and family history of suicidal thoughts and behaviors (PSH and FSH, respectively) are significant risk factors associated with suicides. Research is limited in automatic identification of such data from clinical notes in Electronic Health Records. This study developed deep learning (DL) tools utilizing transformer models (Bio_ClinicalBERT and GatorTron) to detect PSH and FSH in clinical notes derived from three academic medical centers, and compared their performance with a rule-based natural language processing tool. For detecting PSH, the rule-based approach obtained an F1-score of 0.75 ± 0.07, while the Bio_ClinicalBERT and GatorTron DL tools scored 0.83 ± 0.09 and 0.84 ± 0.07, respectively. For detecting FSH, the rule-based approach achieved an F1-score of 0.69 ± 0.11, compared to 0.89 ± 0.10 for Bio_ClinicalBERT and 0.92 ± 0.07 for GatorTron. Across sites, the DL tools identified more than 80% of patients at elevated risk for suicide who remain undiagnosed and untreated.
Citations: 0
A framework for human evaluation of large language models in healthcare derived from literature review
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-28 | DOI: 10.1038/s41746-024-01258-7
Thomas Yu Chow Tam, Sonish Sivarajkumar, Sumit Kapoor, Alisa V. Stolyar, Katelyn Polanska, Karleigh R. McCarthy, Hunter Osterhoudt, Xizhi Wu, Shyam Visweswaran, Sunyang Fu, Piyush Mathur, Giovanni E. Cacciamani, Cong Sun, Yifan Peng, Yanshan Wang
With generative artificial intelligence (GenAI), particularly large language models (LLMs), continuing to make inroads in healthcare, assessing LLMs with human evaluations is essential to assuring safety and effectiveness. This study reviews existing literature on human evaluation methodologies for LLMs in healthcare across various medical specialties and addresses factors such as evaluation dimensions, sample types and sizes, selection and recruitment of evaluators, frameworks and metrics, evaluation process, and statistical analysis type. Our literature review of 142 studies shows gaps in reliability, generalizability, and applicability of current human evaluation practices. To overcome such significant obstacles to healthcare LLM developments and deployments, we propose QUEST, a comprehensive and practical framework for human evaluation of LLMs covering three phases of workflow: Planning, Implementation and Adjudication, and Scoring and Review. QUEST is designed with five proposed evaluation principles: Quality of Information, Understanding and Reasoning, Expression Style and Persona, Safety and Harm, and Trust and Confidence.
Citations: 0
Privacy-preserving large language models for structured medical information retrieval
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-20 | DOI: 10.1038/s41746-024-01233-2
Isabella Catharina Wiest, Dyke Ferber, Jiefu Zhu, Marko van Treeck, Sonja K. Meyer, Radhika Juglan, Zunamys I. Carrero, Daniel Paech, Jens Kleesiek, Matthias P. Ebert, Daniel Truhn, Jakob Nikolas Kather
Most clinical information is encoded as free text, not accessible for quantitative analysis. This study presents an open-source pipeline using the local large language model (LLM) “Llama 2” to extract quantitative information from clinical text and evaluates its performance in identifying features of decompensated liver cirrhosis. The LLM identified five key clinical features in a zero- and one-shot manner from 500 patient medical histories in the MIMIC IV dataset. We compared LLMs of three sizes and various prompt engineering approaches, with predictions compared against ground truth from three blinded medical experts. Our pipeline achieved high accuracy, detecting liver cirrhosis with 100% sensitivity and 96% specificity. High sensitivities and specificities were also yielded for detecting ascites (95%, 95%), confusion (76%, 94%), abdominal pain (84%, 97%), and shortness of breath (87%, 97%) using the 70 billion parameter model, which outperformed smaller versions. Our study successfully demonstrates the capability of locally deployed LLMs to extract clinical information from free text with low hardware requirements.
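To make the evaluation setup concrete, the sketch below scores binary model predictions against a reference label built from three expert annotations. The abstract does not describe how the three experts' judgments were combined, so the majority vote used here is an assumption. The function names and toy data are likewise hypothetical.

```python
def majority_label(votes):
    """True when more than half of the expert votes are positive."""
    return sum(bool(v) for v in votes) * 2 > len(votes)


def sensitivity_specificity(predictions, truth):
    """Sensitivity and specificity of binary predictions against reference labels."""
    pairs = list(zip(predictions, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data: one tuple of three expert votes per patient history.
    expert_votes = [(1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0), (1, 0, 1)]
    truth = [majority_label(v) for v in expert_votes]
    model_output = [True, False, False, False, True]  # e.g. "ascites present?"
    print(sensitivity_specificity(model_output, truth))
```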
Citations: 0
Zero shot health trajectory prediction using transformer
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-19 | DOI: 10.1038/s41746-024-01235-0
Pawel Renc, Yugang Jia, Anthony E. Samir, Jaroslaw Was, Quanzheng Li, David W. Bates, Arkadiusz Sitek
Integrating modern machine learning and clinical decision-making has great promise for mitigating healthcare’s increasing cost and complexity. We introduce the Enhanced Transformer for Health Outcome Simulation (ETHOS), a novel application of the transformer deep-learning architecture for analyzing high-dimensional, heterogeneous, and episodic health data. ETHOS is trained using Patient Health Timelines (PHTs)—detailed, tokenized records of health events—to predict future health trajectories, leveraging a zero-shot learning approach. ETHOS represents a significant advancement in foundation model development for healthcare analytics, eliminating the need for labeled data and model fine-tuning. Its ability to simulate various treatment pathways and consider patient-specific factors positions ETHOS as a tool for care optimization and addressing biases in healthcare delivery. Future developments will expand ETHOS’ capabilities to incorporate a wider range of data types and data sources. Our work demonstrates a pathway toward accelerated AI development and deployment in healthcare.
Citations: 0
Regulatory responses and approval status of artificial intelligence medical devices with a focus on China
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-18 | DOI: 10.1038/s41746-024-01254-x
Yuehua Liu, Wenjin Yu, Tharam Dillon
This paper focuses on how regulatory bodies respond to artificial intelligence (AI)-enabled medical devices. To achieve this, we present a comparative overview of the United States (USA), European Union (EU), and China. Our search in the governmental database identified 59 AI medical devices approved in China as of July 2023. In comparison to the rules-based regulatory approach in China, the approaches in the USA and EU are more standards-oriented.
Citations: 0
A systematic review and meta analysis on digital mental health interventions in inpatient settings
IF 12.4 | Zone 1 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-09-17 | DOI: 10.1038/s41746-024-01252-z
Alexander Diel, Isabel Carolin Schröter, Anna-Lena Frewer, Christoph Jansen, Anita Robitzsch, Gertraud Gradl-Dietsch, Martin Teufel, Alexander Bäuerle
E-mental health (EMH) interventions gain increasing importance in the treatment of mental health disorders. Their outpatient efficacy is well-established. However, research on EMH in inpatient settings remains sparse and lacks a meta-analytic synthesis. This paper presents a meta-analysis on the efficacy of EMH in inpatient settings. Searching multiple databases (PubMed, ScienceGov, PsycInfo, CENTRAL, references), 26 randomized controlled trial (RCT) EMH inpatient studies (n = 6112) with low or medium assessed risk of bias were included. A small significant total effect of EMH treatment was found (g = 0.3). The effect was significant both for blended interventions (g = 0.42) and post-treatment EMH-based aftercare (g = 0.29). EMH treatment yielded significant effects across different patient groups and types of therapy, and the effects remained stable post-treatment. The results show the efficacy of EMH treatment in inpatient settings. The meta-analysis is limited by the small number of included studies.
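A pooled effect size such as the Hedges' g values above is typically obtained with a random-effects model. The sketch below uses the DerSimonian-Laird estimator, one widely used choice. The abstract does not name the estimator the authors used, so this is an assumption, and the function name and inputs are illustrative only.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling of study effects (e.g. Hedges' g).

    effects: per-study effect sizes.
    variances: per-study sampling variances (same order).
    Returns (pooled effect, standard error, tau^2, I^2 in percent).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, tau2, i2


if __name__ == "__main__":
    # Hypothetical study-level inputs, not the trials in the review.
    print(dersimonian_laird([0.10, 0.45, 0.30], [0.04, 0.05, 0.02]))
```

The same routine yields τ² and I², the heterogeneity statistics commonly reported alongside a pooled estimate.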
Citations: 0
A drug mix and dose decision algorithm for individualized type 2 diabetes management
IF 12.4 Zone 1 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-09-17 DOI: 10.1038/s41746-024-01230-5
Mila Nambiar, Yong Mong Bee, Yu En Chan, Ivan Ho Mien, Feri Guretno, David Carmody, Phong Ching Lee, Sing Yi Chia, Nur Nasyitah Mohamed Salim, Pavitra Krishnaswamy
Pharmacotherapy guidelines for type 2 diabetes (T2D) emphasize patient-centered care, but applying this approach effectively in outpatient practice remains challenging. Data-driven treatment optimization approaches could enhance individualized T2D management, but current approaches cannot account for drug-specific and dose-dependent variations in safety and efficacy. We developed and evaluated an AI Drug mix and dose Advisor (AIDA) for glycemic management, using electronic medical records from 107,854 T2D patients in the SingHealth Diabetes Registry. Given a patient’s medical profile, AIDA leverages a predict-then-optimize approach to identify the minimal drug mix and dose changes required to optimize glycemic control, subject to clinical knowledge-based guidelines. On unseen data from large internal, external, and temporal validation sets, AIDA recommendations were estimated to improve post-visit glycated hemoglobin (HbA1c) by an average of 0.40–0.68% over standard of care (P < 0.0001). In qualitative evaluations on 60 diverse cases by a panel of three endocrinologists, AIDA recommendations were mostly rated as reasonable and precise. Finally, AIDA’s ability to account for drug-dose specifics offered several advantages over competing methods, including greater consistency with practice preferences and clinical guidelines for practical but effective options, indication-based treatments, and renal dosing. As AIDA provides drug-dose recommendations to improve outcomes for individual T2D patients, it could be used for clinical decision support at point-of-care, especially in resource-limited settings.
Citations: 0