
Journal of Medical Internet Research: Latest Articles

Toward Guidelines for Designing Holistic Integrated Information Visualizations for Time-Critical Contexts: Systematic Review.
IF 5.8 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2024-11-20 | Vol. 26, e58088 | DOI: 10.2196/58088
Ahmed Mohammed Patel, Weston Baxter, Talya Porat

Background: With the extensive volume of information from various and diverse data sources, it is essential to present information in a way that allows for quick understanding and interpretation. This is particularly crucial in health care, where timely insights into a patient's condition can be lifesaving. Holistic visualizations that integrate multiple data variables into a single visual representation can enhance rapid situational awareness and support informed decision-making. However, despite the existence of numerous guidelines for different types of visualizations, this study reveals that there are currently no specific guidelines or principles for designing holistic integrated information visualizations that enable quick processing and comprehensive understanding of multidimensional data in time-critical contexts. Addressing this gap is essential for enhancing decision-making in time-critical scenarios across various domains, particularly in health care.

Objective: This study aims to establish a theoretical foundation supporting the argument that holistic integrated visualizations are a distinct type of visualization for time-critical contexts and identify applicable design principles and guidelines that can be used to design for such cases.

Methods: We systematically searched the literature for peer-reviewed research on visualization strategies, guidelines, and taxonomies. The literature selection followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. The search was conducted across 6 databases: ACM Digital Library, Google Scholar, IEEE Xplore, PubMed, Scopus, and Web of Science. The search was conducted up to August 2024 using the terms ("visualisations" OR "visualizations") AND ("guidelines" OR "taxonomy" OR "taxonomies"), with studies restricted to the English language.

Results: Of 936 papers, 46 (4.9%) were included in the final review. In total, 48% (22/46) related to providing a holistic understanding and overview of multidimensional data; 28% (13/46) focused on integrated presentation, that is, integrating or combining multidimensional data into a single visual representation; and 35% (16/46) pertained to time and designing for rapid information processing. In total, 65% (30/46) of the papers presented general information visualization or visual communication guidelines and principles. No specific guidelines or principles were found that addressed all the characteristics of holistic, integrated visualizations in time-critical contexts. A summary of the key guidelines and principles from the 46 papers was extracted, collated, and categorized into 60 guidelines that could aid in designing holistic integrated visualizations. These were grouped according to different characteristics identified in the systematic review (eg, gestalt principles, reduction, organization, abstraction, and task complexity) and further condensed into 5 main proposed guidelines.

Conclusions: Holistic integrated information visualizations in time-critical domains are a unique use case requiring a unique set of design guidelines. The 5 main guidelines we propose, derived from existing design theories and guidelines, can serve as a starting point for enabling both holistic and rapid processing of information, supporting better-informed decisions in time-critical contexts.
Citations: 0
Performance of a Full-Coverage Cervical Cancer Screening Program Using on an Artificial Intelligence- and Cloud-Based Diagnostic System: Observational Study of an Ultralarge Population.
IF 5.8 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2024-11-20 | Vol. 26, e51477 | DOI: 10.2196/51477
Lu Ji, Yifan Yao, Dandan Yu, Wen Chen, Shanshan Yin, Yun Fu, Shangfeng Tang, Lan Yao

Background: The World Health Organization has set a global strategy to eliminate cervical cancer, emphasizing the need for cervical cancer screening coverage to reach 70%. In response, China has developed an action plan to accelerate the elimination of cervical cancer, with Hubei province implementing China's first provincial full-coverage screening program using an artificial intelligence (AI) and cloud-based diagnostic system.

Objective: This study aimed to evaluate the performance of AI technology in this full-coverage screening program. The evaluation indicators included accessibility, screening efficiency, diagnostic quality, and program cost.

Methods: Characteristics of 1,704,461 individuals screened from July 2022 to January 2023 were used to analyze accessibility and AI screening efficiency. A random sample of 220 individuals was used for external diagnostic quality control. The costs of different participating screening institutions were assessed.

Results: Cervical cancer screening services were extended to all administrative districts, especially in rural areas. Rural women had the highest participation rate at 67.54% (1,147,839/1,699,591). Approximately 1.7 million individuals were screened, achieving a cumulative coverage of 13.45% in about 6 months. Full-coverage programs could be achieved by AI technology in approximately 1 year, which was 87.5 times more efficient than the manual reading of slides. The sample compliance rate was as high as 99.1%, and compliance rates for positive, negative, and pathology biopsy reviews exceeded 96%. The cost of this program was CN ¥49 (the average exchange rate in 2022 is as follows: US $1=CN ¥6.7261) per person, with the primary screening institution and the third-party testing institute receiving CN ¥19 and ¥27, respectively.
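The arithmetic behind the figures quoted in the Results can be checked with a short Python sketch. The variable names are illustrative and do not come from the study; the numbers are those reported in the abstract.

```python
# Figures as reported in the abstract; variable names are illustrative only.
rural_participants = 1_147_839
participants_in_denominator = 1_699_591  # denominator given with the 67.54% rate
cost_cny_per_person = 49
cny_per_usd_2022 = 6.7261                # average 2022 rate quoted in the abstract
primary_institution_cny = 19
third_party_institute_cny = 27

# Rural participation rate as a percentage.
rural_share_pct = 100 * rural_participants / participants_in_denominator

# Per-person program cost converted to US dollars at the quoted rate.
cost_usd_per_person = cost_cny_per_person / cny_per_usd_2022

# Portion of the per-person cost not attributed to either institution.
unattributed_cny = (
    cost_cny_per_person - primary_institution_cny - third_party_institute_cny
)

print(f"Rural share: {rural_share_pct:.2f}%")
print(f"Cost per person: US ${cost_usd_per_person:.2f}")
print(f"Not attributed in the abstract: CN ¥{unattributed_cny}")
```

The rural share reproduces the reported 67.54%, and CN ¥49 corresponds to roughly US $7.29 per person. The two institutional shares sum to CN ¥46, so CN ¥3 per person is not attributed in the abstract.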

Conclusions: AI-assisted diagnosis has proven to be accessible, efficient, reliable, and low cost, which could support the implementation of full-coverage screening programs, especially in areas with insufficient health resources. AI technology served as a crucial tool for rapidly and effectively increasing screening coverage, which would accelerate the achievement of the World Health Organization's goals of eliminating cervical cancer.

Citations: 0
Effectiveness of Digital Health Interventions in Promoting Physical Activity Among College Students: Systematic Review and Meta-Analysis.
IF 5.8 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2024-11-20 | Vol. 26, e51714 | DOI: 10.2196/51714
Siyuan Bi, Junfeng Yuan, Yanling Wang, Wenxin Zhang, Luqin Zhang, Yongjuan Zhang, Rui Zhu, Lin Luo

Background: Recent studies offer conflicting conclusions about the effectiveness of digital health interventions in changing physical activity behaviors. In addition, research focusing on digital health interventions for college students remains relatively scarce.

Objective: This study aims to examine the impact of digital health interventions on physical activity behaviors among college students, using objective measures as outcome indicators.

Methods: In accordance with the 2020 PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a comprehensive literature search was conducted across several databases, including MEDLINE (PubMed), Web of Science, Cochrane Library, and EBSCO (CINAHL Plus with full text), to identify relevant intervention studies published up to June 6, 2023. The inclusion criteria specified studies that examined the quantitative relationships between digital health interventions and physical activity among adults aged 18 years to 29 years, focusing on light physical activity (LPA), moderate to vigorous physical activity (MVPA), sedentary time (ST), or steps. Non-randomized controlled trials were excluded. The quality of the studies was assessed using the Cochrane Risk of Bias tool. Results were synthesized both narratively and quantitatively, where applicable. When sufficient homogeneity was found among studies, a random-effects model was used for meta-analysis to account for variability.

Results: In total, 8 studies, encompassing 569 participants, were included in the analysis. The primary outcomes measured were LPA, MVPA, ST, and steps. Among these studies, 3 reported on LPA, 5 on MVPA, 5 on ST, and 3 on steps. The meta-analysis revealed a significant increase in steps for the intervention group compared with the control group (standardized mean difference [SMD] 0.64, 95% CI 0.37-0.92; P<.001). However, no significant differences were observed between the intervention and control groups regarding LPA (SMD -0.08, 95% CI -0.32 to 0.16; P=.51), MVPA (SMD 0.02, 95% CI -0.19 to 0.22; P=.88), and ST (SMD 0.03, 95% CI -0.18 to 0.24; P=.78).
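As a rough consistency check on the pooled estimates above, a two-sided P value can be approximated from each standardized mean difference and its 95% CI. The sketch below assumes a symmetric normal (Wald) interval, which the abstract does not state; `p_from_ci` is a hypothetical helper and not part of the study's analysis.

```python
from math import erfc, sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def p_from_ci(estimate: float, lower: float, upper: float) -> float:
    """Approximate two-sided P value from an estimate and its 95% CI.

    Assumes the interval is estimate +/- Z_95 * SE (normal approximation).
    """
    standard_error = (upper - lower) / (2 * Z_95)
    z = estimate / standard_error
    # Two-sided tail probability of the standard normal: 2 * (1 - Phi(|z|)).
    return erfc(abs(z) / sqrt(2))


# (SMD, lower bound, upper bound) as reported in the abstract.
reported = {
    "steps": (0.64, 0.37, 0.92),
    "LPA": (-0.08, -0.32, 0.16),
    "MVPA": (0.02, -0.19, 0.22),
    "ST": (0.03, -0.18, 0.24),
}

for outcome, (smd, lower, upper) in reported.items():
    print(f"{outcome}: P ~ {p_from_ci(smd, lower, upper):.3f}")
```

Under this assumption the recovered values agree with the reported P<.001 for steps, P=.51 for LPA, and P=.78 for ST. The MVPA value comes out at about .85 rather than the reported .88, a gap that is plausible given that the inputs are rounded to 2 decimal places.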

Conclusions: Digital health interventions are effective in increasing steps among college students; however, their effects on LPA, MVPA, and sedentary behavior are limited.

Trial registration: PROSPERO CRD42024533180; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=533180.

Citations: 0
Evaluation Framework of Large Language Models in Medical Documentation: Development and Usability Study.
IF 5.8 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2024-11-20 | Vol. 26, e58329 | DOI: 10.2196/58329
Junhyuk Seo, Dasol Choi, Taerim Kim, Won Chul Cha, Minha Kim, Haanju Yoo, Namkee Oh, YongJin Yi, Kye Hwa Lee, Edward Choi

Background: The advancement of large language models (LLMs) offers significant opportunities for health care, particularly in the generation of medical documentation. However, challenges related to ensuring the accuracy and reliability of LLM outputs, coupled with the absence of established quality standards, have raised concerns about their clinical application.

Objective: This study aimed to develop and validate an evaluation framework for assessing the accuracy and clinical applicability of LLM-generated emergency department (ED) records, aiming to enhance artificial intelligence integration in health care documentation.

Methods: We organized the Healthcare Prompt-a-thon, a competitive event designed to explore the capabilities of LLMs in generating accurate medical records. The event involved 52 participants who generated 33 initial ED records using HyperCLOVA X, a Korean-specialized LLM. We applied a dual evaluation approach. First, clinical evaluation: 4 medical professionals evaluated the records using a 5-point Likert scale across 5 criteria: appropriateness, accuracy, structure/format, conciseness, and clinical validity. Second, quantitative evaluation: We developed a framework to categorize and count errors in the LLM outputs, identifying 7 key error types. Statistical methods, including Pearson correlation and intraclass correlation coefficients (ICC), were used to assess consistency and agreement among evaluators.

Results: The clinical evaluation demonstrated strong interrater reliability, with ICC values ranging from 0.653 to 0.887 (P<.001), and a test-retest reliability Pearson correlation coefficient of 0.776 (P<.001). Quantitative analysis revealed that invalid generation errors were the most common, constituting 35.38% of total errors, while structural malformation errors had the most significant negative impact on the clinical evaluation score (Pearson r=-0.654; P<.001). A strong negative correlation was found between the number of quantitative errors and clinical evaluation scores (Pearson r=-0.633; P<.001), indicating that higher error rates corresponded to lower clinical acceptability.
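The two statistics named above can be sketched in plain Python. The abstract does not say which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is an assumption here. The function names and all data values are hypothetical and are not the study's records.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per rated record and one column per rater.
    The choice of this ICC form is an assumption, not taken from the study.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-record mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical data: error counts per record and mean clinical scores.
error_counts = [0, 1, 2, 3, 5, 8]
clinical_scores = [4.8, 4.5, 4.1, 3.9, 3.0, 2.2]
print(f"Pearson r = {pearson_r(error_counts, clinical_scores):.3f}")

# Hypothetical 5-point Likert scores: 5 records rated by 4 evaluators.
likert = [[5, 4, 5, 4], [3, 3, 4, 3], [2, 2, 3, 2], [4, 4, 4, 5], [1, 2, 2, 1]]
print(f"ICC(2,1) = {icc_2_1(likert):.3f}")
```

A negative `pearson_r` on data like these mirrors the direction of the reported relationship, where more errors go with lower clinical scores. The ICC function takes one row per record so that a 33-record, 4-evaluator score table could be passed in directly.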

Conclusions: Our research provides robust support for the reliability and clinical acceptability of the proposed evaluation framework. It underscores the framework's potential to mitigate clinical burdens and foster the responsible integration of artificial intelligence technologies in health care, suggesting a promising direction for future research and practical applications in the field.

Citations: 0
The Challenges and Lessons Learned Building a New UK Infrastructure for Finding and Accessing Population-Wide COVID-19 Data for Research and Public Health Analysis: The CO-CONNECT Project.
IF 5.8 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2024-11-20 | Vol. 26, e50235 | DOI: 10.2196/50235
Emily Jefferson, Gordon Milligan, Jenny Johnston, Shahzad Mumtaz, Christian Cole, Joseph Best, Thomas Charles Giles, Samuel Cox, Erum Masood, Scott Horban, Esmond Urwin, Jillian Beggs, Antony Chuter, Gerry Reilly, Andrew Morris, David Seymour, Susan Hopkins, Aziz Sheikh, Philip Quinlan

The COVID-19-Curated and Open Analysis and Research Platform (CO-CONNECT) project worked with 22 organizations across the United Kingdom to build a federated platform, enabling researchers to instantaneously and dynamically query federated datasets to find relevant data for their study. Finding relevant data takes time and effort, reducing the efficiency of research. Although data controllers could understand the value of such a system, there were significant challenges and delays in setting up the platform in response to COVID-19. This paper aims to present the challenges and lessons learned from the CO-CONNECT project to support other similar initiatives in the future. The project encountered many challenges, including the impacts of lockdowns on collaboration, understanding the new architecture, competing demands on people's time during a pandemic, data governance approvals, different levels of technical capabilities, data transformation to a common data model, access to granular-level laboratory data, and how to engage public and patient representatives meaningfully on a highly technical project. To overcome these challenges, we developed a range of methods to support data partners such as explainer videos; regular, short, "touch base" videoconference calls; drop-in workshops; live demos; and a standardized technical onboarding documentation pack. A 4-stage data governance process emerged. The patient and public representatives were fully integrated team members. Persistence, patience, and understanding were key. We make 8 recommendations to change the landscape for future similar initiatives. The new architecture and processes developed are being built upon for non-COVID-19-related data, providing an infrastructural legacy.
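The idea of querying federated datasets to find relevant data can be illustrated with a minimal Python sketch. It is a generic illustration of federated counting and not the CO-CONNECT architecture: `DataPartner`, `federated_count`, the record fields, and the site names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DataPartner:
    """Hypothetical stand-in for one organization hosting its own dataset."""

    name: str
    records: list = field(default_factory=list)  # row-level data stays here

    def count_matching(self, predicate: Callable[[dict], bool]) -> int:
        # Only an aggregate count is returned, never the rows themselves.
        return sum(1 for record in self.records if predicate(record))


def federated_count(partners, predicate):
    """Send the same query to every partner and collect per-partner counts."""
    return {partner.name: partner.count_matching(predicate) for partner in partners}


def over_65_and_positive(record: dict) -> bool:
    """Example cohort definition a researcher might explore."""
    return record["age"] >= 65 and record["pcr_positive"]


partners = [
    DataPartner("site_a", [{"age": 34, "pcr_positive": True},
                           {"age": 71, "pcr_positive": False}]),
    DataPartner("site_b", [{"age": 68, "pcr_positive": True},
                           {"age": 80, "pcr_positive": True}]),
]

counts = federated_count(partners, over_65_and_positive)
print(counts)
```

In this sketch the researcher learns only how many matching records each site holds, which is enough to decide where to apply for access while the data remain with each partner.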

{"title":"The Challenges and Lessons Learned Building a New UK Infrastructure for Finding and Accessing Population-Wide COVID-19 Data for Research and Public Health Analysis: The CO-CONNECT Project.","authors":"Emily Jefferson, Gordon Milligan, Jenny Johnston, Shahzad Mumtaz, Christian Cole, Joseph Best, Thomas Charles Giles, Samuel Cox, Erum Masood, Scott Horban, Esmond Urwin, Jillian Beggs, Antony Chuter, Gerry Reilly, Andrew Morris, David Seymour, Susan Hopkins, Aziz Sheikh, Philip Quinlan","doi":"10.2196/50235","DOIUrl":"https://doi.org/10.2196/50235","url":null,"abstract":"<p><p>The COVID-19-Curated and Open Analysis and Research Platform (CO-CONNECT) project worked with 22 organizations across the United Kingdom to build a federated platform, enabling researchers to instantaneously and dynamically query federated datasets to find relevant data for their study. Finding relevant data takes time and effort, reducing the efficiency of research. Although data controllers could understand the value of such a system, there were significant challenges and delays in setting up the platform in response to COVID-19. This paper aims to present the challenges and lessons learned from the CO-CONNECT project to support other similar initiatives in the future. The project encountered many challenges, including the impacts of lockdowns on collaboration, understanding the new architecture, competing demands on people's time during a pandemic, data governance approvals, different levels of technical capabilities, data transformation to a common data model, access to granular-level laboratory data, and how to engage public and patient representatives meaningfully on a highly technical project. To overcome these challenges, we developed a range of methods to support data partners such as explainer videos; regular, short, \"touch base\" videoconference calls; drop-in workshops; live demos; and a standardized technical onboarding documentation pack. 
A 4-stage data governance process emerged. The patient and public representatives were fully integrated team members. Persistence, patience, and understanding were key. We make 8 recommendations to change the landscape for future similar initiatives. The new architecture and processes developed are being built upon for non-COVID-19-related data, providing an infrastructural legacy.</p>","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"26 ","pages":"e50235"},"PeriodicalIF":5.8,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142682040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Impact of Patient Access to Electronic Health Records on Health Care Engagement: Systematic Review.
IF 5.8 Zone 2 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-20 DOI: 10.2196/56473
Dalia Alomar, Maryam Almashmoum, Iliada Eleftheriou, Pauline Whelan, John Ainsworth

Background: Health information technologies, including electronic health records (EHRs), have revolutionized health care delivery. These technologies promise to enhance the efficiency and quality of care through improved patient health information management. Despite the transformative potential of EHRs, the extent to which patient access contributes to increased engagement with health care services within different clinical settings remains a distinct and underexplored facet.

Objective: This systematic review aims to investigate the impact of patient access to EHRs on health care engagement. Specifically, we seek to determine whether providing patients with access to their EHRs contributes to improved engagement with health care services.

Methods: A comprehensive systematic search was conducted across various international databases, including Ovid MEDLINE, Embase, PsycINFO, and CINAHL, to identify relevant studies published from January 1, 2010, to November 15, 2023. The search used a combination of keywords and Medical Subject Heading terms related to patient access to electronic health records, patient engagement, and health care services. Studies were included if they assessed the impact of patient access to EHRs on health care engagement and provided quantitative or qualitative evidence of that impact. The guidelines of the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 statement were followed for study selection, data extraction, and quality assessment. The included studies were assessed for quality using the Mixed Methods Appraisal Tool, and the results were reported using a narrative synthesis.

Results: The initial search from the databases yielded 1737 studies, to which, after scanning their reference lists, we added 10 studies. Of these 1747 studies, 18 (1.03%) met the inclusion criteria for the final review. The synthesized evidence from these studies revealed a positive relationship between patient access to EHRs and health care engagement, addressing 6 categories of health care engagement dimensions and outcomes, including treatment adherence and self-management, patient involvement and empowerment, health care communication and relationship, patient satisfaction and health outcomes, use of health care resources, and usability concerns and barriers.

Conclusions: The findings suggested a positive association between patient access to EHRs and health care engagement. The implications of these findings for health care providers, policy makers, and patients should be considered, highlighting the potential benefits and challenges associated with implementing and promoting patient access to EHRs. Further research directions have been proposed to deepen our understanding of this dynamic relationship.

Citations: 0
Online Depression Communities as a Complementary Approach to Improving the Attitudes of Patients With Depression Toward Medication Adherence: Cross-Sectional Survey Study.
IF 5.8 Zone 2 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-19 DOI: 10.2196/56166
Runnan Chen, Xiaorong Fu, Mochi Liu, Ke Liao, Lifei Bai

Background: Lack of adherence to prescribed medication is common among patients with depression in China, posing serious challenges to the health care system. Online health communities have been found to be effective in enhancing patient compliance. However, empirical evidence supporting this effect in the context of depression treatment is absent, and the influence of online health community content on patients' attitudes toward medication adherence is also underexplored.

Objective: This study aims to explore whether online depression communities (ODCs) can help ameliorate the problem of poor medication taking among patients with depression. Drawing on the stimulus-organism-response and feelings-as-information theories, we established a research model to examine the influence of useful institution-generated content (IGC) and positive user-generated content (UGC) on attitudes toward medication adherence when combined with the mediating role of perceived social support, perceived value of antidepressants, and the moderating role of hopelessness.

Methods: A cross-sectional questionnaire survey method was used in this research. Participants were recruited from various Chinese ODCs, generating data for a main study and 2 robustness checks. Hierarchical multiple regression analyses and bootstrapping analyses were adopted as the primary methods to test the hypotheses.

Results: We received 1515 valid responses in total, contributing to 5 different datasets: model IGC (n=353, 23.3%), model UGC (n=358, 23.63%), model IGC+UGC (n=270, 17.82%), model IGC-B (n=266, 17.56%), and model UGC-B (n=268, 17.69%). Models IGC and UGC were used for the main study. Model IGC+UGC was used for robustness check A. Models IGC-B and UGC-B were used for robustness check B. Useful IGC and positive UGC were proven to have a positive impact on the attitudes of patients with depression toward medication adherence through the mediations of perceived social support and perceived value of antidepressants. The findings corroborated the role of hopelessness in weakening or even negating the positive effects of ODC content on the attitudes of patients with depression toward medication adherence.

Conclusions: This study provides the first empirical evidence demonstrating the relationship between ODC content and attitudes toward medication adherence, through which we offer a novel solution to the problem of poor medication adherence among patients with depression in China. Our findings also provide suggestions about how to optimize this new approach: health care practitioners should generate online content that precisely matches the informational needs of patients with depression, and ODC service providers should endeavor to regulate the community atmosphere. Nonetheless, we warn that ODC interventions cannot be used as the only approach to addressing the problem of poor medication taking among patients with severe depressive symptoms.

Citations: 0
Investigating the Effectiveness of Technology-Based Distal Interventions for Postpartum Depression and Anxiety: Systematic Review and Meta-Analysis.
IF 5.8 Zone 2 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-19 DOI: 10.2196/53236
Sarah P Brocklehurst, Alyssa R Morse, Tegan Cruwys, Philip J Batterham, Liana Leach, Alysia M Robertson, Aseel Sahib, Colette T Burke, Jessica Nguyen, Alison L Calear

Background: Postpartum anxiety and depression are common in new parents. While effective interventions exist, they are often delivered in person, which can be a barrier for some parents seeking help. One approach to overcoming these barriers is the delivery of evidence-based self-help interventions via websites, smartphone apps, and other digital media.

Objective: This study aims to evaluate the effectiveness of technology-based distal interventions in reducing or preventing symptoms of postpartum depression or anxiety in male and female birth and adoptive parents, explore the effectiveness of technology-based distal interventions in increasing social ties, and determine the level of adherence to and satisfaction with technology-based distal interventions.

Methods: A systematic review and series of meta-analyses were conducted. Three electronic bibliographic databases (PsycINFO, PubMed, and Cochrane Library) were searched for randomized controlled trials evaluating technology-based distal interventions for postpartum depression or anxiety in birth and adoptive parents. Searches were updated on August 1, 2023, before conducting the final meta-analyses. Data on trial characteristics, effectiveness, adherence, satisfaction, and quality were extracted. Screening and data extraction were conducted by 2 reviewers. Risk of bias was assessed using the Joanna Briggs Institute quality rating scale for randomized controlled trials. Studies were initially synthesized qualitatively. Where possible, studies were also quantitatively synthesized through 5 meta-analyses.

Results: Overall, 18 articles met the inclusion criteria for the systematic review, with 14 (78%) providing sufficient data for a meta-analysis. A small significant between-group effect on depression favored the intervention conditions at the postintervention (Cohen d=-0.28, 95% CI -0.41 to -0.15; P<.001) and follow-up (Cohen d=-0.27, 95% CI -0.52 to -0.02; P=.03) time points. A small significant effect on anxiety also favored the intervention conditions at the postintervention time point (Cohen d=-0.29, 95% CI -0.48 to -0.10; P=.002), with a medium effect at follow-up (Cohen d=-0.47, 95% CI -0.88 to -0.05; P=.03). The effect on social ties was not significant at the postintervention time point (Cohen d=0.04, 95% CI -0.12 to 0.21; P=.61). Effective interventions tended to be web-based cognitive behavioral therapy programs with reminders. Adherence varied considerably between studies, whereas satisfaction tended to be high for most studies.

Conclusions: Technology-based distal interventions are effective in reducing symptoms of postpartum depression and anxiety in birth mothers. Key limitations of the reviewed evidence include heterogeneity in outcome measures, studies being underpowered to detect modest effects, and the exclusion of key populations from the evidence base. More research is needed with birth fathers and adoptive parents to better establish the effectiveness of interventions in these populations and to further assess the effect of technology-based distal interventions on social ties.

Trial registration: PROSPERO CRD42021290525; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=290525

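
As a reading aid for the effect sizes quoted in the Results above, the sketch below recovers an approximate standard error and two-sided P value from each reported Cohen d and its 95% CI. It assumes the intervals are symmetric normal-approximation (Wald) intervals, which the abstract does not state; the helper names `se_from_ci` and `two_sided_p` are illustrative and are not taken from the study.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def se_from_ci(lower: float, upper: float) -> float:
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2 * Z_95)


def two_sided_p(estimate: float, lower: float, upper: float) -> float:
    """Two-sided normal-approximation P value for estimate / SE."""
    z = estimate / se_from_ci(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


# (Cohen d, CI lower, CI upper) as quoted in the abstract.
effects = {
    "depression, postintervention": (-0.28, -0.41, -0.15),
    "depression, follow-up": (-0.27, -0.52, -0.02),
    "anxiety, postintervention": (-0.29, -0.48, -0.10),
    "anxiety, follow-up": (-0.47, -0.88, -0.05),
    "social ties, postintervention": (0.04, -0.12, 0.21),
}

for label, (d, lo, hi) in effects.items():
    se = se_from_ci(lo, hi)
    p = two_sided_p(d, lo, hi)
    print(f"{label}: SE ~ {se:.3f}, P ~ {p:.4f}")
```

Because the published values are rounded to two decimals, and the pooled estimates may have been computed with a different method, the recomputed P values should be expected to agree with the reported ones only approximately.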
Citations: 0
Using Large Language Models to Abstract Complex Social Determinants of Health From Original and Deidentified Medical Notes: Development and Validation Study.
IF 5.8 Zone 2 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-19 DOI: 10.2196/63445
Alexandra Ralevski, Nadaa Taiyab, Michael Nossal, Lindsay Mico, Samantha Piekos, Jennifer Hadlock

Background: Social determinants of health (SDoH) such as housing insecurity are known to be intricately linked to patients' health status. More efficient methods for abstracting structured data on SDoH can help accelerate the inclusion of exposome variables in biomedical research and support health care systems in identifying patients who could benefit from proactive outreach. Large language models (LLMs) developed from Generative Pre-trained Transformers (GPTs) have shown potential for performing complex abstraction tasks on unstructured clinical notes.

Objective: Here, we assess the performance of GPTs in identifying temporal aspects of housing insecurity and compare results between original and deidentified notes.

Methods: We compared the ability of GPT-3.5 and GPT-4 to identify instances of both current and past housing instability, as well as general housing status, from 25,217 notes from 795 pregnant women. Results were compared with manual abstraction, a named entity recognition model, and regular expressions.

Results: Compared with GPT-3.5 and the named entity recognition model, GPT-4 had the highest performance and had a much higher recall (0.924) than human abstractors (0.702) in identifying patients experiencing current or past housing instability, although precision was lower (0.850) compared with human abstractors (0.971). GPT-4's precision improved slightly (0.936 original, 0.939 deidentified) on deidentified versions of the same notes, while recall dropped (0.781 original, 0.704 deidentified).

Conclusions: This work demonstrates that while manual abstraction is likely to yield slightly more accurate results overall, LLMs can provide a scalable, cost-effective solution with the advantage of greater recall. This could support semiautomated abstraction, but given the potential risk for harm, human review would be essential before using results for any patient engagement or care decisions. Furthermore, recall was lower when notes were deidentified prior to LLM abstraction.
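
To make the precision and recall trade-off in the Results easier to compare, the following sketch combines the quoted values into an F1 score (the harmonic mean of precision and recall). F1 is not reported in the abstract; it is computed here only as an illustration, and the function name `f1_score` is an assumption for this sketch.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs quoted in the Results paragraph for
# identifying patients with current or past housing instability.
reported = {
    "GPT-4": (0.850, 0.924),
    "human abstractors": (0.971, 0.702),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
# GPT-4: F1 = 0.885
# human abstractors: F1 = 0.815
```

F1 weights precision and recall equally, which may not match the priorities of a given use case; where false positives carry a risk of harm, precision may deserve more weight than this summary gives it.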

Citations: 0
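
The precision and recall figures quoted in the abstract above follow the standard definitions. As a minimal sketch (the function name and the counts are illustrative assumptions, not data or code from the study), they can be computed from true-positive, false-positive, and false-negative counts like this:

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion counts.

    precision = TP / (TP + FP): share of flagged patients who were truly positive.
    recall    = TP / (TP + FN): share of truly positive patients who were flagged.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy counts chosen for illustration only; they are NOT taken from the study.
toy_counts = {"tp": 6, "fp": 2, "fn": 4}

if __name__ == "__main__":
    p, r = precision_recall(**toy_counts)
    print(f"precision={p:.3f} recall={r:.3f}")
```

Under these definitions, a method with higher recall but lower precision misses fewer truly positive patients at the cost of more false alarms, which is the trade-off the abstract reports for GPT-4 relative to human abstractors.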
Mitigating Cognitive Biases in Clinical Decision-Making Through Multi-Agent Conversations Using Large Language Models: Simulation Study.
IF 5.8 · Zone 2, Medicine · Q1 Health Care Sciences & Services · Pub Date: 2024-11-19 · DOI: 10.2196/59439
Yuhe Ke, Rui Yang, Sui An Lie, Taylor Xin Yi Lim, Yilin Ning, Irene Li, Hairil Rizal Abdullah, Daniel Shu Wei Ting, Nan Liu

Background: Cognitive biases in clinical decision-making significantly contribute to errors in diagnosis and suboptimal patient outcomes. Addressing these biases presents a formidable challenge in the medical field.

Objective: This study aimed to explore the role of large language models (LLMs) in mitigating these biases through the use of a multi-agent framework. We simulated clinical decision-making processes through multi-agent conversation and evaluated the framework's efficacy in improving diagnostic accuracy compared with humans.

Methods: A total of 16 published and unpublished case reports in which cognitive biases resulted in misdiagnoses were identified from the literature. In the multi-agent framework, we leveraged GPT-4 (OpenAI) to facilitate interactions among different simulated agents to replicate clinical team dynamics. Each agent was assigned a distinct role: (1) making the final diagnosis after considering the discussions, (2) acting as a devil's advocate to correct confirmation and anchoring biases, (3) serving as a field expert in the required medical subspecialty, (4) facilitating discussions to mitigate premature closure bias, and (5) recording and summarizing findings. We tested varying combinations of these agents within the framework to determine which configuration yielded the highest rate of correct final diagnoses. Each scenario was repeated 5 times for consistency. The accuracy of the initial diagnoses and of the final differential diagnoses was evaluated, and comparisons with human-generated answers were made using the Fisher exact test.
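
The role-based setup described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the role prompts, the `run_discussion` function, the round-robin turn order, and the `stub_llm` stand-in are all hypothetical, and a real system would replace the stub with a call to an actual model.

```python
from typing import Callable, Dict, List

# Roles paraphrased from the Methods paragraph; the prompt wording is hypothetical.
ROLES: Dict[str, str] = {
    "decision_maker": "Make the final diagnosis after considering the discussion.",
    "devils_advocate": "Challenge the leading diagnosis to counter confirmation and anchoring biases.",
    "field_expert": "Contribute knowledge from the relevant medical subspecialty.",
    "facilitator": "Keep the discussion open to mitigate premature closure.",
    "recorder": "Record and summarize the findings.",
}

# An LLM is modeled as a function: (system_prompt, conversation_so_far) -> reply.
LLM = Callable[[str, str], str]


def run_discussion(case: str, llm: LLM, agents: List[str], rounds: int = 2) -> str:
    """Let the selected agents take turns, then ask the decision maker for a diagnosis."""
    transcript = [f"CASE: {case}"]
    discussants = [name for name in agents if name != "decision_maker"]
    for _ in range(rounds):
        for name in discussants:
            reply = llm(ROLES[name], "\n".join(transcript))
            transcript.append(f"{name}: {reply}")
    # The decision maker speaks last, after seeing the whole discussion.
    return llm(ROLES["decision_maker"], "\n".join(transcript))


def stub_llm(system_prompt: str, conversation: str) -> str:
    """Stand-in for a real model call (assumption: no network access here)."""
    return f"(reply from agent instructed to: {system_prompt})"


if __name__ == "__main__":
    team = ["decision_maker", "devils_advocate", "field_expert", "facilitator"]
    print(run_discussion("Hypothetical case summary.", stub_llm, team))
```

Passing different `agents` lists to `run_discussion` corresponds to testing different combinations of roles, and calling it several times per case mirrors the repeated runs mentioned in the Methods.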

Results: A total of 240 responses were evaluated (3 different multi-agent frameworks). The initial diagnosis had an accuracy of 0% (0/80). However, following multi-agent discussions, the accuracy for the top 2 differential diagnoses increased to 76% (61/80) for the best-performing multi-agent framework (Framework 4-C). This was significantly higher compared with the accuracy achieved by human evaluators (odds ratio 3.49; P=.002).
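
For readers unfamiliar with the Fisher exact test used for the comparison above, here is a minimal pure-Python sketch of the two-sided test on a 2x2 table. The function name is an assumption. The framework row (61 correct, 19 incorrect) comes from the abstract; the human row is a hypothetical placeholder because the abstract does not report those counts, so this example is not expected to reproduce the reported odds ratio or P value.

```python
from math import comb
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Two-sided Fisher exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, P value). The P value sums the hypergeometric
    probabilities of all tables with the same margins that are no more likely
    than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


if __name__ == "__main__":
    framework_correct, framework_wrong = 61, 19  # from the abstract (61/80)
    human_correct, human_wrong = 40, 40          # HYPOTHETICAL placeholder counts
    odds, p = fisher_exact_two_sided(
        framework_correct, framework_wrong, human_correct, human_wrong
    )
    print(f"odds ratio={odds:.2f}, P={p:.4f}")
```

In practice a library routine such as `scipy.stats.fisher_exact` would normally be preferred; the hand-written version is shown only to make the calculation explicit.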

Conclusions: The multi-agent framework demonstrated an ability to re-evaluate and correct misconceptions, even in scenarios with misleading initial investigations. In addition, the LLM-driven, multi-agent conversation framework shows promise in enhancing diagnostic accuracy in diagnostically challenging medical scenarios.

Citation: Yuhe Ke, Rui Yang, Sui An Lie, Taylor Xin Yi Lim, Yilin Ning, Irene Li, Hairil Rizal Abdullah, Daniel Shu Wei Ting, Nan Liu. Mitigating Cognitive Biases in Clinical Decision-Making Through Multi-Agent Conversations Using Large Language Models: Simulation Study. Journal of Medical Internet Research, vol. 26, e59439. Published 2024-11-19. Journal Article. DOI: https://doi.org/10.2196/59439
Citations: 0