
BMJ Evidence-Based Medicine: Latest Articles

Expanded disease definitions in Alzheimer's disease and the new era of disease-modifying drugs.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2023-112588
Su Jin Yim, Sevil Yasar, Nancy Schoenborn, Eddy Lang
Citations: 0
Digital integration of research conduct into clinical care: results of the PROSPECTOR randomised feasibility study.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2024-113081
Matthew G Wilson, Folkert W Asselbergs, Nausheen Saleem, Lelia Jeilani, David Brealey, Matthew R Sydes, Steve Harris

Objectives: To evaluate the feasibility of conducting a clinically integrated randomised comparative effectiveness trial using digital clinical trial infrastructure within an electronic patient record (EPR).

Design: A mixed-methods, unblinded feasibility study of a digital clinical trial system, incorporating testing of two designs of electronic point-of-care randomisation prompt.

Setting: The study was conducted at University College London Hospitals NHS Trust between March and November 2022. The study used a real clinical research question for context, comparing liberal vs restrictive strategies for magnesium supplementation to prevent new-onset atrial fibrillation in critical care.

Participants: Adult patients undergoing elective, non-cardiac surgical procedures expecting postoperative admission to critical care were recruited.

Interventions: A digital trial system screened participants continuously against eligibility criteria. Participants were automatically randomised (1:1) to (1) a magnesium supplementation strategy and (2) one of two electronic randomisation prompt designs (nudge or preference). Electronic point-of-care randomisation prompts were displayed to clinicians at regular intervals, inviting them to follow a randomised magnesium supplementation suggestion.

Main outcome measures: The primary outcome measure was a composite determination of study design feasibility (including recruitment, technical performance and concordance between the randomised suggestion and the observed clinician action).

Results: 23 patients were recruited and 11 successfully randomised. The implemented digital systems for automated eligibility screening, randomisation, data collection and follow-up demonstrated technical feasibility. 47 electronic point-of-care randomisation prompts were successfully deployed across 11 patients. Clinician actions were concordant with randomised suggestions in 32 prompts (68%). Technical and implementational barriers to delivering the electronic point-of-care randomisation prompts were identified. Patients were followed up to 30 days following discharge from hospital, with no serious adverse events attributable to participation identified. There were insufficient data to make a quantitative determination on the superiority of either prompt design. Clinician feedback suggested the simplified design (nudge) had greater utility.

Conclusions: This study demonstrates that digitally embedding clinical trial infrastructure into a site-level EPR and integrating conduct into clinical care is safe and feasible. Future work will focus on improving and expanding the integrated digital trial design across multiple centres.

Trial registration number: NCT05149820.
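
The concordance figure reported in the Results (32 of 47 prompts, 68%) can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the Wilson 95% interval is not reported in the abstract. It is included as an assumption, simply to show how much uncertainty a sample of 47 prompts carries.

```python
import math


def concordance_rate(concordant: int, total: int) -> float:
    """Share of prompts where the clinician action matched the randomised suggestion."""
    if total <= 0 or not 0 <= concordant <= total:
        raise ValueError("need 0 <= concordant <= total and total > 0")
    return concordant / total


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p = concordance_rate(successes, total)
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 32 concordant prompts out of 47 deployed.
rate = concordance_rate(32, 47)
low, high = wilson_interval(32, 47)
print(f"concordance = {rate:.0%}")
# The interval below is an illustration, not a result from the study.
print(f"Wilson 95% interval (illustrative) = {low:.1%} to {high:.1%}")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves better for small samples; that choice is this sketch's, not the authors'.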

Citations: 0
Evidence categories in systematic assessment of cancer overdiagnosis.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2024-113529
Anton Barchuk, Niko K Nordlund, Alex L E Halme, Kari A O Tikkinen

The phenomenon of cancer overdiagnosis, the diagnosis of a malignant tumour that, without detection, would never lead to adverse health effects, has been reported for several cancer types in different populations. There has been an increase in studies focused on overdiagnosis, creating an opportunity to synthesise evidence on specific cancer types. However, studies that systematically assess evidence across different research domains remain scarce, with most of them relying on data from studies that already mentioned overdiagnosis as a potential concern. In this review, we consider several evidence categories that are used to systematically assess the presence and magnitude of overdiagnosis, including (1) data from cancer surveillance, (2) studies exploring the 'true' prevalence of cancer in the population, (3) studies that explore the use of diagnostics and its effect on incidence and mortality and (4) studies that explore changes and progress in cancer management and its effect on cancer mortality. This article highlights the strengths and weaknesses of different evidence categories, provides examples of studies on different cancer types and discusses how these categories can help synthesise evidence on cancer overdiagnosis.

Citations: 0
Identifying actionable statements in Chinese health guidelines: a cross-sectional study.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2024-113050
Xiangying Ren, Tamara Lotfi, Jiyu Chen, Yuling Lei, Chenyibei Zhou, Wei Zhang, Qiao Huang, Yongbo Wang, Siyu Yan, Shichun Wang, Siyuan Ruan, Wanru Wang, Qiyi Zhang, Xiaomei Yao, Yinghui Jin, Holger J Schuenemann

Objective: The purpose of this study is to validate the taxonomy and framework using Chinese guidelines and identify actionable statements.

Design and setting: We searched five databases to identify the health guidelines from 1 January 2020 to 1 May 2023. Five researchers categorised statements into six types: formal recommendations (Type I) with clear direction and strength, with explicit and direct evidence; good practice statements (GPS) (Type II), actionable in isolation with a significant benefit; remarks (Type III), an inseparable unit belonging to a formal recommendation or GPS that provides additional clarification; research-only recommendations (Type IV) for specific populations; implementation considerations, tools and tips (Type V) that describe the how, who, where, what and when in relation to implementing a recommendation and that lack a direct evidence link; and informal recommendations (Type VI), unrelated to evidence and not meeting GPS criteria.

Results: We included 116 guidelines: 74 Western medicine guidelines, 12 traditional Chinese medicine guidelines and 30 integrated Chinese and Western medicine guidelines. 99 guidelines (85.3%) used the Grading of Recommendations Assessment, Development and Evaluation criteria. Medical specialty societies developed the highest number of guidelines (53.4%). In total, 4422 statements were extracted from the guidelines. Among them, 2154 (48.7%) were formal recommendations, 197 (4.4%) were GPS, 394 (8.9%) were remarks, 16 (0.4%) were research-only recommendations, 1106 (25.0%) were implementation considerations, tools and tips, and 555 (12.6%) were informal recommendations.

Conclusions: To date, Chinese guideline developers tend to overestimate the number of formal recommendations and underestimate the number of GPS, remarks, research-only recommendations, implementation considerations, tools and tips, and informal recommendations. Thus, the current quality of actionable statements in Chinese health guidelines requires further enhancement.
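
The six statement types and the counts reported in the Results lend themselves to a small tabulation. The following Python sketch is an illustration only: the enum and function names are hypothetical, and only the counts come from the abstract. It checks that the six counts add up to the 4422 extracted statements and recomputes each share.

```python
from enum import Enum


class StatementType(Enum):
    """The six statement types named in the abstract (labels are the paper's)."""
    FORMAL_RECOMMENDATION = "Type I"
    GOOD_PRACTICE_STATEMENT = "Type II"
    REMARK = "Type III"
    RESEARCH_ONLY_RECOMMENDATION = "Type IV"
    IMPLEMENTATION_CONSIDERATION = "Type V"
    INFORMAL_RECOMMENDATION = "Type VI"


# Counts as reported in the Results paragraph of the abstract.
COUNTS = {
    StatementType.FORMAL_RECOMMENDATION: 2154,
    StatementType.GOOD_PRACTICE_STATEMENT: 197,
    StatementType.REMARK: 394,
    StatementType.RESEARCH_ONLY_RECOMMENDATION: 16,
    StatementType.IMPLEMENTATION_CONSIDERATION: 1106,
    StatementType.INFORMAL_RECOMMENDATION: 555,
}


def statement_shares(counts: dict[StatementType, int]) -> dict[StatementType, float]:
    """Return each type's share of all statements, in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no statements to tabulate")
    return {kind: 100 * n / total for kind, n in counts.items()}


total = sum(COUNTS.values())
print(f"total statements: {total}")
for kind, share in statement_shares(COUNTS).items():
    print(f"{kind.value:8s} {COUNTS[kind]:5d}  {share:.2f}%")
```

Recomputed this way, the shares agree with the abstract to one decimal place except for GPS, which comes out at about 4.45%; the reported 4.4% may reflect truncation rather than rounding.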

Citations: 0
Facilitating GRADE judgements about the inconsistency of effects using a novel visualisation approach.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2024-113038
Mohammad Hassan Murad, Zhen Wang, Yngve Falck-Ytter
Citations: 0
Novel AI applications in systematic review: GPT-4 assisted data extraction, analysis, review of bias.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2024-113066
Jin Kyu Kim, Michael Erlano Chua, Tian Ge Li, Mandy Rickard, Armando J Lorenzo

Objective: To assess custom GPT-4 performance in extracting and evaluating data from medical literature to assist in the systematic review (SR) process.

Design: A proof-of-concept comparative study was conducted to assess the accuracy and precision of custom GPT-4 models against human-performed reviews of randomised controlled trials (RCTs).

Setting: Four custom GPT-4 models were developed, each specialising in one of the following areas: (1) extraction of study characteristics, (2) extraction of outcomes, (3) extraction of bias assessment domains and (4) evaluation of risk of bias using results from the third GPT-4 model. Model outputs were compared against data from four SRs conducted by human authors. The evaluation focused on accuracy in data extraction, precision in replicating outcomes and agreement levels in risk of bias assessments.

Participants: Among four SRs chosen, 43 studies were retrieved for data extraction evaluation. Additionally, 17 RCTs were selected for comparison of risk of bias assessments, where both human comparator SRs and an analogous SR provided assessments for comparison.

Intervention: Custom GPT-4 models were deployed to extract data and evaluate risk of bias from selected studies, and their outputs were compared to those generated by human reviewers.

Main outcome measures: Concordance rates between GPT-4 outputs and human-performed SRs in data extraction, effect size comparability and inter/intra-rater agreement in risk of bias assessments.

Results: When comparing the automatically extracted data to the first table of study characteristics from the published review, GPT-4 showed 88.6% concordance with the original review, with <5% discrepancies due to inaccuracies or omissions. It exceeded human accuracy in 2.5% of instances. Study outcomes were extracted and pooling of results showed comparable effect sizes to comparator SRs. A review of bias assessment using GPT-4 showed fair-to-moderate but significant intra-rater agreement (ICC=0.518, p<0.001) and inter-rater agreement with both the human comparator SR (weighted kappa=0.237) and the analogous SR (weighted kappa=0.296). In contrast, there was poor agreement between the two human-performed SRs (weighted kappa=0.094).

Conclusion: Customized GPT-4 models perform well in extracting precise data from medical literature with potential for utilization in review of bias. While the evaluated tasks are simpler than the broader range of SR methodologies, they provide an important initial assessment of GPT-4's capabilities.
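
The abstract reports weighted kappa values for agreement on risk of bias judgements without stating the weighting scheme. As a rough illustration of the statistic, the Python sketch below computes a linear-weighted Cohen's kappa for two raters. The function name, the three ordered categories and the example ratings are all hypothetical and are not data from the study.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories):
    """Linear-weighted Cohen's kappa for two raters over ordered categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint counts of (rater A category, rater B category) and each rater's marginals.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 for the extremes
            obs_disagreement += weight * observed[(i, j)] / n
            exp_disagreement += weight * marg_a[i] * marg_b[j] / (n * n)

    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - obs_disagreement / exp_disagreement


# Hypothetical risk of bias ratings for six trials (not from the study).
categories = ["low", "some concerns", "high"]
model_ratings = ["low", "low", "some concerns", "high", "high", "low"]
human_ratings = ["low", "some concerns", "some concerns", "high", "low", "low"]
print(round(weighted_kappa(model_ratings, human_ratings, categories), 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is the scale on which the reported values of 0.094 to 0.296 should be read.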

Citations: 0
Top 15 Choosing Wisely international campaign recommendations to reduce low-value care.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2025-113804
Wendy Levinson, Karen Born, Juan Victor Ariel Franco, Karin Silvana Kopitowski
Citations: 0
Implementing AI models in clinical workflows: a roadmap.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2023-112727
Fei Wang, Ashley Beecy
Citations: 0
Correction: Curcumin and proton pump inhibitors for functional dyspepsia: a randomised, double blind controlled trial.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-22 DOI: 10.1136/bmjebm-2022-112231corr1
Citations: 0
Hidden risks of predictive models in healthcare.
IF 7.6 Zone 3 (Medicine) Q1 MEDICINE, GENERAL & INTERNAL Pub Date: 2025-09-17 DOI: 10.1136/bmjebm-2025-113730
Joseph Alderman, Richard Riley, Dhruv Parekh, Charlotte Summers, Xiaoxuan Liu, Alastair Denniston
Citations: 0