
Latest publications from Cochrane Evidence Synthesis and Methods

Enhancing nursing and other healthcare professionals' knowledge of childhood sexual abuse through self-assessment: A realist review
Pub Date : 2025-07-23 DOI: 10.1002/cesm.70019
Dr. Olumide Adisa, Ms. Katie Tyrrell, Dr. Katherine Allen

Aim

To explore how child sexual abuse/exploitation (CSA/E) self-assessment tools are being used to enhance healthcare professionals' knowledge and confidence.

Background

Child sexual abuse/exploitation is common and associated with lifelong health impacts. In particular, nurses are well-placed to facilitate disclosures by adult survivors of child sexual abuse/exploitation and promote timely access to support. However, research shows that many are reluctant to enquire about abuse and feel underprepared for disclosures. Self-assessment provides a participatory method for evaluating competencies and identifying areas that need improvement.

Evaluation

Researchers adopted a realist synthesis approach, searching relevant databases for healthcare professionals' self-assessment tools/protocols relevant to adult survivors. In total, researchers reviewed 247 full-text articles. Twenty-five items met the criteria for data extraction, and relevant contexts (C), mechanisms (M), and outcomes (O) were identified and mapped. Eight of these were included in the final synthesis, based on papers that identified two key ‘families’ of abuse-related self-assessment interventions for healthcare contexts: PREMIS, a validated survey instrument to assess HCP knowledge, confidence, and practice regarding domestic violence and abuse (DVA); and trauma-informed practice/care (TIP/C) organisational self-assessment protocols. Two revised programme theories were formulated: (1) individual self-assessment can promote organisational accountability; and (2) organisational self-assessment can increase the coherence and sustainability of changes in practice.

Conclusions

There is a lack of self-assessment tools/protocols designed to improve healthcare professionals' knowledge and confidence. Our review contributes to the evidence base on improving healthcare responses to CSA/E survivors, illustrating that self-assessment tools or protocols designed to improve HCP responses to adult survivors of CSA/E remain underdeveloped and under-studied. Refined programme theories developed during synthesis regarding DVA and TIP/C-related tools or protocols suggest areas for CSA/E-specific future research with stakeholders and service users.

Using Artificial Intelligence Tools as Second Reviewers for Data Extraction in Systematic Reviews: A Performance Comparison of Two AI Tools Against Human Reviewers
Pub Date : 2025-07-14 DOI: 10.1002/cesm.70036
T. Helms Andersen, T. M. Marcussen, A. D. Termannsen, T. W. H. Lawaetz, O. Nørgaard

Background

Systematic reviews are essential but time-consuming and expensive. Large language models (LLMs) and artificial intelligence (AI) tools could potentially automate data extraction, but no comprehensive workflow has been tested for different review types.

Objective

To evaluate Elicit's and ChatGPT's abilities to extract data from journal articles as a replacement for one of two human data extractors in systematic reviews.

Methods

Human-extracted data from three systematic reviews (30 articles in total) were compared to data extracted by Elicit and ChatGPT. The AI tools extracted population characteristics, study design, and review-specific variables. Performance metrics were calculated against human double-extracted data as the gold standard, followed by a detailed error analysis.
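The precision, recall, and F1 metrics reported below can be computed from agreement counts against the gold-standard extraction. A minimal sketch, with hypothetical counts rather than the study's data:

```python
# Sketch: precision/recall/F1 for data-extraction agreement, treating human
# double-extracted data as the gold standard. Counts here are hypothetical.

def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# Example: 92 correctly extracted items, 8 spurious additions, 8 missed items
p, r, f1 = precision_recall_f1(tp=92, fp=8, fn=8)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")  # precision=0.92 recall=0.92 F1=0.92
```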

Results

Precision, recall, and F1-score were all 92% for Elicit, and 91%, 89%, and 90%, respectively, for ChatGPT. Recall was highest for study design (Elicit: 100%; ChatGPT: 90%) and population characteristics (Elicit: 100%; ChatGPT: 97%), while review-specific variables achieved 77% in Elicit and 80% in ChatGPT. Elicit had four instances of confabulation, while ChatGPT had three. There was no significant difference between the two AI tools' performance (recall difference: 3.3 percentage points, 95% CI: -5.2% to 11.9%, p = 0.445).

Conclusion

AI tools demonstrated high and similar performance in data extraction compared to human reviewers, particularly for standardized variables. Error analysis revealed confabulations in 4% of data points. We propose adopting AI-assisted extraction to replace the second human extractor, with the second human instead focusing on reconciling discrepancies between AI and the primary human extractor.
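The proposed workflow, in which the second human focuses on reconciling AI-vs-human discrepancies, amounts to a field-by-field comparison of two extraction records. A minimal sketch; the field names and values are illustrative, not from the study:

```python
# Sketch of the proposed workflow: the AI acts as second extractor and a human
# reconciles only the fields where the two extractions disagree.
# Field names and values below are hypothetical.

def discrepancies(human: dict, ai: dict) -> dict:
    """Return fields where the two extractions disagree (including missing fields)."""
    fields = human.keys() | ai.keys()
    return {f: (human.get(f), ai.get(f)) for f in fields if human.get(f) != ai.get(f)}

human_extraction = {"design": "RCT", "n": 120, "population": "adults with T2D"}
ai_extraction = {"design": "RCT", "n": 118, "population": "adults with T2D"}

for field, (h, a) in discrepancies(human_extraction, ai_extraction).items():
    print(f"reconcile {field!r}: human={h!r} vs AI={a!r}")
```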

Creating Interactive Data Dashboards for Evidence Syntheses
Pub Date : 2025-06-25 DOI: 10.1002/cesm.70035
Leslie A. Perdue, Shaina D. Trevino, Sean Grant, Jennifer S. Lin, Emily E. Tanner-Smith

Systematic review findings are typically disseminated via static outputs, such as scientific manuscripts, which can limit the accessibility and usability for diverse audiences. Interactive data dashboards transform systematic review data into dynamic, user-friendly visualizations, allowing deeper engagement with evidence synthesis findings. We propose a workflow for creating interactive dashboards to display evidence synthesis results, including three key phases: planning, development, and deployment. Planning involves defining the dashboard objectives and key audiences, selecting the appropriate software (e.g., Tableau or R Shiny), and preparing the data. Development includes designing a user-friendly interface and specifying interactive elements. Lastly, deployment focuses on making the dashboard available to users and conducting user testing. Throughout all phases, we emphasize seeking and incorporating interest-holder input and aligning dashboards with the intended audience's needs. To demonstrate this workflow, we provide two examples from previous systematic reviews. The first dashboard, created in Tableau, presents findings from a meta-analysis to support a U.S. Preventive Services Task Force recommendation on lipid disorder screening in children, while the second utilizes R Shiny to display data from a scoping review on the 4-day school week among K-12 students in the U.S. Both dashboards incorporate interactive elements to present complex evidence tailored to different interest-holders, including non-research audiences. Interactive dashboards can enhance the utility of evidence syntheses by providing a user-friendly tool for interest-holders to explore data relevant to their specific needs. This workflow can be adapted to create interactive dashboards in flexible formats to increase the use and accessibility of systematic review findings.

Data Extractions Using a Large Language Model (Elicit) and Human Reviewers in Randomized Controlled Trials: A Systematic Comparison
Pub Date : 2025-06-08 DOI: 10.1002/cesm.70033
Joleen Bianchi, Julian Hirt, Magdalena Vogt, Janine Vetsch

Aim

We aimed to compare data extractions from randomized controlled trials performed by Elicit and by human reviewers.

Background

Elicit is an artificial intelligence tool that may automate specific steps in conducting systematic reviews. However, the tool's performance and accuracy have not been independently assessed.

Methods

For comparison, we sampled 20 randomized controlled trials whose data had been extracted manually by a human reviewer. We assessed the following variables: study objectives, sample characteristics and size, study design, interventions, outcomes measured, and intervention effects, and classified the results as “more,” “equal to,” “partially equal,” or “deviating” extractions. The STROBE checklist was used to report the study.

Results

We analysed 20 randomized controlled trials from 11 countries. The studies covered diverse healthcare topics. Across all seven variables, Elicit extracted “more” data in 29.3% of cases, “equal” in 20.7%, “partially equal” in 45.7%, and “deviating” in 4.3%. Elicit provided “more” information for the variable study design (100%) and sample characteristics (45%). In contrast, for more nuanced variables, such as “intervention effects,” Elicit's extractions were less detailed, with 95% rated as “partially equal.”
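As a quick arithmetic check, the reported shares are consistent with 140 comparisons (7 variables × 20 trials). The underlying counts below (41, 29, 64, 6) are inferred assumptions that reproduce the percentages; the abstract reports percentages only:

```python
# Consistency check for the reported shares across 7 variables x 20 trials.
# The counts (41, 29, 64, 6) are inferred, not stated in the abstract.
counts = {"more": 41, "equal": 29, "partially equal": 64, "deviating": 6}
total = 7 * 20  # 140 variable-by-trial comparisons

assert sum(counts.values()) == total
for label, n in counts.items():
    print(f"{label}: {100 * n / total:.1f}%")
# more: 29.3%, equal: 20.7%, partially equal: 45.7%, deviating: 4.3%
```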

Conclusions

Elicit was able to extract partly correct data for our predefined variables. Variables like “intervention effect” or “intervention” may require a human reviewer to complete the data extraction. Our results suggest that verification by human reviewers is necessary to ensure that all relevant information is captured completely and correctly by Elicit.

Implications

Systematic reviews are labor-intensive. The data extraction process may be facilitated by artificial intelligence tools. Using Elicit may require a human reviewer to double-check the extracted data.

Using GPT-4 for Title and Abstract Screening in a Literature Review of Public Policies: A Feasibility Study
Pub Date : 2025-05-22 DOI: 10.1002/cesm.70031
Max Rubinstein, Sean Grant, Beth Ann Griffin, Seema Choksy Pessar, Bradley D. Stein

Introduction

We describe the first known use of large language models (LLMs) to screen titles and abstracts in a review of public policy literature. Our objective was to assess the percentage of articles GPT-4 recommended for exclusion that should have been included (“false exclusion rate”).

Methods

We used GPT-4 to exclude articles from a database for a literature review of quantitative evaluations of federal and state policies addressing the opioid crisis. We exported our bibliographic database to a CSV file containing titles, abstracts, and keywords and asked GPT-4 to recommend whether to exclude each article. We conducted preliminary testing of these recommendations using a subset of articles and a final test on a sample of the entire database. We designated a false exclusion rate of 10% as an adequate performance threshold.
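The screening step described above can be sketched as a loop over the CSV export that collects exclusion recommendations. In this sketch, `ask_llm` is a hypothetical stand-in for the GPT-4 call, stubbed here with a keyword rule; the rows, rule, and function name are illustrative, not the study's prompt or data:

```python
# Sketch of the screening loop: read title/abstract/keyword rows from a CSV
# export and collect exclusion recommendations. `ask_llm` is a hypothetical
# placeholder for an LLM call, stubbed with a simple keyword rule.
import csv
import io

def ask_llm(title: str, abstract: str, keywords: str) -> bool:
    """Return True if the article should be excluded (stub for an LLM call)."""
    text = f"{title} {abstract} {keywords}".lower()
    return "opioid" not in text  # placeholder rule, not the study's prompt

rows = io.StringIO(
    "title,abstract,keywords\n"
    "State naloxone laws,Evaluation of opioid policy,policy\n"
    "Knee surgery trial,An orthopedic RCT,surgery\n"
)
excluded = [row["title"] for row in csv.DictReader(rows) if ask_llm(**row)]
print(excluded)  # ['Knee surgery trial']
```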

Results

GPT-4 recommended excluding 41,742 of the 43,480 articles (96%) containing an abstract. Our preliminary test identified only one false exclusion; our final test identified no false exclusions, yielding an estimated false exclusion rate of 0.00 [0.00, 0.05]. Fewer than 1% (417 of the 41,742 articles) were incorrectly excluded. After manually assessing the eligibility of all remaining articles, we identified 608 of the 1738 articles that GPT-4 did not exclude as eligible: 65% of the articles recommended for inclusion should have been excluded.
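An interval like [0.00, 0.05] for zero observed false exclusions is what the exact (Clopper-Pearson) method produces, which for zero events has a closed form. A minimal sketch; the verification sample size `n` below is a hypothetical value, since the abstract does not report it:

```python
# Sketch: exact (Clopper-Pearson) 95% CI for a false exclusion rate when zero
# false exclusions are observed in a verification sample of size n.
# With x = 0 the two-sided interval is [0, 1 - (alpha/2)**(1/n)].

def zero_event_upper_bound(n: int, alpha: float = 0.05) -> float:
    """Upper limit of the two-sided exact CI when 0 events are seen in n trials."""
    return 1.0 - (alpha / 2) ** (1.0 / n)

n = 72  # hypothetical verification sample size, not reported in the abstract
print(f"0.00 [0.00, {zero_event_upper_bound(n):.2f}]")  # 0.00 [0.00, 0.05]
```

Larger verification samples tighten the upper bound, which is why the estimated rate can be reported with a narrow interval despite zero observed events.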

Discussion/Conclusions

GPT-4 performed well at recommending articles to exclude from our literature review, resulting in substantial time and cost savings. A key limitation is that we did not use GPT-4 to determine inclusions, nor did our model perform well on this task. However, GPT-4 dramatically reduced the number of articles requiring review. Systematic reviewers should conduct performance evaluations to ensure that an LLM meets a minimally acceptable quality standard before relying on its recommendations.

Artificial Intelligence and Machine Learning to Improve Evidence Synthesis Production Efficiency: An Observational Study of Resource Use and Time-to-Completion
Pub Date : 2025-05-19 DOI: 10.1002/cesm.70030
Christopher James Rose, Jose Francisco Meneses-Echavez, Ashley Elizabeth Muller, Rigmor C. Berg, Tiril C. Borge, Patricia Sofia Jacobsen Jardim, Chris Cooper

Introduction

Evidence syntheses are crucial in healthcare and elsewhere but are resource-intensive, often taking years to produce. Artificial intelligence and machine learning (AI/ML) tools may improve production efficiency in certain review phases, but little is known about their impact on entire reviews.

Methods

We performed prespecified analyses of a convenience sample of eligible healthcare- or welfare-related reviews commissioned at the Norwegian Institute of Public Health between August 1 2020 (first commission to use AI/ML) and January 31 2023 (administrative cut-off). The main exposures were AI/ML use following an internal support team's recommendation versus no use. Ranking (e.g., priority screening), classification (e.g., study design), clustering (e.g., documents), and bibliometric analysis (e.g., OpenAlex) tools were included, but we did not include or exclude specific tools. Generative AI tools were not widely available during the study period. The outcomes were resources (person-hours) and time from commission to completion (approval for delivery, including peer review; weeks). Analyses accounted for nonrandomized assignment and censored outcomes (reviews ongoing at cut-off). Researchers classifying exposures were blinded to outcomes. The statistician was blinded to exposure.

Results

Among 39 reviews, 7 (18%) were health technology assessments (vs. systematic reviews), 19 (49%) focused on healthcare (vs. welfare), 18 (46%) planned meta-analysis, and 3 (8%) were ongoing at cut-off. AI/ML tools were used in 27 (69%) reviews. Reviews that used AI/ML as recommended used more resources (mean 667 vs. 291 person-hours) but were completed slightly faster (27.6 vs. 28.2 weeks). These differences were not statistically significant (relative resource use: 3.71; 95% CI: 0.36–37.95; p = 0.269; relative time-to-completion: 0.92; 95% CI: 0.53–1.58; p = 0.753).
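Comparisons like the recall difference above (a point estimate with a 95% CI and two-sided p-value) can be sketched with a standard two-proportion z-test. The counts below are hypothetical and do not reproduce any study's figures:

```python
# Sketch: normal-approximation CI and pooled z-test for a difference between
# two proportions (e.g., recall of two tools). Counts are hypothetical.
import math

def two_proportion_test(x1: int, n1: int, x2: int, n2: int, z: float = 1.96):
    """Return (p1 - p2, 95% CI, two-sided p-value) for two binomial proportions."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z * se, diff + z * se)
    pooled = (x1 + x2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # Two-sided p-value from the standard normal CDF: 2 * (1 - Phi(|z|))
    p_value = 1 - math.erf(abs(diff / se0) / math.sqrt(2))
    return diff, ci, p_value

diff, ci, p = two_proportion_test(x1=110, n1=120, x2=106, n2=120)
print(f"diff={diff:.3f} CI=({ci[0]:.3f}, {ci[1]:.3f}) p={p:.3f}")
```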

Conclusions

Associations between AI/ML use and the outcomes remain uncertain. Multicenter studies or meta-analyses may be needed to determine whether these tools meaningfully reduce resource use and time to produce evidence syntheses.
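The abstract's headline comparison (mean 667 vs. 291 person-hours, relative resource use 3.71) can be sketched with a percentile bootstrap on the ratio of group means. This is a deliberately simplified illustration on invented data: the study itself modelled nonrandomized assignment and censored outcomes, which a naive bootstrap ignores, and the simulated values only mimic the reported group sizes and means.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical person-hour data for illustration only; the study's real data
# are not public. Gamma draws roughly mirror the reported group means
# (~667 h with AI/ML across 27 reviews vs. ~291 h without across 12).
ai_ml = rng.gamma(shape=2.0, scale=333.5, size=27)
no_ai = rng.gamma(shape=2.0, scale=145.5, size=12)

def ratio_of_means(a, b):
    """Relative resource use: mean person-hours in group a over group b."""
    return a.mean() / b.mean()

# Percentile bootstrap: resample each group with replacement and
# recompute the ratio to get an approximate 95% interval.
boots = []
for _ in range(5000):
    boots.append(ratio_of_means(rng.choice(ai_ml, ai_ml.size, replace=True),
                                rng.choice(no_ai, no_ai.size, replace=True)))
lo, hi = np.percentile(boots, [2.5, 97.5])

print(f"relative resource use ≈ {ratio_of_means(ai_ml, no_ai):.2f} "
      f"(95% CI {lo:.2f} to {hi:.2f})")
```

A bootstrap interval on a ratio is easy to read but, unlike the study's analysis, does not adjust for confounding or censoring, which is one reason the published interval is so wide.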

Information Practice as Dialogue: The Case for Collaboration in Evidence Searching and Finding for More Complex Reviews
Pub Date : 2025-04-28 DOI: 10.1002/cesm.70029
Parkhill Anne, Merner Bronwen, Ryan Rebecca
Cochrane Consumers and Communication Group's (CCC) approach to evidence searching has evolved over time in the context of Cochrane's rigorous methodological advice [1, 2]. CCC is a Cochrane review group responsible for coordinating the preparation and publication of evidence syntheses that affect the way people interact with healthcare professionals, services and researchers. CCC includes a highly skilled Information Specialist who collaborates with CCC author teams to design a rigorous search strategy to gather evidence to answer the review question. In this commentary, we discuss the transformation of the information practice of searching in CCC from a largely technical exercise conducted solely by the Information Specialist to a collaborative dialogue between the Information Specialist and author teams.

A key reason for the transformation in our search methods has been that CCC reviews tend to be complex, with review questions that are generally not as easily answered as those of clinically focused reviews. Our research, and information practice specifically, is contextualized and guided by a three-way dynamic of patient preferences and experiences, research evidence, and professional expertise. The reviews are rigorous in their examination of evidence on people's healthcare interactions, including how people self-manage health and disease, understand screening, health and treatment, and negotiate and share decisions with healthcare professionals within systems and different settings. However, interventions to change behaviors and to educate, support and up-skill people to participate actively in their healthcare are often complex and multifaceted, and their effects are evaluated via multiple diverse outcomes [3]. This complexity necessarily shapes our methods of information practice.

Early in the life of CCC, and for many years, we viewed searching as a largely solitary technical exercise performed by a skilled Information Specialist following conventional, rigorous Cochrane search methods. Often this required labor-intensive search development, resulting in delays for search results and an excessive screening obligation (e.g., some review questions resulted in authors needing to screen more than 25,000 search results). As the volume and complexity of literature in the health communication area increased, we moved towards search strategies developed with the practicalities of reference screening in mind [4, 5]. We have since developed transparent and pragmatic search strategies by means of embedded and open dialogue [6] with authors. In the context of increasing topic complexity and rigorous information searching, this approach maximizes identification of relevant references while avoiding unmanageable reference numbers for screening.

In this commentary, we explore CCC's approach to searching and its evolution over time in the context of Cochrane's rigorous methodological advice. We illustrate different approaches to striking a balance between rigor and the practical demands of review production, and we discuss two recent CCC reviews [7, 8] that demonstrate the development of our current practice.

Over the past 30 years, the development of evidence-based practice (EBP) processes and methods has added nuance and depth to healthcare's collective knowledge. To keep EBP methods both gold-standard and practical, researchers and information specialists have added procedural complexity at every step of the EBP chain. Information specialists bring unique skills to the field of evidence synthesis, and the value of their contribution to identifying and managing large and diverse sources of evidence is increasingly recognized [14, 15]. Searching, our example here, has shifted from a model of basic PICO frameworks run mainly in databases to increasingly sophisticated choices among evidence sources and search terms to describe and inform the concepts contained in the search question. Searching now calls for dialogue with stakeholders. This can be managed by the Information Specialist, whose task is to balance often complex concepts against the time and effort needed to resolve an answerable search question. To enable this dialogue, the Information Specialist must be integrated and embedded in the review production process [14, 16]. We have found that only through iterative testing and dialogue can we strike an effective balance between the art and science of evidence retrieval, yielding rigorous yet manageable search results.

Author contributions: Parkhill Anne: conceptualization, investigation, writing - original draft, writing - review and editing, project administration, funding acquisition. Merner Bronwen: conceptualization, writing - original draft, writing - review and editing, project administration, investigation, funding acquisition. Ryan Rebecca: conceptualization, investigation, writing - original draft, writing - review and editing, project administration, supervision, funding acquisition. The authors declare no conflicts of interest.
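The trade-off this commentary describes, maximizing identification of relevant references while keeping screening workloads manageable, is often quantified in information-retrieval terms such as precision and number needed to read (NNR, the reciprocal of precision). The figures below are invented for illustration; only the >25,000-record screening workload comes from the text.

```python
# Hypothetical screening figures; the commentary reports only that some
# review questions yielded more than 25,000 records to screen.
records_screened = 25_000     # search results returned by the strategy
relevant_found = 125          # records ultimately judged relevant

precision = relevant_found / records_screened
nnr = 1 / precision           # records screened per relevant record found

print(f"precision = {precision:.3%}, NNR = {nnr:.0f}")

# A revised, more pragmatic strategy aims to raise precision (lower NNR)
# without sacrificing recall of relevant studies.
```

On these invented numbers the team would screen roughly 200 records for every relevant one found, which makes concrete why search development through dialogue and iteration matters.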
Assessing the reporting quality of published qualitative evidence syntheses in the Cochrane Library
Pub Date : 2025-04-15 DOI: 10.1002/cesm.70023
Martina Giltenane, Aoife O'Mahony, Mayara S. Bianchim, Andrew Booth, Angela Harden, Catherine Houghton, Emma F. France, Heather Ames, Kate Flemming, Katy Sutcliffe, Ruth Garside, Tomas Pantoja, Jane Noyes

Background

In the more than ten years since the first qualitative evidence synthesis (QES) was published in the Cochrane Library, QES and mixed-methods reviews (MMR) with a qualitative component have become increasingly common and influential in healthcare research and policy development. The quality of such reviews, and the completeness with which they are reported, is therefore of paramount importance.

Aim

This review aimed to assess the reporting quality of published QESs and MMRs with a qualitative component in the Cochrane Library.

Methods

All published QESs and MMRs were identified from the Cochrane Library. A bespoke framework developed by key international experts based on the Effective Practice and Organisation of Care (EPOC), Enhancing Transparency in Reporting the Synthesis of Qualitative Research (ENTREQ) and meta-ethnography reporting guidance (eMERGe) was used to code the quality of reporting of QESs and MMRs.

Results

Thirty-one reviews were identified, including 11 MMRs. The reporting quality of the QESs and MMRs published by Cochrane varied considerably. Based on the criteria within our framework, just over a quarter (8, 26%) were considered to meet satisfactory reporting standards, 10 (32%) could have provided clearer or more detailed descriptions in their reporting, just over a quarter (8, 26%) provided poor quality or insufficient descriptions and five (16%) omitted descriptions relevant to our framework.

Conclusion

This assessment offers important insights into the reporting practices prevalent in these review types. Methodology and reporting have changed considerably over time. Earlier QES have not necessarily omitted important reporting components, but rather our understanding of what should be completed and reported has grown considerably. The variability in reporting quality within QESs and MMRs underscores the need to develop Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) specifically for QES.
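The category counts reported in the Results can be cross-checked against the 31 included reviews; a minimal tally (counts taken directly from the abstract) reproduces the stated percentages:

```python
# Reporting-quality categories and counts from the Results section (n = 31).
categories = {
    "satisfactory reporting": 8,
    "could be clearer or more detailed": 10,
    "poor quality or insufficient descriptions": 8,
    "omitted relevant descriptions": 5,
}

total = sum(categories.values())
assert total == 31  # sanity check against the stated number of reviews

for label, n in categories.items():
    print(f"{label}: {n}/{total} ({n / total:.0%})")
```

The rounded percentages (26%, 32%, 26%, 16%) match those reported in the abstract, confirming the four categories partition all 31 reviews.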

Should we adopt the case report format to report challenges in complicated evidence synthesis? A proposal and illustration of a case report of a complex search strategy for humanitarian interventions
Pub Date : 2025-04-13 DOI: 10.1002/cesm.70021
Chris Cooper, Zahra Premji, Cem Yavuz, Mark Engelbert

Case reports represent a form of evidence in medicine: they detail an unusual or novel clinical case in a short, published report, disseminated for the attention of clinical staff. This form of report is not common outside of clinical practice. We question whether adopting the 'case report' might also be useful in evidence synthesis. This is where the case represents a challenge in undertaking evidence synthesis, and the report details not only the resolution but also shows the working used to resolve the challenge. Our rationale is that methodological responses to problems arising in complicated evidence synthesis often go unreported. The risk is that lessons learned in developing evidence syntheses are lost if not recorded. This represents a form of research waste. We suggest that the adoption of the case report format might represent the opportunity to highlight not only a challenge (the case) but also a worked example of a possible solution (the report). These case reports would represent a resting place for the case, with notes left behind for future researchers to follow. We provide an example of a case report: a complicated search strategy developed to inform an evidence gap map on the effects of interventions in humanitarian settings on food security outcomes in low- and middle-income countries and specific high-income countries. Our report details the solution that we developed (the search strategy). We also illustrate how we conceptualised the search, the approaches that we tested but rejected, and the ideas that we pursued.

A New Process Model of Study Identification Specific to the Identification of Randomised Studies for Systematic Reviews of Medical Interventions
Pub Date : 2025-04-13 DOI: 10.1002/cesm.70026
Chris Cooper, Zahra Premji, Christine Worsley, Eve Tomlinson, Sarah Dawson, Emma Prentice

Background

Recent work has illustrated that the same process of study identification is used in systematic reviews irrespective of the studies or data needs required for synthesis. We question if different review types should have their own specific models of study identification, to ensure the appropriate and timely identification of studies/study reports and to minimise research waste.

Objective

In this paper, we aim to:

1. illustrate and report a new process model to identify randomised studies for systematic reviews of medical interventions; and

2. situate the model in context of current practice using a worked example from a recent systematic review.

Method

Our model splits the identification of studies from the identification of study reports by searching in distinct phases. It begins with searches of trials registry resources to identify studies, followed by searches of bibliographic databases to identify study reports or unregistered studies. Supplementary search methods are then used to identify unpublished studies. The model includes the possibility of secondary searches, and we consider the role of update searches.

Conclusion

A case study illustrates the application of the method alongside operational guidance.
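The phased model in the Method can be sketched as a small pipeline, using invented registry IDs and DOIs: phase 1 registry searches identify studies, phase 2 database searches attach study reports (or surface unregistered studies), and phase 3 supplementary searches surface studies with no published reports. Keying records on a study identifier deduplicates across phases.

```python
from dataclasses import dataclass, field

@dataclass
class Study:
    study_id: str                        # e.g., a trial registry number
    reports: set = field(default_factory=set)

def run_phased_search():
    """Toy sketch of the phased model; all records are invented."""
    studies = {}

    # Phase 1: trials registry search identifies *studies*.
    for reg_id in ["NCT001", "NCT002", "NCT003"]:
        studies[reg_id] = Study(reg_id)

    # Phase 2: bibliographic databases identify *study reports*,
    # linked back to registered studies, or unregistered studies.
    for reg_id, report in [("NCT001", "doi:10.1000/a"),
                           ("NCT002", "doi:10.1000/b"),
                           ("UNREG-1", "doi:10.1000/c")]:
        studies.setdefault(reg_id, Study(reg_id)).reports.add(report)

    # Phase 3: supplementary methods surface unpublished studies.
    studies.setdefault("NCT004", Study("NCT004"))  # registered, no report yet

    return studies

found = run_phased_search()
unpublished = [s.study_id for s in found.values() if not s.reports]
print(f"{len(found)} studies; unpublished/awaiting reports: {unpublished}")
```

Separating the study record from its reports makes the later phases (secondary and update searches) incremental: new reports attach to known studies instead of creating duplicate entries for screening.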
