A Hybrid Semi-Automated Workflow for Systematic and Literature Review Processes with Large Language Model Analysis

Future Internet (IF 2.8; JCR Q2, Computer Science, Information Systems) Pub Date: 2024-05-12 DOI: 10.3390/fi16050167

Anjia Ye, Ananda Maiti, Matthew Schmidt, Scott Pedersen

Citations: 0

Abstract

Systematic reviews (SRs) are a rigorous method for synthesizing empirical evidence to answer specific research questions. However, they are labor-intensive because of their collaborative nature, strict protocols, and the typically large number of documents involved. Large language models (LLMs) and their applications, such as GPT-4/ChatGPT, have the potential to reduce the human workload of the SR process while maintaining accuracy. We propose a new hybrid methodology that combines the strengths of LLMs and humans: the LLM autonomously summarizes large bodies of text and extracts key information, and a researcher then uses that output to make inclusion/exclusion decisions quickly. This process replaces the manually performed title/abstract screening, full-text screening, and data extraction steps of a typical SR while keeping a human in the loop for quality control. We developed a semi-automated LLM-assisted (Gemini-Pro) workflow with a novel prompt development strategy, which involves extracting three categories of information from the formatted documents: identifier, verifier, and data field (IVD). We present a case study in which our hybrid approach reduced errors compared with a human-only SR. The hybrid workflow improved the accuracy of the case study by identifying 6 of 390 articles (about 1.5%) that had been misclassified by the human-only process, and it matched the human-only decisions on all of the remaining 384 articles. Given the rapid advances in LLM technology, these results are expected to improve over time.
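The abstract names the three IVD categories but does not give the prompt wording or the record layout, so the following Python sketch is only an illustration under stated assumptions. The class `IVDRecord`, the functions `build_ivd_prompt` and `compare_decisions`, the prompt text, and the example field names are all hypothetical and are not taken from the paper. No LLM is called; the sketch only shows how an IVD-style prompt might be assembled and how hybrid decisions could be checked against human-only decisions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IVDRecord:
    """Hypothetical container for one article's extracted information."""
    identifier: dict[str, str]   # says which article this is (e.g. title)
    verifier: dict[str, str]     # answers a reviewer checks against inclusion criteria
    data_fields: dict[str, str] = field(default_factory=dict)  # values kept for synthesis


def build_ivd_prompt(document_text: str,
                     verifier_questions: list[str],
                     data_field_names: list[str]) -> str:
    """Assemble an illustrative extraction prompt (wording is an assumption)."""
    lines = [
        "Extract the following from the document below.",
        "Identifier: title and authors.",
        "Verifier: answer each question briefly:",
    ]
    lines += [f"- {question}" for question in verifier_questions]
    lines.append("Data fields: report each value, or 'not stated':")
    lines += [f"- {name}" for name in data_field_names]
    lines += ["", "Document:", document_text]
    return "\n".join(lines)


def compare_decisions(human: dict[str, bool],
                      hybrid: dict[str, bool]) -> tuple[int, int, float]:
    """Return (agreements, disagreements, disagreement rate in percent)."""
    disagreements = sum(1 for key in human if human[key] != hybrid[key])
    total = len(human)
    return total - disagreements, disagreements, 100 * disagreements / total


if __name__ == "__main__":
    # Synthetic decisions with the same counts as the case study:
    # 390 articles, 6 of which differ between the two processes.
    human = {f"article-{i}": True for i in range(390)}
    hybrid = dict(human)
    for i in range(6):
        hybrid[f"article-{i}"] = False
    agree, differ, rate = compare_decisions(human, hybrid)
    print(agree, differ, round(rate, 2))
```

With the synthetic inputs above, the comparison yields 384 agreements and 6 disagreements. In the case study, the disagreements were the articles judged to have been misclassified by the human-only process; a comparison like this one only flags them, and a human still has to decide which side was right.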

Source journal: Future Internet (Computer Science: Computer Networks and Communications)

CiteScore: 7.10
Self-citation rate: 5.90%
Articles published: 303
Review time: 11 weeks

About the journal: Future Internet is a scholarly open access journal that provides an advanced forum for science and research concerned with the evolution of Internet technologies and related smart systems for “Net-Living” development. The general reference subject is therefore the evolution towards the future internet ecosystem, which is feeding a continuous, intensive, artificial transformation of the lived environment, for a widespread and significant improvement of well-being in all spheres of human life (private, public, professional). Included topics are:
• advanced communications network infrastructures
• evolution of internet basic services
• internet of things
• netted peripheral sensors
• industrial internet
• centralized and distributed data centers
• embedded computing
• cloud computing
• software defined network functions and network virtualization
• cloud-let and fog-computing
• big data, open data and analytical tools
• cyber-physical systems
• network and distributed operating systems
• web services
• semantic structures and related software tools
• artificial and augmented intelligence
• augmented reality
• system interoperability and flexible service composition
• smart mission-critical system architectures
• smart terminals and applications
• pro-sumer tools for application design and development
• cyber security compliance
• privacy compliance
• reliability compliance
• dependability compliance
• accountability compliance
• trust compliance
• technical quality of basic services