Screening articles for systematic reviews with ChatGPT
Eugene Syriani, Istvan David, Gauransh Kumar
Journal of Computer Languages, volume 80, Article 101287 (published 2024-07-08)
DOI: 10.1016/j.cola.2024.101287
URL: https://www.sciencedirect.com/science/article/pii/S2590118424000303
Abstract
Systematic reviews (SRs) provide valuable evidence for guiding new research directions. However, the manual effort involved in selecting articles for inclusion in an SR is error-prone and time-consuming. While screening articles has traditionally been considered challenging to automate, the advent of large language models offers new possibilities. In this paper, we discuss the effect of using ChatGPT on the SR process. In particular, we investigate the effectiveness of different prompt strategies for automating the article screening process using five real SR datasets. Our results show that ChatGPT can reach up to 82% accuracy. The best-performing prompts specify exclusion criteria and avoid negative shots. However, prompts should be adapted to different corpus characteristics.
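The abstract reports that the best-performing prompts state exclusion criteria and omit negative shots. As an illustration only, the following Python sketch shows one way such a prompt could be assembled and how screening accuracy could be computed against human decisions. The prompt wording and the names `build_screening_prompt`, `parse_decision`, and `accuracy` are hypothetical and are not taken from the paper; no model is called, so the reply passed to `parse_decision` stands in for whatever the model returns.

```python
from typing import Sequence


def build_screening_prompt(
    topic: str,
    exclusion_criteria: Sequence[str],
    positive_examples: Sequence[str],
    candidate_abstract: str,
) -> str:
    """Assemble a screening prompt.

    Assumption: this wording is illustrative, not the authors' prompt.
    It lists exclusion criteria and only positive (included) examples,
    i.e. it contains no negative shots.
    """
    lines = [f"You are screening articles for a systematic review on: {topic}."]
    lines.append("Exclude an article if any of the following applies:")
    lines.extend(f"- {criterion}" for criterion in exclusion_criteria)
    if positive_examples:
        lines.append("Examples of abstracts that were included:")
        lines.extend(f"- {example}" for example in positive_examples)
    lines.append("Abstract to screen:")
    lines.append(candidate_abstract)
    lines.append("Answer with exactly one word: INCLUDE or EXCLUDE.")
    return "\n".join(lines)


def parse_decision(reply: str) -> bool:
    """Return True for INCLUDE; any other reply is treated as EXCLUDE."""
    return reply.strip().upper().startswith("INCLUDE")


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of screening decisions that match the human labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one decision is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

In this sketch, each candidate abstract yields one prompt, the model's one-word reply is mapped to a boolean, and `accuracy` compares those booleans with the reviewers' inclusion decisions for the same articles.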