Improving Triage Accuracy in Prehospital Emergency Telemedicine: Scoping Review of Machine Learning-Enhanced Approaches
Daniel Raff, Kurtis Stewart, Michelle Christie Yang, Jessie Shang, Sonya Cressman, Roger Tam, Jessica Wong, Martin C Tammemägi, Kendall Ho
Interactive Journal of Medical Research, vol. 13, e56729. Published 2024-09-11. DOI: 10.2196/56729 (https://doi.org/10.2196/56729)
Abstract
Background: Prehospital telemedicine triage systems combined with machine learning (ML) methods have the potential to improve triage accuracy and safely redirect low-acuity patients away from the emergency department. Research in prehospital settings remains limited, yet it is needed because emergency department overcrowding and adverse patient outcomes are increasingly common.
Objective: In this scoping review, we sought to characterize the existing methods for ML-enhanced telemedicine emergency triage. In order to support future research, we aimed to delineate what data sources, predictors, labels, ML models, and performance metrics were used, and in which telemedicine triage systems these methods were applied.
Methods: We conducted a scoping review, querying multiple databases (MEDLINE, PubMed, Scopus, and IEEE Xplore) through February 24, 2023, to identify potential ML-enhanced methods. For eligible studies, we extracted relevant characteristics, including prehospital triage setting, types of predictors, ground truth labeling method, ML models used, and performance metrics. Inclusion was restricted to studies of triage in emergency telemedicine services that used ML methods on an undifferentiated (disease nonspecific) population. Only primary research studies in English were considered. Furthermore, only studies using data collected remotely (as opposed to derived from physical assessments) were included. To limit bias, we exclusively included articles identified through our predefined search criteria and had 3 researchers (DR, JS, and KS) independently screen the resulting studies. We conducted a narrative synthesis of findings to establish a knowledge base in this domain and identify potential gaps to be addressed in forthcoming ML-enhanced methods.
Results: A total of 165 unique records were screened for eligibility and 15 were included in the review. Most studies applied ML methods during emergency medical dispatch (7/15, 47%) or used chatbot applications (5/15, 33%). Patient demographics and health status variables were the most common predictors, with a notable absence of social variables. Frequently used ML models included support vector machines and tree-based methods. ML-enhanced models typically outperformed conventional triage algorithms, and we found a wide range of methods used to establish ground truth labels.
Conclusions: This scoping review observed heterogeneity in dataset size, predictors, clinical setting (triage process), and reported performance metrics. The consistent use of standard structured predictors, including age, sex, and comorbidities, across articles suggests the importance of these inputs; however, there was a notable absence of other potentially useful data, including medications, social variables, and health system exposure. Ground truth labeling practices should be reported in a standard fashion because true model performance hinges on these labels. This review calls for future work to form a standardized framework, thereby supporting consistent reporting and performance comparisons across ML-enhanced prehospital triage systems.
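To illustrate why reported performance hinges on the ground truth labels, the following minimal Python sketch scores one fixed set of triage predictions against two different label definitions. Everything in it is an assumption made for illustration: the eight toy cases, the two labeling schemes (retrospective clinician review and hospital admission), and the helper name `sensitivity_specificity` are hypothetical and are not taken from the reviewed studies.

```python
# Hypothetical toy data: 1 = high acuity, 0 = low acuity, for eight cases.
predictions = [1, 1, 0, 0, 1, 0, 1, 0]

# Two hypothetical ground truth definitions for the same eight cases.
labels_clinician_review = [1, 1, 0, 0, 1, 0, 0, 0]    # assumed: retrospective clinician judgment
labels_hospital_admission = [1, 0, 0, 1, 1, 0, 0, 0]  # assumed: patient was admitted


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumes y_true contains at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # high acuity, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # high acuity, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # low acuity, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # low acuity, flagged
    return tp / (tp + fn), tn / (tn + fp)


for name, labels in [
    ("clinician review", labels_clinician_review),
    ("hospital admission", labels_hospital_admission),
]:
    sens, spec = sensitivity_specificity(labels, predictions)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy setup the same predictions score a sensitivity of 1.00 and specificity of 0.80 against the clinician-review labels, but 0.67 and 0.60 against the admission labels, which is why comparing models across studies is difficult unless the labeling method is reported.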