
medRxiv - Health Informatics: Latest Articles

A case is not a case is not a case - challenges and solutions in determining urolithiasis caseloads using the digital infrastructure of a clinical data warehouse
Pub Date : 2024-09-18 DOI: 10.1101/2024.09.13.24313333
Martin Schoenthaler, Noah Hempen, Maria Weymann, Maximilian Ferry von Bargen, Maximilian Glienke, Antonia Elsaesser, Max Behrens, Harald Binder, Nadine Binder
Background: To provide more evidence in urolithiasis research, we have established the German Nationwide Register for RECurrent URolithiasis (RECUR) using local clinical data warehouses (CDWH). For RECUR and other registers relying on digitalized clinical data, it is crucial to ensure the data's reliability for answering scientific questions. In this work, we aim to compare the results of different CDWH-based queries on urolithiasis cases with manual case extraction from the primary source. Methods: Sources for data extraction included the Medical Center University of Freiburg (MCUF) hospital information system (HIS), MCUF performance data (a clinical data set with merged data from patients including data from various time points throughout their treatment), and MCUF reimbursement data. We extracted data on caseloads in urolithiasis algorithmically (performance and reimbursement data) and compared those to a reference group compiled from manually extracted data from the local HIS and algorithmically extracted data. Results: Algorithmic extraction based on performance data resulted in correct and complete case identification as compared to the reference group. The case numbers from manual extraction from HIS data and algorithmic extraction from reimbursement data differed by 14% and 12%, respectively. The reasons for deviations in HIS data included human errors and a lack of data availability from different wards. Deviations in reimbursement data arose primarily due to the merging of cases in the context of reimbursement mechanisms. As the CDWH at MCUF is part of the German Medical Informatics Initiative (MII), the results can be transferred to other medical centers with a similar CDWH structure. Conclusions: The current study provides firm evidence of the importance of clearly defining a study's target variable, e.g., urolithiasis cases, and of thoroughly understanding the data sources and modes used to extract the target data. Our work clearly shows that, depending on the data source, a case is not a case is not a case.
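The effect of case merging on caseload counts can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the encounter records are invented, and the 30-day merge window is a hypothetical stand-in, since the abstract does not describe the actual reimbursement merging rules.

```python
from datetime import date

# Hypothetical encounter records: (patient_id, admission_date). Invented data.
encounters = [
    ("p1", date(2023, 1, 3)),
    ("p1", date(2023, 1, 20)),
    ("p1", date(2023, 6, 1)),
    ("p2", date(2023, 2, 10)),
    ("p3", date(2023, 3, 5)),
    ("p3", date(2023, 3, 12)),
]


def count_clinical_cases(records):
    """Count every encounter as its own case."""
    return len(records)


def count_merged_cases(records, window_days=30):
    """Count cases after merging same-patient stays.

    Assumed rule (not from the abstract): a stay that begins within
    window_days of the start of the patient's current merged case is
    folded into that case instead of being counted separately.
    """
    merged = 0
    case_start = {}  # patient_id -> start date of the current merged case
    for patient, admitted in sorted(records):
        start = case_start.get(patient)
        if start is None or (admitted - start).days > window_days:
            merged += 1
            case_start[patient] = admitted
    return merged


def relative_deviation(count, reference):
    """Signed deviation of a count from a reference count."""
    return (count - reference) / reference


clinical = count_clinical_cases(encounters)
billing = count_merged_cases(encounters)
print(clinical, billing, f"{relative_deviation(billing, clinical):.1%}")
```

With these invented records, the same six encounters yield a smaller count once the merge rule is applied, which is the kind of discrepancy the abstract attributes to reimbursement data.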
Citations: 0
Reliable Online Auditory Cognitive Testing: An observational study
Pub Date : 2024-09-17 DOI: 10.1101/2024.09.17.24313794
Meher Lad, John-Paul Taylor, Timothy Griffiths
Technological advances have allowed researchers to conduct research remotely. Online auditory testing has received interest since the Covid-19 pandemic. A number of web-based developments have improved the range of auditory tasks during remote participation. Most of these studies have been conducted in young, motivated individuals who are comfortable with technology. Such studies have also used stimuli testing auditory perceptual abilities. Research on auditory cognitive abilities in real-world older adults is lacking. In this study, we assess the reproducibility of a range of auditory cognitive abilities in older adults, with a range of hearing abilities, who took part in in-person and online experiments. Participants performed a questionnaire-based assessment and were asked to complete two verbal speech-in-noise perception tasks, for digits and sentences, and two auditory memory tasks, for different sound features. In the first part of the study, 58 participants performed these tests in-person and online in order to test the reproducibility of the tasks. In the second part, 147 participants conducted all the tasks online in order to test if previously published findings from in-person research were reproducible. We found that older adults under the age of 70 and those with better hearing were more likely to take part in online testing. The questionnaire-based test had significantly better reproducibility than the behavioural auditory tests, but there were no differences in reproducibility between in-person and online auditory cognitive metrics. Relationships with age and hearing thresholds in an in-person or online setting were not significantly different. Furthermore, important relationships between auditory metrics, evidenced in literature previously, were reproducible online. This study suggests that auditory cognitive testing may be reliably conducted online.
Citations: 0
Characterizing the connection between Parkinson's disease progression and healthcare utilization
Pub Date : 2024-09-16 DOI: 10.1101/2024.09.15.24313708
Lane Fitzsimmons, Francesca Frau, Sylvie Bozzi, Karen Chandross, Brett Beaulieu-Jones
Background and Objectives: Parkinson's disease (PD) progression can be characterized in terms of healthcare utilization by analyzing clinical events across different stages of disease. Methods: PD progression was measured by the Hoehn & Yahr (H&Y) clinical rating scale and clinical events at each stage were evaluated. Natural language processing and a large language model were used to extract H&Y values from real-world data, enabling a larger cohort than manually collected studies, and multi-state hidden Markov models were used for H&Y progression likelihood. Results: Within one year, most patients in H&Y stages 2-5 remained in the same stage. Stage transitions, when they occurred, were most frequently to the next higher stage. Higher H&Y stages were associated with discharges into long-term care and higher rates of additional clinical events. Conclusions: Stratifying key clinical events by H&Y score demonstrates the increase in health care utilization and economic burden with PD severity. Modelling the progression likelihood establishes a progression timeline and emphasizes the unmet need to identify treatment options that stop or slow these transitions.
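The idea of a stage-progression likelihood can be sketched with a plain Markov chain over H&Y stages. This is a simplification and not the study's method: the abstract reports multi-state hidden Markov models, whereas the sketch below treats the stage as directly observed. The one-year transition probabilities are invented for illustration (chosen only so that staying in the same stage is most likely and the next higher stage is the most common transition), and the names `P`, `step`, and `distribution_after` are hypothetical.

```python
STAGES = [1, 2, 3, 4, 5]

# Invented one-year transition probabilities; row i is the current stage,
# column j the stage one year later. Each row sums to 1. No backward
# transitions are modeled, which is an assumption of this sketch.
P = [
    [0.70, 0.25, 0.05, 0.00, 0.00],
    [0.00, 0.80, 0.15, 0.05, 0.00],
    [0.00, 0.00, 0.80, 0.15, 0.05],
    [0.00, 0.00, 0.00, 0.85, 0.15],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]


def step(dist, matrix):
    """Advance a stage distribution by one year (vector-matrix product)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(start_stage, years, matrix=P):
    """Stage distribution after a number of years, starting in one stage."""
    dist = [1.0 if stage == start_stage else 0.0 for stage in STAGES]
    for _ in range(years):
        dist = step(dist, matrix)
    return dist


# Probability of each stage five years after being in stage 2.
for stage, prob in zip(STAGES, distribution_after(2, 5)):
    print(f"stage {stage}: {prob:.3f}")
```

Repeated application of the one-year matrix is what turns per-year transition likelihoods into a multi-year progression timeline.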
Citations: 0
Federated Multiple Imputation for Variables that Are Missing Not At Random in Distributed Electronic Health Records
Pub Date : 2024-09-16 DOI: 10.1101/2024.09.15.24313479
Yi Lian, Xiaoqian Jiang, Qi Long
Large electronic health records (EHR) have been widely implemented and are available for research activities. The magnitude of such databases often requires storage and computing infrastructure that is distributed across different sites. Restrictions on data-sharing due to privacy concerns have been another driving force behind the development of a large class of distributed and/or federated machine learning methods. While the missing data problem is also present in distributed EHRs, and is potentially more complex there, distributed multiple imputation (MI) methods have not received as much attention. An important advantage of distributed MI, as well as distributed analysis, is that it allows researchers to borrow information across data sites, mitigating potential fairness issues for minority groups that do not have enough volume at certain sites. In this paper, we propose communication-efficient and privacy-preserving distributed MI algorithms for variables that are missing not at random.
Citations: 0
Generative AI and Large Language Models in Reducing Medication Related Harm and Adverse Drug Events - A Scoping Review
Pub Date : 2024-09-14 DOI: 10.1101/2024.09.13.24313606
Jasmine Chiat Ling Ong, Michael Chen, Ning Ng, Kabilan Elangovan, Nichole Yue Ting Tan, Liyuan Jin, Qihuang Xie, Daniel Shu Wei Ting, Rosa Rodriguez-Monguio, David Bates, Nan Liu
Background: Medication-related harm has a significant impact on global healthcare costs and patient outcomes, accounting for deaths in 4.3 per 1000 patients. Generative artificial intelligence (GenAI) has emerged as a promising tool in mitigating risks of medication-related harm. In particular, large language models (LLMs) and well-developed generative adversarial networks (GANs) show promise for healthcare-related tasks. This review aims to explore the scope and effectiveness of generative AI in reducing medication-related harm, identifying existing development and challenges in research. Methods: We searched PubMed, Web of Science, Embase, and Scopus for peer-reviewed articles published from January 2012 to February 2024. We included studies focusing on the development or application of generative AI in mitigating risk for medication-related harm during the entire medication use process. We excluded studies using traditional AI methods only, those unrelated to healthcare settings, or those concerning non-prescribed medication uses such as supplements. Extracted variables included study characteristics, AI model specifics and performance, application settings, and any patient outcome evaluated. Findings: A total of 2203 articles were identified, and 14 met the criteria for inclusion in the final review. We found that generative AI and large language models were used in a few key applications: drug-drug interaction identification and prediction, clinical decision support, and pharmacovigilance. While the performance and utility of these models varied, they generally showed promise in areas like early identification and classification of adverse drug events and support in decision-making for medication management. However, no studies tested these models prospectively, suggesting a need for further investigation into the integration and real-world application of generative AI tools to improve patient safety and healthcare outcomes effectively. Interpretation: Generative AI shows promise in mitigating medication-related harms, but there are gaps in research rigor and ethical considerations. Future research should focus on creation of high-quality, task-specific benchmarking datasets for medication safety and real-world implementation outcomes.
Citations: 0
Accuracy of Online Symptom-Assessment Applications, Large Language Models, and Laypeople for Self-Triage Decisions: A Systematic Review
Pub Date : 2024-09-14 DOI: 10.1101/2024.09.13.24313657
Marvin Kopka, Niklas von Kalckreuth, Markus A. Feufel
Symptom-Assessment Applications (SAAs, e.g., NHS 111 online) that assist medical laypeople in deciding if and where to seek care (self-triage) are gaining popularity, and their accuracy has been examined in numerous studies. With the public release of Large Language Models (LLMs, e.g., ChatGPT), their use in such decision-making processes is growing as well. However, there is currently no comprehensive evidence synthesis for LLMs, and no review has contextualized the accuracy of SAAs and LLMs relative to the accuracy of their users. Thus, this systematic review evaluates the self-triage accuracy of both SAAs and LLMs and compares them to the accuracy of medical laypeople. A total of 1549 studies were screened, with 19 included in the final analysis. The self-triage accuracy of SAAs was found to be moderate but highly variable (11.5 - 90.0%), while the accuracy of LLMs (57.8 - 76.0%) and laypeople (47.3 - 62.4%) was moderate with low variability. Despite some published recommendations to standardize evaluation methodologies, there remains considerable heterogeneity among studies. The use of SAAs should not be universally recommended or discouraged; rather, their utility should be assessed based on the specific use case and tool under consideration.
Citations: 0
The Breast Cancer Genetic Testing Experience: Probing the Potential Utility of an Online Decision Aid in Risk Perception and Decision Making
Pub Date : 2024-09-14 DOI: 10.1101/2024.09.13.24313647
Anna Vaynrub, Brian Salazar, Yilin Eileen Feng, Harry West, Alissa Michel, Subiksha Umakanth, Katherine Crew, Rita Kukafka
Background: Despite the role of pathogenic variants (PVs) in cancer predisposition genes conferring significantly increased risk of breast cancer (BC), uptake of genetic testing (GT) remains low, especially among ethnic minorities. Our prior study found that a patient decision aid, RealRisks, improved patient-reported outcomes relative to standard educational materials. This study examined patients' GT experience and its influence on subsequent actions. We also sought to identify areas for improvement in RealRisks that would expand its focus from improved GT decision-making to understanding results. Methods: Women enrolled in the parent randomized controlled trial were recruited and interviewed. Demographic data were collected from surveys in the parent study. Interviews were conducted, transcribed, and coded to identify recurring themes. Descriptive statistics were generated to compare the interviewed subgroup to the original study cohort of 187 women. Results: Of the 22 women interviewed, 11 (50%) had positive GT results: 2 (9.1%) with a BRCA1/2 PV and 9 (40.9%) with variants of uncertain significance (VUS). Median age was 40.5 years and 15 (71.4%) identified as non-Hispanic. Twenty (90.9%) reported a family history of BC, and 2 (9.1%) reported a family history of BRCA1/2 PV. The emerging themes included a preference for structured communication of GT results and the need for more actionable knowledge to mitigate BC risk, especially among patients with VUS or negative results. Few patients reported lifestyle changes following the return of their results, although they did understand that their behaviors can impact their BC risk. Conclusions: Patients preferred a structured explanation of their GT results to facilitate a more personal testing experience. While most did not change lifestyle behaviors in response to their GT results, there was a consistent call for further guidance following the initial discussion of GT results. Empowering patients, especially those with negative or VUS results, with the knowledge and context to internalize the implications of their results and form accurate risk perception represents a powerful opportunity to mediate subsequent risk management strategies. Informed by this study, future work will expand RealRisks to foster an accurate perception of GT results and include decision support to navigate concrete next steps.
Citations: 0
Forecasting local surges in COVID-19 hospitalizations through adaptive decision tree classifiers
Pub Date : 2024-09-13 DOI: 10.1101/2024.09.12.24313570
Rachel E Murray-Watson, Alyssa Bilinski, Reza Yaesoubi
During the COVID-19 pandemic, many communities across the US experienced surges in hospitalizations, which strained local hospital capacity and affected the overall quality of care. Even when effective vaccines became available, many communities remained at high risk of surges in COVID-19-related hospitalizations due to waning immunity, low uptake of booster vaccinations, and the continual emergence of new variants of SARS-CoV-2. Some risk metrics, such as the CDC's Community Levels, were developed to predict the impact of COVID-19 on the community-level healthcare system based on routine surveillance data. However, they had limited utility as they were not routinely updated based on accumulating data and were not directly linked to specific outcomes, such as surges in COVID-19 hospitalizations beyond local capacities. Regression models could resolve these limitations, but they have limited interpretability and do not convey the reasoning behind their predictions. In this paper, we evaluated decision tree classifiers that were developed in "real-time" to predict surges in local hospitalizations due to COVID-19 between July 2020 and November 2022. These classifiers would have provided visually intuitive and interpretable decision rules for local decision-makers to understand and act upon and, by being updated weekly, would have responded to changes in the epidemic. We showed that these classifiers exhibit reasonable predictive ability, with an area under the receiver operating characteristic curve (auROC) above 80%. These classifiers maintained their performance temporally (i.e., over the duration of the pandemic) and spatially (i.e., across US counties). We also showed that these classifiers outperformed the CDC's Community Levels for predicting high hospital occupancy.
Citations: 0
Gut-Brain Nexus: Mapping Multi-Modal Links to Neurodegeneration at Biobank Scale
Pub Date : 2024-09-13 DOI: 10.1101/2024.09.12.24313490
Mohammad Shafieinouri, Samantha Hong, Artur Schuh, Mary B. Makarious, Rodrigo Sandon, Paul Suhwan Lee, Emily Simmonds, Hirotaka Iwaki, Gracelyn Hill, Cornelis Blauwendraat, Valentina Escott-Price, Yue A. Qi, Alastair J. Noyce, Armando Reyes-Palomares, Hampton Leonard, Malu Tansey, Andrew Singleton, Mike A. Nalls, Kristin S. Levine, Sara Bandres-Ciga
Alzheimer's disease (AD) and Parkinson's disease (PD) are influenced by genetic and environmental factors. Using data from UK Biobank, SAIL Biobank, and FinnGen, we conducted an unbiased, population-scale study to: 1) investigate how 155 endocrine, nutritional, metabolic, and digestive system disorders are associated with AD and PD risk prior to their diagnosis, considering known genetic influences; 2) assess plasma biomarkers' specificity for AD or PD in individuals with these conditions; and 3) develop a multi-classification model integrating genetics, proteomics, and clinical data relevant to conditions affecting the gut-brain axis. Our findings show that certain disorders elevate AD and PD risk before diagnosis, including insulin-dependent and non-insulin-dependent diabetes mellitus, noninfective gastroenteritis and colitis, functional intestinal disorders, and bacterial intestinal infections, among others. Polygenic risk scores revealed lower genetic predisposition to AD and PD in individuals with co-occurring disorders in the study categories, underscoring the importance of regulating the gut-brain axis to potentially prevent or delay the onset of neurodegenerative diseases. The proteomic profile of AD/PD cases was influenced by comorbid endocrine, nutritional, metabolic, and digestive system conditions. Importantly, we developed multi-omics prediction models integrating clinical, genetic, proteomic, and demographic data, the combination of which performs better than any single-paradigm approach in disease classification. This work aims to illuminate the intricate interplay between various physiological factors involved in the gut-brain axis and the development of AD and PD, providing a multifactorial systemic understanding that goes beyond traditional approaches.
Citations: 0
Technology-Supported Self-Triage Decision Making: A Mixed-Methods Study
Pub Date : 2024-09-13 DOI: 10.1101/2024.09.12.24313558
Marvin Kopka, Sonja Mei Wang, Samira Kunz, Christine Schmid, Markus A. Feufel
Symptom-Assessment Applications (SAAs) and Large Language Models (LLMs) are increasingly used by laypeople to navigate care options. Although humans ultimately make the final decision when using these systems, previous research has typically examined the performance of humans and SAAs/LLMs separately. Thus, it is unclear how decision-making unfolds in such hybrid human-technology teams and whether SAAs/LLMs can improve laypeople's decisions. To address this gap, we conducted a convergent parallel mixed-methods study with semi-structured interviews and a randomized controlled trial. Our interview data revealed that in human-technology teams, decision-making is influenced by factors before, during, and after interaction. Users tend to rely on technology for information gathering and analysis but remain responsible for information integration and the final decision. Based on these results, we developed a model for technology-assisted self-triage decision-making. Our quantitative results indicate that when using a high-performing SAA, laypeople's decision accuracy improved from 53.2% to 64.5% (OR = 2.52, p < .001). In contrast, decision accuracy remained unchanged when using an LLM (54.8% before vs. 54.2% after usage, p = .79). These findings highlight the importance of studying SAAs/LLMs with humans in the loop, as opposed to analyzing them in isolation.
Citations: 0