Mayo Clinic Proceedings. Digital health: Latest Articles (Page 9)

Impact of Digital Interventions in Occupational Health Care: A Systematic Review

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-18 DOI: 10.1016/j.mcpdig.2025.100216

Mirjam M. Jern-Matintupa MD, MPH , Anita M. Riipinen MD, PhD , Merja K. Laine MD, PhD

Objective

To assess the existing body of evidence and impact of digital interventions on occupational health care.

Methods

The search strategy and review process were conducted in accordance with the PRISMA guidelines. The search was carried out during a period from January 1, 2013 to June 5, 2023, using the SCOPUS and Ovid Medline databases. After the identification of the relevant records, screening was conducted in 3 stages, following specific predetermined inclusion and exclusion criteria. A data-extraction model was created on the basis of the aim of the review. The quality of the selected studies was evaluated using the Effective Public Health Practice framework. Owing to the heterogeneity of the outcome measures, we used narrative synthesis to summarize the findings.

Results

We identified 382 records in SCOPUS and 441 in Ovid Medline. We selected 54 studies to be included in the evidence synthesis. The health targets of the interventions varied widely, but we identified 2 main focus areas: sedentary behavior (n=17, 32%) and mental health (n=14, 26%). Even when the studies had the same health target, the outcomes and chosen measures varied widely. Given the considerable effect of the primary outcome, mental health appears to be a good target for digital interventions. Online training and computer software could be especially effective.

Conclusion

The potential positive impact of digital interventions on mental health, especially online training, should be leveraged by health care professionals and providers. In order to provide more specific recommendations for health care professionals, occupational health care researchers should strive for consensus on outcome measures.


Volume 3, Issue 2, Article 100216.

Citations: 0

Erratum to Leveraging the Metaverse for Enhanced Longevity as a Component of Health 4.0 [Mayo Clinic Proceedings: Digital Health. 2024;2:139-151]

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-12 DOI: 10.1016/j.mcpdig.2025.100215

Citations: 0

Selecting Wearable Devices to Measure Cardiovascular Functions in Community-Dwelling Adults: Application of a Practical Guide for Device Selection

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-12 DOI: 10.1016/j.mcpdig.2025.100202

Jessica K. Lu MEng , Weilan Wang PhD , Jorming Goh PhD , Andrea B. Maier MD, PhD

Continuous monitoring of cardiovascular functions can provide crucial insights into the health status and lifestyle behaviors of an individual. Wearable devices offer a convenient and cost-effective solution for collecting cardiovascular measurements outside clinical settings. However, the abundance of available devices poses challenges for researchers, health care professionals, and device users in selecting the most suitable one. This article illustrates the application of a practical guide for selecting wearable devices for the continuous monitoring of cardiovascular functions in community-dwelling adults who are generally healthy or have minimal, well-managed chronic conditions. An initial systematic review of the literature revealed 216 devices, each of which was assessed on the basis of 5 core criteria from the guide: (1) continuous monitoring capability, (2) device availability and suitability, (3) technical performance (accuracy and precision), (4) feasibility of use, and (5) cost evaluation. Of the 216 devices, 136 were capable of continuous monitoring. After the exclusion of unavailable and unsuitable devices, 53 devices underwent validation assessment of accuracy and precision. Although COSMIN criteria were applied to evaluate technical performance, a lack of validation for certain devices limits a comprehensive evaluation. After selection of valid devices, the feasibility and cost of 20 devices were examined. Wearable devices, such as the Apple Watch Series 9, Fitbit Charge 6, Garmin vívosmart 5, and Oura Ring Gen3, emerged as suitable devices to measure cardiovascular function in community-dwelling adults. The systematic process for device selection could also be applied to select wearable devices for the measurement of other physiologic variables and lifestyle behaviors.
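The screening funnel reported in this abstract can be tallied with a few lines of Python. This is a minimal sketch rather than the authors' procedure: the stage counts (216, 136, 53, 20) come from the abstract, while the stage labels, the `FUNNEL` name, and the `retention` helper are illustrative assumptions.

```python
# Stage counts as reported in the abstract; labels are paraphrased.
FUNNEL = [
    ("identified in the literature review", 216),
    ("capable of continuous monitoring", 136),
    ("available and suitable (validation assessed)", 53),
    ("valid (feasibility and cost examined)", 20),
]


def retention(stages):
    """For each stage after the first, return
    (label, count, % of previous stage, % of initial pool)."""
    initial = stages[0][1]
    rows = []
    for (_, previous), (label, count) in zip(stages, stages[1:]):
        rows.append(
            (
                label,
                count,
                round(100 * count / previous, 1),
                round(100 * count / initial, 1),
            )
        )
    return rows


for label, count, of_previous, of_initial in retention(FUNNEL):
    print(f"{label}: {count} ({of_previous}% of previous, {of_initial}% of all)")
```

Reading the counts this way shows where most devices drop out: fewer than one in ten of the devices initially identified reach the feasibility and cost stage.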


Volume 3, Issue 2, Article 100202.

Citations: 0

The PERFORM Study: Artificial Intelligence Versus Human Residents in Cross-Sectional Obstetrics-Gynecology Scenarios Across Languages and Time Constraints

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-08 DOI: 10.1016/j.mcpdig.2025.100206

Canio Martinelli MD , Antonio Giordano MD , Vincenzo Carnevale PhD , Sharon Raffaella Burk PhD , Lavinia Porto MD , Giuseppe Vizzielli MD , Alfredo Ercoli MD

Objective

To systematically evaluate the performance of artificial intelligence (AI) large language models (LLMs) compared with obstetrics-gynecology residents in clinical decision-making, examining diagnostic accuracy and error patterns across linguistic domains, time constraints, and experience levels.

Patients and Methods

In this cross-sectional study, we evaluated 8 AI LLMs and 24 obstetrics-gynecology residents (Years 1-5) using 60 standardized clinical scenarios. Most AI LLMs and all residents were assessed in May 2024, whereas ChatGPT o1-preview, GPT-4o, and Claude Sonnet 3.5 were evaluated in November 2024. The assessment framework incorporated English and Italian scenarios under both timed and untimed conditions, along with systematic error pattern analysis. The primary outcome was diagnostic accuracy; secondary end points included AI system stratification, resident progression, language impact, time pressure effects, and integration potential.

Results

The AI LLMs achieved superior overall accuracy (73.75%; 95% confidence interval [CI], 69.64%-77.49%) compared with residents (65.35%; 95% CI, 62.85%-67.76%; P<.001). High-performing AI systems (ChatGPT o1-preview, GPT-4o, and Claude Sonnet 3.5) achieved consistently high cross-linguistic accuracy (88.33%) with minimal language impact (6.67%±0.00%). Resident performance declined significantly under time constraints (from 73.2% to 56.5% adjusted accuracy; Cohen’s d=1.009; P<.001), whereas AI systems showed less deterioration. Error pattern analysis indicated a moderate correlation between AI and human reasoning (r=0.666; P<.001). Residents exhibited systematic progression from year 1 (44.7%) to year 5 (87.1%). Integration analysis found variable benefits across training levels, with maximum enhancement in early-career residents (+29.7%; P<.001).
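For readers who want to see how a 95% confidence interval around an accuracy figure can be computed, the following is a minimal sketch using the Wilson score interval. The abstract does not state which interval method was used or the underlying counts, so both are assumptions here: the pooled counts 354/480 (8 models × 60 scenarios) and 941/1440 (24 residents × 60 scenarios) are inferred from the reported percentages, and `wilson_interval` is an illustrative helper.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# Assumed pooled counts (not stated in the abstract).
ai_low, ai_high = wilson_interval(354, 480)          # 354/480 = 73.75%
res_low, res_high = wilson_interval(941, 1440)       # 941/1440 = 65.35%

print(f"AI LLMs:   {100 * ai_low:.2f}%-{100 * ai_high:.2f}%")
print(f"Residents: {100 * res_low:.2f}%-{100 * res_high:.2f}%")
```

Under these assumptions the intervals round to the values reported in the abstract, which suggests, but does not confirm, that a comparable method was used.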

Conclusion

High-performing AI LLMs showed strong diagnostic accuracy and resilience under linguistic and temporal pressures. These findings suggest that AI-enhanced decision-making may offer particular benefits in obstetrics and gynecology training programs, especially for junior residents, by improving diagnostic consistency and potentially reducing cognitive load in time-sensitive clinical settings.


Volume 3, Issue 2, Article 100206.

Citations: 0

Corrigendum to “Experience With an Optical Character Recognition Search Application for Review of Outside Medical Records”

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-08 DOI: 10.1016/j.mcpdig.2025.100208

Citations: 0

Global Artificial Intelligence Arms Race: The Future of Artificial Intelligence in Medicine

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-07 DOI: 10.1016/j.mcpdig.2025.100207

Hamrish Kumar Rajakumar MBBS

Citations: 0

A Systematic Review of Natural Language Processing Techniques for Early Detection of Cognitive Impairment

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-05 DOI: 10.1016/j.mcpdig.2025.100205

Ravi Shankar PhD , Anjali Bundele MPH , Amartya Mukhopadhyay FRCP

Objective

To systematically evaluate the effectiveness and methodologic approaches of natural language processing (NLP) techniques for early detection of cognitive decline through speech and language analysis.

Methods

We conducted a comprehensive search of 8 databases from inception through August 31, 2024, following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Studies were included if they used NLP techniques to analyze speech or language data for detecting cognitive impairment and reported diagnostic accuracy metrics. Two independent reviewers (R.S. and A.B.) screened articles and extracted data on study characteristics, NLP methods, and outcomes.

Results

Of 23,562 records identified, 51 studies met inclusion criteria, involving 17,340 participants (mean age, 72.4 years). Combined linguistic and acoustic approaches achieved the highest diagnostic accuracy (average 87%; area under the curve [AUC], 0.89) compared with linguistic-only (83%; AUC, 0.85) or acoustic-only approaches (80%; AUC, 0.82). Lexical diversity, syntactic complexity, and semantic coherence were consistently strong predictors across cognitive conditions. Picture description tasks were most common (n=21), followed by spontaneous speech (n=15) and story recall (n=8). Crosslinguistic applicability was found across 8 languages, although language-specific adaptations were necessary. Longitudinal studies (n=9) reported potential for early detection but were limited by smaller sample sizes (average n=159) compared with cross-sectional studies (n=42; average n=274).
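Lexical diversity, named above as a consistently strong predictor, is often approximated by the type-token ratio (unique words divided by total words). The sketch below is only an illustration of that one measure: the reviewed studies may use other metrics, and the simple English-only tokenizer, the `type_token_ratio` helper, and the sample sentence are all hypothetical.

```python
import re


def type_token_ratio(text: str) -> float:
    """Unique word forms divided by total word count (0.0 for empty input)."""
    # Naive tokenizer: lowercase ASCII letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical transcript fragment from a picture description task.
sample = "the boy is on the stool and the stool is tipping"
print(f"type-token ratio: {type_token_ratio(sample):.2f}")
```

Because the ratio falls as a text gets longer, comparisons are usually made on samples of similar length.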

Conclusion

Natural language processing techniques show promising diagnostic accuracy for detecting cognitive impairment across multiple languages and clinical contexts. Although combined linguistic-acoustic approaches appear most effective, methodologic heterogeneity and small sample sizes in existing studies suggest the need for larger, standardized investigations to establish clinical utility.


Volume 3, Issue 2, Article 100205.

Citations: 0

Comparison of Participant and Site Perceptions of Decentralized Clinical Trials in the USA

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-03 DOI: 10.1016/j.mcpdig.2025.100201

Roland Barge PhD , Patrick Floody MBA

Objective

To define potential participant and site perceptions of decentralized clinical trials (DCTs).

Participants and Methods

Two qualitative surveys were conducted between January 2022 and August 2022 to assess current awareness of, and perceptions about, DCTs. The first survey received 141 responses from staff at our clinical trial sites; the second survey received 481 responses from US-based healthy individuals or those living with an illness.

Results

Participants and sites differed in their perceptions of, and willingness toward, DCTs. Participants expressed more comfort with hybrid and fully remote trials than did the sites. Site staff were more concerned about, and less trusting of, DCTs than participants were; participants’ main concerns were regarding practicality and medical safety, whereas the focus for sites was on burden, trust, and security. Both sites and participants expressed confidence in fully remote clinical study activities when they have appropriate support; sites were less tolerant of fully remote clinical study activities if professional support was not provided. Overall, sites were more willing to manage the use of DCT-related technologies than were participants. It is highly likely that participants’ willingness to manage DCT technologies relates to the perceived burden of use (ie, willingness decreases as burden or impact on daily life increases). Sponsors, contract research organizations, and DCT vendors generally had positive views on DCTs. However, different stakeholders had different concerns.

Conclusion

These results highlight the need for collaborative research and development of DCTs, as well as a clear DCT framework and regulatory guidance.


Volume 3, Issue 2, Article 100201.

Citations: 0

CF Tummy Tracker: A Cystic Fibrosis–Specific Patient-Reported Outcome Measure for Daily Gastrointestinal Symptom Burden

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-03 DOI: 10.1016/j.mcpdig.2025.100203

Rebecca J. Calthorpe BMBS , Hisham A. Saumtally MBChB , Laura M. Howells PhD , Natalie J. Goodchild BA (Hons) , Bethinn C. Evans MSc , Zoe Elliott , Bu’Hussain Hayee PhD , Siobhán B. Carr MBBS , Caroline M. Elston MBBS , Alexander A.R. Horsley PhD , Daniel G. Peckham DM , Helen L. Barr PhD , Giles A.D. Major PhD , Iain D. Stewart PhD , Kim S. Thomas , Alan R. Smyth MD

Objective

To develop a cystic fibrosis (CF)–specific patient-reported outcome measure (PROM) to measure the daily burden of gastrointestinal symptoms for people with cystic fibrosis (pwCF) aged 12 years and older and address the lack of validated outcome measures for gastrointestinal symptoms in CF.

Patients and Methods

CF Tummy Tracker was developed through a 5-stage approach in accordance with regulatory guidance. This included development and refinement of a conceptual framework; item generation; refinement; reduction; selection; and initial PROM testing. A mixed-methods approach, consisting of expert panel discussions, a focus group, interviews, and an online survey, was used. In initial testing, participants completed the PROM daily for 14 days via a smartphone application. This study was performed from March 14, 2022, to December 12, 2023.

Results

The CF community were involved throughout the development via a focus group (n=7 pwCF), interviews (n=11 pwCF), and an online survey (n=180 pwCF). A formative model was confirmed for the PROM. The final PROM, CF Tummy Tracker, consists of 10 items capturing gastrointestinal symptom burden, tested in 151 pwCF. The PROM showed no floor or ceiling effects, high test–retest reliability (intra-class correlation coefficient=0.94), and strong correlation with the anchor question.
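A floor or ceiling effect is typically checked by counting how many respondents land on the lowest or highest possible score. The sketch below illustrates that check under stated assumptions: the 15% threshold is a commonly cited convention and is not taken from the abstract, and the score range, the sample scores, and both helper functions are hypothetical.

```python
def floor_ceiling_shares(scores, min_score, max_score):
    """Return the fraction of respondents at the minimum and maximum score."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return floor_share, ceiling_share


def has_floor_or_ceiling_effect(scores, min_score, max_score, threshold=0.15):
    """Flag an effect when more than `threshold` of respondents sit at an extreme.

    The 0.15 default is an assumed convention, not a value from the abstract.
    """
    floor_share, ceiling_share = floor_ceiling_shares(scores, min_score, max_score)
    return floor_share > threshold or ceiling_share > threshold


# Hypothetical total scores on an assumed 0-10 scale.
example_scores = [0, 3, 5, 7, 10, 4, 6, 8, 2, 9]
print(floor_ceiling_shares(example_scores, 0, 10))
print(has_floor_or_ceiling_effect(example_scores, 0, 10))
```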

Conclusion

CF Tummy Tracker aims to address the gap in validated CF-specific PROMs for daily completion. Further testing of the psychometric properties of the PROM is planned in a new patient cohort to validate its use in clinical trials and support its use in both electronic and paper formats to increase accessibility.


Mayo Clinic Proceedings. Digital health, Volume 3, Issue 2, Article 100203. Publication date: 2025-03-03.

Citations: 0

Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses From Closed-Domain Large Language Models Versus Clinical Teams

Mayo Clinic Proceedings. Digital health

Pub Date : 2025-03-01 DOI: 10.1016/j.mcpdig.2025.100198

Yuexing Hao MS , Jason Holmes PhD , Jared Hobson MD , Alexandra Bennett MD , Elizabeth L. McKone MD , Daniel K. Ebner MD , David M. Routman MD , Satomi Shiraishi MD , Samir H. Patel MD , Nathan Y. Yu MD , Chris L. Hallemeier MD , Brooke E. Ball MSN , Mark Waddle MD , Wei Liu PhD

Objective

To evaluate the effectiveness of RadOnc-generative pretrained transformer (GPT), a GPT-4-based large language model, in assisting with in-basket message response generation for prostate cancer treatment, with the goal of reducing the workload and time burden on clinical care teams while maintaining response quality.

Patients and Methods

RadOnc-GPT was integrated with electronic health records from both Mayo Clinic-wide databases and a radiation-oncology-specific database. The model was evaluated on 158 previously recorded in-basket message interactions, selected from 90 patients with nonmetastatic prostate cancer from the Mayo Clinic Department of Radiation Oncology in-basket message database in the calendar years 2022-2024. Quantitative natural language processing analysis and 2 grading studies, conducted by 5 clinicians and 4 nurses, were used to assess RadOnc-GPT’s responses. Three primary clinicians independently graded all messages, whereas a fourth senior clinician reviewed 41 responses with relevant discrepancies, and a fifth senior clinician evaluated 2 additional responses. The grading focused on 5 key areas: completeness, correctness, clarity, empathy, and editing time. The grading study was performed from July 20, 2024, to December 15, 2024.
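The multi-grader protocol described above (independent grading by three clinicians, with discrepant responses escalated to a senior reviewer) can be sketched as follows. The abstract does not give the scoring scale, the discrepancy rule, or how editing time was recorded, so the numeric scores, the `max_spread` threshold, and the function name are hypothetical, and editing time is left out of the scored criteria.

```python
from statistics import mean

# Scored criteria named in the abstract; editing time is omitted here
# because the abstract does not say how it was recorded.
CRITERIA = ("completeness", "correctness", "clarity", "empathy")


def summarize_grades(grades, max_spread=1):
    """Average independent grades and flag disagreement for senior review.

    `grades` is a list with one dict per grader, mapping criterion -> score.
    `max_spread` is a hypothetical rule: a response is escalated when any
    criterion's highest and lowest grade differ by more than this value.
    """
    means = {c: mean(g[c] for g in grades) for c in CRITERIA}
    spreads = {
        c: max(g[c] for g in grades) - min(g[c] for g in grades)
        for c in CRITERIA
    }
    needs_review = any(s > max_spread for s in spreads.values())
    return means, needs_review
```

Calling `summarize_grades` with three grader dicts returns the per-criterion means and a flag showing whether the response would go to a senior reviewer under the assumed rule.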

Results

RadOnc-GPT slightly outperformed the clinical care team in empathy while achieving comparable scores in completeness, correctness, and clarity. Five clinician graders identified key limitations in RadOnc-GPT’s responses, such as lack of context, insufficient domain-specific knowledge, inability to perform essential meta-tasks, and hallucination. It was estimated that RadOnc-GPT could save an average of 5.2 minutes per message for nurses and 2.4 minutes for clinicians, from reading the inquiry to sending the response.

Conclusion

RadOnc-GPT has the potential to considerably reduce the workload of clinical care teams by generating high-quality, timely responses for in-basket message interactions. This could lead to improved efficiency in health care workflows and reduced costs while maintaining or enhancing the quality of communication between patients and health care providers.


Volume 3, Issue 1, Article 100198.

Citations: 0