Journal of Medical Internet Research: Latest Articles

Multimodal Large Language Models for Cystoscopic Image Interpretation and Bladder Lesion Classification: Comparative Study.
IF 6 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2026-01-28 | DOI: 10.2196/87193
Yung-Chi Shih, Cheng-Yang Wu, Shi-Wei Huang, Chung-You Tsai
<p><strong>Background: </strong>Cystoscopy remains the gold standard for diagnosing bladder lesions; however, its diagnostic accuracy is operator dependent and prone to missing subtle abnormalities such as carcinoma in situ or misinterpreting mimic lesions (tumor, inflammation, or normal variants). Artificial intelligence-based image-analysis systems are emerging, yet conventional models remain limited to single tasks and cannot produce explanatory reports or articulate diagnostic reasoning. Multimodal large language models (MM-LLMs) integrate visual recognition, contextual reasoning, and language generation, offering interpretive capabilities beyond conventional artificial intelligence.</p><p><strong>Objective: </strong>This study aims to rigorously evaluate state-of-the-art MM-LLMs for cystoscopic image interpretation and lesion classification using clinician-defined stress-test datasets enriched with rare, diverse, and challenging lesions, focusing on diagnostic accuracy, reasoning quality, and clinical relevance.</p><p><strong>Methods: </strong>Four MM-LLMs (OpenAI-o3 and ChatGPT-4o [OpenAI]; Gemini 2.5 Pro and MedGemma-27B [Google]) were evaluated under blinded, randomized procedures across two tasks: (1) free-text image interpretation for anatomic site, findings, lesion reasoning, and final diagnosis (n=401) and (2) seven-class tumor-like lesion classification (n=113) within a multiple-choice framework (cystitis, polyps, papilloma, papillary urothelial carcinoma, carcinoma in situ, non-urothelial carcinoma, and none of the above). Three raters independently scored outputs using a 5-point Likert scale, and classification metrics (accuracy, sensitivity, specificity, Youden J index (Youden J), and Matthews correlation coefficient [MCC]) were calculated for lesion detection, biopsy indication, and malignancy endpoints. 
For optimization, model performance was compared between zero-shot and text-based in-context learning prompts that were prefixed with brief descriptions of tumor features.</p><p><strong>Results: </strong>The 401-image test set spanned 40 subcategories, with 322 (80.3%) containing abnormal findings in the image interpretation task. OpenAI-o3 demonstrated strong reasoning, with high satisfaction for anatomy (339/401, 84.5%) and findings (305/401, 76%), but lower satisfaction for lesion reasoning (211/401, 52.5%) and final diagnosis (193/401, 48.2%), indicating increasing difficulty with higher-order synthesis. Mean Likert score differences (OpenAI-o3 minus Gemini 2.5 Pro) were +0.27 for findings (adjusted P value: q=0.002), +0.24 for lesion reasoning (q=0.047), and +0.19 for final diagnosis. For clinically relevant endpoints in the full set, OpenAI-o3 achieved the most balanced performance, with lesion detection accuracy of 88.3%, sensitivity of 92%, specificity of 73.1%, Youden J of 0.650, and MCC of 0.635. In 7-class tumor-like lesion classification, OpenAI-o3 achieved accuracies of 73.5% for biopsy indication and 62.8% for malignancy, with a balanced sensitivity-specificity trade-off, outperforming the other models. Notably, OpenAI-o3 performed best on common malignant lesions. ChatGPT-4o and Gemini 2.5 Pro showed high sensitivity but low specificity, whereas MedGemma-27B underperformed. In-context learning improved OpenAI-o3's microaverage accuracy (40.7% to 46.0%; MCC 0.311 to 0.370) but yielded only slight specificity gains and minimal accuracy changes in the other models, possibly limited by the absence of paired image-text context.</p><p><strong>Conclusions: </strong>MM-LLMs show meaningful assistive potential in generating interpretable cystoscopy rationales and supporting biopsy triage and training. However, performance on difficult differential diagnoses remains modest and requires further optimization before safe clinical integration.</p>
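The abstract above reports accuracy, sensitivity, specificity, the Youden J index, and MCC for binary endpoints such as lesion detection. As a rough illustration of how these metrics relate to one another, the following minimal sketch computes them from a 2x2 confusion matrix. The function name `binary_metrics` and the counts in the usage line are hypothetical and are not the study's data; the abstract does not give the underlying confusion matrices.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common binary-classification metrics from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. All names here are illustrative assumptions.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    youden_j = sensitivity + specificity - 1
    # MCC denominator is zero when any marginal total is zero;
    # returning 0.0 in that case is a common convention.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_j": youden_j,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

Youden J weights sensitivity and specificity equally, whereas MCC uses all four cells of the matrix. This is one reason the two can diverge on a test set such as this one, where abnormal images (322 of 401) far outnumber normal ones.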
Journal of Medical Internet Research, vol. 28, e87193. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12895159/pdf/
Citations: 0
Ethical Knowledge, Challenges, and Institutional Strategies Among Medical AI Developers and Researchers: Focus Group Study.
IF 6 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2026-01-28 | DOI: 10.2196/79613
Sophia Fantus, Jinxu Li, Tianci Wang, Lu Tang
<p><strong>Background: </strong>As artificial intelligence (AI) becomes increasingly embedded in clinical decision-making and preventive care, it is urgent to address ethical concerns such as bias, privacy, and transparency to protect clinician and patient populations. Although prior research has examined the perspectives of medical AI stakeholders, including clinicians, patients, and health system leaders, far less is known about how medical AI developers and researchers understand and engage with ethical challenges as they develop AI tools. This gap is consequential because developers' ethical awareness, decision-making, and institutional environments influence how AI tools are conceptualized and deployed in practice. Thus, it is essential to understand how developers perceive these issues and what supports they identify as necessary for ethical AI development.</p><p><strong>Objective: </strong>The objectives of the study were twofold: (1) to examine medical AI developers' and researchers' knowledge, attitudes, and experiences with AI ethics; and (2) to identify recommendations to enhance and strengthen interpersonal and institutional ethics-focused training and support.</p><p><strong>Methods: </strong>We conducted 2 semistructured focus groups (60-90 minutes each) in 2024 with 13 AI developers and researchers affiliated with 5 US-based academic institutions. Participants' work spanned a wide variety of medical AI applications, including Alzheimer disease prediction, clinical imaging, electronic health records analysis, digital health, counseling and behavioral health, and genotype-phenotype modeling. Focus groups were conducted via Microsoft Teams, recorded, and transcribed verbatim. We applied conventional qualitative content analysis to inductively identify emerging concepts, categories, and themes. 
Coding was performed independently by 3 researchers, with consensus reached through iterative team meetings.</p><p><strong>Results: </strong>The analysis identified four key themes: (1) AI ethics knowledge acquisition: participants reported learning about ethics informally through peer-reviewed literature, reviewer feedback, social media, and mentorship rather than through structured training; (2) ethical encounters: participants described recurring ethical challenges related to data bias, patient privacy, generative AI use, commercialization pressures, and a tendency for research environments to prioritize model accuracy over ethical reflection; (3) reflections on ethical implications: participants expressed concern about downstream effects on patient care and clinician autonomy, and model generalizability, noting that rapid technological innovation outpaces regulatory and evaluative processes; and (4) strategies to mitigate ethical concerns: recommendations included clearer institutional guidelines, ethics checklists, interdisciplinary collaboration, multi-institutional data sharing, enhanced institutional review board support, and the inclusion of bioethicists in AI research teams.</p><p><strong>Conclusions: </strong>Medical AI developers and researchers recognize significant ethical challenges in their work but lack the structured training, resources, and institutional mechanisms to address them. The findings underscore the need for institutions to consider embedding ethics into the research process through practical tools, mentorship, and interdisciplinary partnerships. Strengthening these supports is essential for preparing the next generation of developers to design and deploy ethical AI in health care.</p>
Journal of Medical Internet Research, vol. 28, e79613. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12895146/pdf/
Citations: 0
Development and User-Centered Evaluation of Smart Systems for Loneliness Monitoring in Older Adults: Mixed Methods Study.
IF 6 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2026-01-28 | DOI: 10.2196/81027
Yi Zhou, Jessica Rees, Faith Matcham, Ashay Patel, Michela Antonelli, Anthea Tinker, Sebastien Ourselin, Wei Liu
<p><strong>Background: </strong>Loneliness is a critical issue among older adults and constitutes a significant risk factor for a range of physical and mental health conditions. However, current assessment methods primarily rely on self-report questionnaires and clinical evaluations, which are susceptible to recall bias and social desirability bias, highlighting the need for more objective and continuous assessment approaches. Recent studies have reported associations between physiological and behavioral indicators and the experience of loneliness in older adults. While these technologies have demonstrated correlations between physiological and behavioral sensor data and the experience of loneliness, their implementation has been limited. Most systems rely on fixed-location sensors or smartphone apps, with little attention given to the integration of these tools into users' daily routines. To date, no published studies have applied smart textile technology, which integrates sensing capabilities directly into garments or furniture, as a medium for loneliness detection. This study addresses that gap by exploring the usability, experiential acceptability, and ethical considerations of smart textile-based monitoring systems.</p><p><strong>Objective: </strong>This study aims to assess the perceived usability, acceptability, and emotional resonance of a smart loneliness monitoring system integrating sensing garments, furniture, and a mobile app and identify design implications to guide future improvement and promote sustained engagement among older adults.</p><p><strong>Methods: </strong>Building on earlier conceptual research, a functional prototype system was developed and evaluated through 2 immersive in-person workshops with older adults (N=10). A mixed methods approach was applied, combining structured questionnaires, sensory ethnographic observations, focus group discussions, and experience-based co-design. 
Quantitative data were analyzed descriptively, and qualitative data were analyzed thematically to explore user perceptions related to system usability, emotional response, lifestyle compatibility, and ethical considerations.</p><p><strong>Results: </strong>Quantitative data indicated high user satisfaction in dimensions such as comfort, ease of use, and feedback clarity. However, trust in long-term monitoring and willingness to use the system regularly varied. Thematic analysis revealed 4 main areas influencing acceptance: wearability, usability, and daily integration; trust, privacy, and data control; perceptions of loneliness and the limits of detection; and adoption, applicability, and ethical futures. Participants emphasized the need for discretion, personalization, and human oversight in system feedback and data-sharing mechanisms.</p><p><strong>Conclusions: </strong>The resulting prototype was positively received, demonstrating the potential of smart systems for passive and personalized loneliness monitoring among older adults. However, adoption is shaped by autonomy, emotional sensitivity, and contextual integration. Future development should focus on modularity, transparency, and integration with care infrastructure to ensure ethical and sustainable deployment.</p>
Journal of Medical Internet Research, vol. 28, e81027. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12895156/pdf/
Citations: 0
Predicting the Intention to Use Generative Artificial Intelligence for Health Information: Comparative Survey Study.
IF 6 | Zone 2 (Medicine) | Q1 Health Care Sciences & Services | Pub Date: 2026-01-28 | DOI: 10.2196/75648
Jörg Matthes, Anne Reinhardt, Selma Hodzic, Jaroslava Kaňková, Alice Binder, Ljubisa Bojic, Helle Terkildsen Maindal, Corina Paraschiv, Knud Ryom
<p><strong>Background: </strong>The rise of generative artificial intelligence (AI) tools such as ChatGPT is rapidly transforming how people access information online. In the health context, generative AI is seen as a potentially disruptive information source due to its low entry barriers, conversational style, and ability to tailor content to users' needs. However, little is known about whether and how individuals use generative AI for health purposes, and which groups may benefit or be left behind, raising important questions of digital health equity.</p><p><strong>Objective: </strong>This study aimed to assess the current relevance of generative AI as a health information source and to identify key factors predicting individuals' intention to use it. We applied the Unified Theory of Acceptance and Use of Technology 2, focusing on 6 core predictors: performance expectancy, effort expectancy, facilitating conditions, social influence, habit, and hedonic motivation. In addition, we extended the model by including health literacy and health status. A cross-national design enabled comparison across 4 European countries.</p><p><strong>Methods: </strong>A representative online survey was conducted in September 2024 with 1990 participants aged 16 to 74 years from Austria (n=502), Denmark (n=507), France (n=498), and Serbia (n=483). Structural equation modeling with metric measurement invariance was used to test associations across countries.</p><p><strong>Results: </strong>Usage of generative AI for health information was still limited: only 39.5% of respondents reported having used it at least rarely. Generative AI ranked last among all measured health information sources (mean 2.08, SD 1.66); instead, medical experts (mean 4.77, SD 1.70) and online search engines (mean 4.57, SD 1.88) are still the most frequently used health information sources. 
Despite this, performance expectancy (b range=0.44-0.53; all P&lt;.001), habit (b range=0.28-0.32; all P&lt;.001), and hedonic motivation (b range=0.22-0.45; all P&lt;.001) consistently predicted behavioral intention in all countries. Facilitating conditions also showed small but significant effects (b range=0.12-0.24; all P&lt;.01). In contrast, effort expectancy, social influence, health literacy, and health status were unrelated to intention in all countries, with one marginal exception (France: health status, b=-0.09; P=.007). Model fit was good (comparative fit index=0.95; root mean square error of approximation=0.03), and metric invariance was confirmed.</p><p><strong>Conclusions: </strong>Generative AI use for health information is currently driven by early adopters: those who find it useful, easy to integrate, and enjoyable, and who have the necessary skills and infrastructure to use it. Cross-national consistency suggests a shared adoption pattern across Europe. To promote equitable adoption, communication efforts should focus on usefulness, convenience, and enjoyment, while ensuring digital access and safeguards for vulnerable users.</p>
Journal of Medical Internet Research, vol. 28, e75648. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12851524/pdf/
Citations: 0
Extended Grammar of Systematized Nomenclature of Medicine - Clinical Terms for Semantic Representation of Clinical Data: Methodological Study.
IF 6 Zone 2 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-28 DOI: 10.2196/80314
Christophe Gaudet-Blavignac, Julien Ehrsam, Monika Baumann, Adel Bensahla, Mirjam Mattei, Yuanyuan Zheng, Christian Lovis

Background: Interoperability has been a challenge for half a century. Led by an informatics view of the world, the quest for interoperability has evolved from typing and categorizing data to building increasingly complex models. In parallel with the development of these models, the field of terminologies and ontologies emerged to refine granularity and introduce notions of hierarchy. Clinical data models and terminology systems vary in purpose, and their fixed categories shape and constrain representation, which inevitably leads to information loss.

Objective: Despite these efforts, semantic interoperability remains imperfect. Achieving it is essential for effective data reuse but requires more than rich terminologies and standardized models. This methodological study explores the extent to which the SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) compositional grammar can be leveraged and extended to approximate a formal descriptive grammar, allowing clinical reality to be expressed in coherent, meaningful sentences rather than preconstrained categories.

Methods: Building on a decade of semantic representation efforts at the Geneva University Hospitals, we developed a framework to identify recurring semantic gaps in clinical data. We addressed these gaps by systematically modifying the SNOMED CT Machine Readable Concept Model and extending its Augmented Backus-Naur Form syntax to support necessary grammatical structures and external vocabularies.

Results: This approach enabled the semantic representation of over 119,000 distinct data elements covering 13 billion instances. By extending the grammar, we successfully addressed critical limitations such as negation, scalar values, uncertainty, temporality, and the integration of external terminologies like Pango. The extensions proved essential for capturing complex clinical nuances that standard precoordinated concepts could not represent.

Conclusions: Rather than creating a new standard from scratch, extending the grammatical capabilities of SNOMED CT offers a viable pathway toward high-fidelity semantic representation. This work serves as a proof-of-concept that separating the rules of composition from vocabulary allows for a more flexible and robust description of clinical reality, provided that challenges regarding governance and machine readability are addressed.
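
For readers unfamiliar with the base syntax being extended, the following minimal Python sketch renders a postcoordinated expression in the standard SNOMED CT compositional grammar form (a focus concept, a colon, then attribute = value refinements). The helper names `concept` and `expression` are hypothetical, and the concept identifiers are given for illustration only and should be checked against a current SNOMED CT release. The sketch does not reproduce the authors' extensions for negation, scalar values, uncertainty, or temporality, whose syntax the abstract does not specify.

```python
def concept(sctid: str, term: str) -> str:
    """Format a concept reference as 'id |term|' (hypothetical helper)."""
    return f"{sctid} |{term}|"


def expression(focus: str, refinements: list[tuple[str, str]]) -> str:
    """Join a focus concept with attribute = value refinements."""
    if not refinements:
        return focus
    refined = ", ".join(f"{attr} = {value}" for attr, value in refinements)
    return f"{focus} : {refined}"


# Illustrative identifiers; verify against a current SNOMED CT release.
expr = expression(
    concept("125605004", "Fracture of bone"),
    [(concept("363698007", "Finding site"),
      concept("71341001", "Bone structure of femur"))],
)
print(expr)
```

Keeping the composition rule (`expression`) separate from the vocabulary (`concept`) mirrors, in a very small way, the separation of grammar from terminology that the Conclusions describe.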

Citations: 0
Machine Learning Techniques Used for the Identification of Sociodemographic Factors Associated With Cancer: Systematic Literature Review.
IF 6 Zone 2 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-28 DOI: 10.2196/79187
Liz González-Infante, Gaston Marquez, Solange Parra-Soto, Mónica Cardona-Valencia, Carla Taramasco

Background: Cancer remains one of the foremost global causes of mortality, with nearly 10 million deaths recorded in 2020. As incidence rates rise, there is growing interest in leveraging machine learning (ML) to enhance prediction, diagnosis, and treatment strategies. Despite these advancements, insufficient attention has been directed toward the integration of sociodemographic variables, which are crucial determinants of health equity, into ML models in oncology.

Objective: This review aims to investigate how ML techniques have been used to identify patterns of predictive association between sociodemographic factors and cancer-related outcomes. Specifically, it seeks to map current research endeavors by detailing the types of algorithms used, the sociodemographic variables examined, and the validation methodologies used.

Methods: We conducted a systematic literature review in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Searches were executed across 6 databases, focusing on primary studies that used ML to investigate the association between sociodemographic characteristics and cancer-related outcomes. The search strategy was informed by the PICO (population, intervention, comparison, and outcome) framework, and a set of predefined inclusion criteria was used to screen the studies. The methodological quality of each included paper was assessed.

Results: Of the 328 records examined, 19 satisfied the inclusion criteria. The majority of studies used supervised ML techniques, with random forest and extreme gradient boosting being the most commonly used. Frequently analyzed variables include age, sex (male, female, or intersex), education level, income, and geographic location. Cross-validation is the predominant method for evaluating model performance. Nevertheless, the integration of clinical and sociodemographic data is limited, and efforts toward external validation are infrequent.

Conclusions: ML holds significant potential for discerning patterns associated with the social determinants of cancer. Nevertheless, research in this domain remains fragmented and inconsistent. Future investigations should prioritize the integration of contextual factors, enhance model transparency, and bolster external validation. These measures are crucial for the development of more equitable, generalizable, and actionable ML applications in cancer care.
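
The Results note that cross-validation is the predominant evaluation method in the included studies. As a minimal sketch of what that involves, the following Python generator splits sample indices into k contiguous folds, each used once as the held-out test set. The function name `k_fold_indices` is hypothetical, and the sketch assumes unshuffled, unstratified folds; the review does not say which variant the individual studies used.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), "train /", len(test_idx), "test")
```

Every sample is held out exactly once, which is why cross-validation estimates internal performance only; it does not replace the external validation that the review finds to be infrequent.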

Citations: 0
Characteristics Influencing Support for the National Health Service COVID-19 App in England and Wales: Findings From a Longitudinal Survey.
IF 6 Zone 2 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-28 DOI: 10.2196/76863
Josephine Exley, Paul Boadu, Kasim Allel, Bob Erens, Nicholas Mays, Mustafa Al-Haboubi

Background: The use of proximity (contact) tracing mobile phone apps to support manual contact tracing during the COVID-19 pandemic was novel. Uptake of the app was lower than expected.

Objective: We sought to identify distinct subgroups of individuals based on their level of support for the National Health Service (NHS) COVID-19 app in the first 15 months of the app's implementation, and to identify the attitudes and characteristics associated with membership of more and less supportive groups.

Methods: We conducted 8 waves of a longitudinal survey of smartphone users recruited from an online panel (n=2023 at baseline and n=1198 at survey wave 6) between October 14, 2020, and December 13, 2021. We used latent class analysis to identify subgroups of individuals with different inclinations of support for the NHS COVID-19 app. Sankey diagram analysis was used to assess individuals whose subgroup changed over the study period. We estimated population-weighted multinomial logistic regression models using sociodemographic characteristics as independent variables.

Results: We identified 4 subgroups in survey waves 1 to 4: "not supportive" (1765/7210, 25%), "ambivalent" (2124/7210, 30%), "somewhat supportive" (1421/7210, 20%), and "completely supportive" (1900/7210, 26%). At wave 5, a total of 3 subgroups of support for the app emerged: "not supportive" (549/1613, 34%), "ambivalent" (497/1613, 31%), and "supportive" (567/1613, 35%). From wave 6 onward, the results showed 4 subgroups emerging: "least supportive" (1568/6952, 23%), "less supportive" (1179/6952, 17%), "ambivalent" (2105/6952, 30%), and "supportive" (2100/6952, 29%). The majority of respondents remained within their identified subgroups between survey waves. Among those who moved into different subgroups, most moved into a less supportive subgroup. Exceptions to this were from waves 2 to 3 and from waves 3 to 4, when higher percentages of respondents moved into more supportive subgroups. The biggest movement to less supportive subgroups occurred after wave 1 (October 2020), when 38% (2740/7210) of respondents moved into a less supportive subgroup. The biggest movement to more supportive subgroups, on the other hand, occurred after wave 2, when 22% (1586/7210) of respondents moved into more supportive subgroups. Over the course of the 8 waves, the percentage of respondents in supportive subgroups declined from 56% (3353/5988) to 29% (1737/5988). Key characteristics of more supportive individuals included having higher levels of trust in the government to control the spread of COVID-19 and having the app installed, while those less concerned about the risk COVID-19 posed to the country were more likely to be unsupportive (P<.05).

Conclusions: When the app was launched, just over half of respondents were supportive, but this declined over the following 15 months. The attrition in support poses a significant challenge for governments' use of apps in future pandemics. One potential reason is distrust of the government's handling of the pandemic.

Citations: 0
Therapeutic Effects of a WeChat Mini-Program on Metabolic Dysfunction-Associated Fatty Liver Disease: Randomized Controlled Trial.
IF 6 Zone 2 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-27 DOI: 10.2196/76204
Chao Sun, Guangyu Chen, Cuicui Shi, Haixia Cao, Ruixu Yang, Jing Zeng, Xiaoyan Duan, Xin Sun, Jian-Gao Fan

Background: For patients with metabolic dysfunction-associated fatty liver disease (MAFLD), weight loss is advised but challenging in practice. In China, there is a pronounced shortage of tailored digital lifestyle interventions for this population.

Objective: This study aimed to assess the effects of a WeChat mini-program-delivered lifestyle intervention on weight loss and hepatic steatosis among individuals with MAFLD who were overweight or obese.

Methods: Adults who were overweight or obese and had MAFLD diagnosed clinically with transient elastography were enrolled in this prospective randomized controlled trial. Patients were randomly assigned at a 1:1 ratio to receive either WeChat mini-program management (intervention group) or standard care (control group). The intervention was structured around the development and implementation of personalized diet and exercise plans, supplemented by guided exercise video courses and reinforced through continuous monitoring and informational support. Body weight and clinical parameters were assessed at baseline and at 6 months.

Results: A total of 89 patients met the inclusion criteria and were randomly assigned to the intervention group (n=45) or control group (n=44). In the intervention group, 60% (27/45) of patients achieved a weight loss of ≥5% and 24.4% (11/45) achieved a weight loss of ≥10%, both higher proportions than in the control group (≥5%: 27/45 vs 7/44; relative risk [RR] 3.771, 95% CI 1.836-7.748; P<.001; ≥10%: 11/45 vs 3/44; RR 3.585, 95% CI 1.072-11.988; P=.02). Importantly, patients receiving the intervention were significantly more likely to achieve a ≥10% reduction or normalization of the controlled attenuation parameter (CAP) than those in the control group (26/45 vs 14/44; RR 1.816, 95% CI 1.102-2.992; P=.01). After adjusting for key baseline covariates, multivariate analysis confirmed the intervention's positive effect on achieving a weight loss of ≥5% (odds ratio [OR] 8.380, 95% CI 2.886-24.331; P<.001) and of ≥10% (OR 4.612, 95% CI 1.138-18.686; P=.03), as well as on CAP reduction of ≥10% or normalization (OR 2.853, 95% CI 1.092-7.456; P=.03). In parallel, the intervention group presented greater reductions in liver enzymes (alanine aminotransferase, aspartate aminotransferase, and γ-glutamyl transpeptidase) and metabolic parameters (fasting insulin, hemoglobin A1c, and triglyceride) than the control group (all P<.05). Among the noninvasive fibrosis indicators, only the FibroScan-aspartate aminotransferase score decreased more in the intervention group than in the control group (median difference -0.06, 95% CI -0.13 to -0.01; P=.02).

Conclusions: Readily scalable in primary care and varied-resource settings, our WeChat mini-program-based intervention extends beyond weight loss to reduce hepatic steatosis and improve metabolic parameters, thereby addressing a critical gap in targeted MAFLD management for China's high-burden population through a low-cost model. However, larger studies are needed to confirm these findings more precisely and to assess long-term sustainability.

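
The relative risks in the Results can be recomputed from the reported counts. The following Python sketch does so for the ≥5% weight-loss outcome (27/45 vs 7/44). The abstract does not state how the confidence intervals were derived; the log-transformation (Katz) method and z=1.96 used here are assumptions, and the function name `relative_risk` is hypothetical.

```python
import math


def relative_risk(events_a: int, n_a: int, events_b: int, n_b: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Return (RR, lower, upper), with the CI from the log (Katz) method."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Weight loss of >=5%: 27/45 (intervention) vs 7/44 (control).
rr, ci_low, ci_high = relative_risk(27, 45, 7, 44)
print(f"RR {rr:.3f}, 95% CI {ci_low:.3f}-{ci_high:.3f}")
```

Under these assumptions the result should land very close to the reported RR of 3.771 (95% CI 1.836-7.748); the same call with 11/45 vs 3/44 or 26/45 vs 14/44 can be used to check the other two comparisons.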
Citations: 0
The Phases of Living Evidence Synthesis Using AI: Living Evidence Synthesis (Version 1).
IF 6 Zone 2 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-27 DOI: 10.2196/76130
Xuping Song, Zhenjie Lian, Rui Wang, Ruixin Li, Zhenzhen Yang, Xufei Luo, Lei Feng, Zhiming Ma, Zhen Pu, Qi Wang, Long Ge, Caihong Li, Yaolong Chen, Kehu Yang, John Lavis

Background: Living evidence (LE) synthesis refers to the method of continuously updating systematic evidence reviews to incorporate new evidence. It has emerged to address the limitations of the traditional systematic review process, particularly the absence of or delays in publication updates. The emergence of COVID-19 accelerated progress in the field of LE synthesis, and the applications of artificial intelligence (AI) in LE synthesis are now expanding rapidly. However, the phases of LE synthesis in which AI should be used remain an unanswered question.

Objective: This study aims to (1) document the phases of LE synthesis where AI is used and (2) investigate whether AI improves the efficiency, accuracy, or utility of LE synthesis.

Methods: We searched Web of Science, PubMed, the Cochrane Library, Epistemonikos, the Campbell Library, IEEE Xplore, medRxiv, COVID-19 Evidence Network to support Decision-making, and McMaster Health Forum. We used Covidence to facilitate the monthly screening and extraction processes to maintain the LE synthesis process. Studies that used or developed AI or semiautomated tools in the phases of LE synthesis were included.

Results: A total of 24 studies were included: 17 on LE syntheses (4 involving tool development) and 7 on living meta-analyses (3 involving tool development). First, 34 AI or semiautomated tools were involved, comprising 12 AI tools and 22 semiautomated tools. The most frequently used were machine learning classifiers (n=5) and the Living Interactive Evidence synthesis platform (n=3). Second, 20 AI or semiautomated tools were used for the data extraction or collection and risk of bias assessment phase, and only 1 AI tool was used for the publication update phase. Third, 3 studies demonstrated improved efficiency based on time, workload, and conflict rate metrics. Nine studies applied AI or semiautomated tools in LE synthesis, obtaining a mean recall rate of 96.24%, and 6 studies achieved a mean F1-score of 92.17%. Additionally, 8 studies reported precision values ranging from 0.2% to 100%.

Conclusions: AI and semiautomated tools primarily facilitate data extraction or collection and risk of bias assessment. The use of AI or semiautomated tools in LE synthesis improves efficiency, leading to high accuracy, recall, and F1-scores, while precision varies across tools.
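
The abstract reports high mean recall and F1-scores alongside precision values that range from 0.2% to 100%. As a minimal sketch of how these three metrics relate, the following Python function computes them from confusion counts. The function name `screening_metrics` and all counts are hypothetical illustrations and are not taken from any included study.

```python
def screening_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute precision, recall, and F1 from confusion counts.

    tp: relevant records the tool flagged (true positives)
    fp: irrelevant records the tool flagged (false positives)
    fn: relevant records the tool missed (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
# A broad classifier: misses few relevant records but flags many irrelevant ones.
broad = screening_metrics(tp=96, fp=900, fn=4)
# A balanced classifier: similar numbers of false positives and false negatives.
balanced = screening_metrics(tp=92, fp=8, fn=8)

for name, m in (("broad", broad), ("balanced", balanced)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Under these assumed counts, the broad classifier keeps recall at 0.96 while its precision falls below 0.1, which is one way a tool can show high recall and low precision at the same time.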

Citations: 0
Products, Performance, and Technological Development of Ambulatory Oxygen Therapy Devices: Scoping Review.
IF 6 Zone 2 Medicine Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2026-01-27 DOI: 10.2196/81077
Shohei Kawachi, Mariana Hoffman, Lorena Romero, Magnus Ekström, Jerry A Krishnan, Anne E Holland

Background: Ambulatory oxygen therapy is prescribed for patients with chronic lung diseases who experience exertional hypoxemia. However, available devices may not adequately meet user requirements, and their performance characteristics are heterogeneous.

Objective: This study aims to identify devices available for delivery of ambulatory oxygen therapy, the technologies that they use to generate oxygen, the performance characteristics of each device, and the development status.

Methods: We searched medical and engineering databases (eg, MEDLINE, IEEE) to identify peer-reviewed papers. Gray literature was used to identify additional descriptions of ambulatory oxygen devices in military medicine, space exploration, or patents. The last search was conducted in September 2025. Documents were included if they were written in English and described a device that can deliver oxygen in an ambulatory context (defined as weighing less than 10 kg). Search results were screened for inclusion by 2 independent reviewers. Data were synthesized by descriptively mapping the performance of each product, the technology used, and the development status of emerging technologies.

Results: From 9702 records identified, a total of 166 met eligibility criteria (106 scientific publications and 60 gray literature). We identified 33 portable oxygen concentrators (POCs; 29 commercially available), 10 oxygen cylinders, and 6 portable liquid oxygen (LOX) devices. The POC products showed a trade-off between portability and oxygen delivery capacity (maximum flow rate ranging from 2.0 to 6.0 L/min; device weight ranging from 1.0 to 9.1 kg). Pressure swing adsorption with zeolite was the most common oxygen generation technology in POCs on the market. The mean maximum continuous operating time of POCs was 3.8 hours. Two prototype POCs (maximum flow rate of 4-6 L/min and device weight of 8-9 kg) were developed for space exploration using modified adsorbents. LOX devices were the lightest and had the longest continuous operating time. Innovations in delivery included downsizing a POC by using nanozeolite as an adsorbent and automatic titration of oxygen delivery targeted to the user's pulse oximeter oxygen saturation (SpO₂).

Conclusions: This scoping review is the first study to integrate medical, engineering, and gray literature on ambulatory oxygen devices and their development. Although prior literature has narratively explained the products and technologies, no previous research has systematically investigated them. This review showed that POCs available to consumers may not meet the needs of patients in terms of flow rate, portability, and operating time. LOX devices offered superior performance but are limited by high costs. Limitations of this review include the difficulty of comparing product performance across oxygen delivery settings and the fact that records were drawn mainly from English-language sources. In conclusion, innovation in ambulatory oxygen technology has been limited over the past decade. Research and development of new lightweight devices with greater oxygen delivery capacity are urgently needed.

Trial Registration: Open Science Framework; https://osf.io/qs7fx
Citations: 0