
JMIR Medical Informatics: Latest Articles

Distributed Statistical Analyses: A Scoping Review and Examples of Operational Frameworks Adapted to Health Analytics.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-07-19 | DOI: 10.2196/53622
Félix Camirand Lemyre, Simon Lévesque, Marie-Pier Domingue, Klaus Herrmann, Jean-François Ethier

Background: Data from multiple organizations are crucial for advancing learning health systems. However, ethical, legal, and social concerns may restrict the use of standard statistical methods that rely on pooling data. Although distributed algorithms offer alternatives, they may not always be suitable for health frameworks.

Objective: This paper aims to support researchers and data custodians in three ways: (1) providing a concise overview of the literature on statistical inference methods for horizontally partitioned data; (2) describing the methods applicable to generalized linear models (GLM) and assessing their underlying distributional assumptions; (3) adapting existing methods to make them fully usable in health settings.

Methods: A scoping review methodology was employed for the literature mapping, from which methods presenting a methodological framework for GLM analyses with horizontally partitioned data were identified and assessed from the perspective of applicability in health settings. Statistical theory was used to adapt methods and to derive the properties of the resulting estimators.

Results: From the review, 41 articles were selected, and six approaches were extracted for conducting standard GLM-based statistical analysis. However, these approaches assumed evenly and identically distributed data across nodes. Consequently, statistical procedures were derived to accommodate uneven node sample sizes and heterogeneous data distributions across nodes. Workflows and detailed algorithms were developed to highlight information-sharing requirements and operational complexity.

Conclusions: This paper contributes to the field of health analytics by providing an overview of the methods that can be used with horizontally partitioned data, by adapting these methods to the context of heterogeneous health data and by clarifying the workflows and quantities exchanged by the methods discussed. Further analysis of the confidentiality preserved by these methods is needed to fully understand the risk associated with the sharing of summary statistics.
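To make "horizontally partitioned data" concrete, here is a minimal sketch of one generic way a GLM can be fitted without pooling rows: each node shares only its gradient and information matrix, and a coordinator runs Newton-Raphson on their sums. This is a textbook construction offered as an illustration, not necessarily one of the six approaches extracted in the review, and it does not include the adaptations for heterogeneous nodes. All function names and the toy data are assumptions.

```python
import math

def local_summaries(X, y, beta):
    """One node's score vector and information matrix for logistic regression."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        w = mu * (1.0 - mu)
        for a in range(p):
            grad[a] += (yi - mu) * xi[a]
            for b in range(p):
                info[a][b] += w * xi[a] * xi[b]
    return grad, info

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_distributed(nodes, n_params, n_iter=25):
    """Coordinator loop: only summed gradients and information matrices are used."""
    beta = [0.0] * n_params
    for _ in range(n_iter):
        grad = [0.0] * n_params
        info = [[0.0] * n_params for _ in range(n_params)]
        for X, y in nodes:
            g, h = local_summaries(X, y, beta)
            for a in range(n_params):
                grad[a] += g[a]
                for b in range(n_params):
                    info[a][b] += h[a][b]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Toy data (made up): two nodes of unequal size, columns = intercept, covariate.
node_a = ([[1, 0.5], [1, 1.5], [1, 2.5], [1, 3.5]], [0, 0, 1, 0])
node_b = ([[1, 1.0], [1, 2.0], [1, 3.0], [1, 4.0], [1, 5.0], [1, 0.2]],
          [0, 1, 1, 1, 1, 1])
pooled = (node_a[0] + node_b[0], node_a[1] + node_b[1])

beta_distributed = fit_distributed([node_a, node_b], n_params=2)
beta_pooled = fit_distributed([pooled], n_params=2)
```

Under these assumptions the distributed and pooled fits coincide up to floating-point error, because the pooled gradient and information matrix are exactly the sums of the per-node quantities. Whether sharing those summaries is acceptable from a confidentiality standpoint is the open question the abstract raises.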

Citations: 0
The Information and Communication Technology Maturity Assessment at Primary Health Care Services Across 9 Provinces in Indonesia: Evaluation Study.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-07-18 | DOI: 10.2196/55959
Dewi Nur Aisyah, Agus Heri Setiawan, Alfiano Fawwaz Lokopessy, Nadia Faradiba, Setiaji Setiaji, Logan Manikam, Zisis Kozlakidis

Background: Indonesia has rapidly embraced digital health, particularly during the COVID-19 pandemic, with over 15 million daily health application users. To advance its digital health vision, the government is prioritizing the development of health data and application systems into an integrated health care technology ecosystem. This initiative involves all levels of health care, from primary to tertiary, across all provinces. In particular, it aims to enhance primary health care services (as the main interface with the general population) and contribute to Indonesia's digital health transformation.

Objective: This study assesses the information and communication technology (ICT) maturity in Indonesian health care services to advance digital health initiatives. ICT maturity assessment tools, specifically designed for middle-income countries, were used to evaluate digital health capabilities in 9 provinces across 5 Indonesian islands.

Methods: A cross-sectional survey was conducted from February to March 2022, in 9 provinces across Indonesia, representing the country's diverse conditions on its major islands. Respondents included staff from public health centers (Puskesmas), primary care clinics (Klinik Pratama), and district health offices (Dinas Kesehatan Kabupaten/Kota). The survey used adapted ICT maturity assessment questionnaires, covering human resources, software and system, hardware, and infrastructure. It was administered electronically and involved 121 public health centers, 49 primary care clinics, and 67 IT staff from district health offices. Focus group discussions were held to delve deeper into the assessment results and gain more descriptive insights.

Results: In this study, 237 participants represented 3 distinct categories: 121 public health centers, 67 district health offices, and 49 primary clinics. These instances were selected from a sample of 9 of the 34 provinces in Indonesia. Collected data from interviews and focus group discussions were transformed into scores on a scale of 1 to 5, with 1 indicating low ICT readiness and 5 indicating high ICT readiness. On average, the breakdown of ICT maturity scores was as follows: 2.71 for human resources' capability in ICT use and system management, 2.83 for software and information systems, 2.59 for hardware, and 2.84 for infrastructure, resulting in an overall average score of 2.74. According to the ICT maturity level pyramid, the ICT maturity of health care providers in Indonesia fell between the basic and good levels. The need to pursue best practices also emerged strongly. Further analysis of the ICT maturity scores, when examined by province, revealed regional variations.
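As a quick arithmetic check, the reported overall score is consistent with an unweighted mean of the four domain averages. The sketch below assumes that aggregation rule (the abstract does not state it explicitly); the variable names are illustrative.

```python
# Domain averages as reported in the Results paragraph.
domain_scores = {
    "human resources": 2.71,
    "software and information systems": 2.83,
    "hardware": 2.59,
    "infrastructure": 2.84,
}

# Assumption: the overall score is the unweighted mean of the four domains.
overall = sum(domain_scores.values()) / len(domain_scores)
weakest_domain = min(domain_scores, key=domain_scores.get)

print(round(overall, 2), weakest_domain)
```

The mean is 10.97 / 4 = 2.7425, which rounds to the reported 2.74, and hardware is the lowest-scoring domain.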

Conclusions: The maturity of ICT use is influenced by several critical components. Enhancing human resources, ensuring infrastructure and the availability of supportive hardware, and optimizing information systems are imperative to attain ICT maturity in health care services. In the ICT maturity assessment, scores varied considerably across health care levels in the 9 provinces, highlighting the diversity in ICT readiness and the need for regionally tailored follow-up actions.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11269960/pdf/
Citations: 0
Data Lake, Data Warehouse, Datamart, and Feature Store: Their Contributions to the Complete Data Reuse Pipeline
IF 3.2 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-07-17 | DOI: 10.2196/54590
Antoine Lamer, Chloé Saint-Dizier, Nicolas Paris, Emmanuel Chazard
The growing adoption and utilization of health information technology has generated a wealth of clinical data in electronic format, offering opportunities for data reuse beyond direct patient care. However, as data are distributed across multiple software systems, it becomes challenging to cross-reference information between sources due to differences in formats, vocabularies, and technologies, and the absence of common identifiers across systems. To address these challenges, hospitals have adopted data warehouses to consolidate and standardize these data for research. Additionally, as a complement or alternative, data lakes store both source data and metadata in a detailed and unprocessed format, empowering exploration, manipulation, and adaptation of the data to meet specific analytical needs. Subsequently, datamarts are utilized to further refine data into usable information tailored to specific research questions. However, for efficient analysis, a feature store is essential to pivot and denormalize the data, simplifying queries. In conclusion, while data warehouses are crucial, data lakes, datamarts, and feature stores play essential and complementary roles in facilitating data reuse for research and analysis in health care.
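The "pivot and denormalize" step attributed to the feature store can be sketched as turning long (patient, feature, value) rows into one wide record per patient. This is a minimal illustration using only the standard library; the feature names, the row layout, and the function `pivot_features` are hypothetical and not taken from the article.

```python
from collections import defaultdict

# Hypothetical long-format rows, as they might come out of a datamart.
long_rows = [
    {"patient_id": 1, "feature": "age", "value": 64},
    {"patient_id": 1, "feature": "max_creatinine", "value": 1.4},
    {"patient_id": 2, "feature": "age", "value": 51},
    {"patient_id": 2, "feature": "had_surgery", "value": 1},
]

def pivot_features(rows):
    """Turn (patient, feature, value) rows into one wide record per patient."""
    feature_names = sorted({r["feature"] for r in rows})
    # Every patient gets every column; features never observed stay None.
    wide = defaultdict(lambda: dict.fromkeys(feature_names))
    for r in rows:
        wide[r["patient_id"]][r["feature"]] = r["value"]
    return [{"patient_id": pid, **feats} for pid, feats in sorted(wide.items())]

feature_table = pivot_features(long_rows)
for record in feature_table:
    print(record)
```

Once the data are in this one-row-per-patient shape, a statistical model or a simple filter no longer needs joins, which is the query simplification the text refers to.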
Citations: 0
Evaluating Large Language Models for Automated Reporting and Data Systems Categorization: Cross-Sectional Study.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-07-17 | DOI: 10.2196/55799
Qingxia Wu, Qingxia Wu, Huali Li, Yan Wang, Yan Bai, Yaping Wu, Xuan Yu, Xiaodong Li, Pei Dong, Jon Xue, Dinggang Shen, Meiyun Wang

Background: Large language models show promise for improving radiology workflows, but their performance on structured radiological tasks such as Reporting and Data Systems (RADS) categorization remains unexplored.

Objective: This study aims to evaluate 3 large language model chatbots (Claude-2, GPT-3.5, and GPT-4) on assigning RADS categories to radiology reports and to assess the impact of different prompting strategies.

Methods: This cross-sectional study compared 3 chatbots using 30 radiology reports (10 per RADS criteria), using a 3-level prompting strategy: zero-shot, few-shot, and guideline PDF-informed prompts. The cases were grounded in Liver Imaging Reporting & Data System (LI-RADS) version 2018, Lung CT (computed tomography) Screening Reporting & Data System (Lung-RADS) version 2022, and Ovarian-Adnexal Reporting & Data System (O-RADS) magnetic resonance imaging, meticulously prepared by board-certified radiologists. Each report underwent 6 assessments. Two blinded reviewers assessed the chatbots' response at patient-level RADS categorization and overall ratings. The agreement across repetitions was assessed using Fleiss κ.

Results: Claude-2 achieved the highest accuracy in overall ratings with few-shot prompts and guideline PDFs (prompt-2), attaining 57% (17/30) average accuracy over 6 runs and 50% (15/30) accuracy with k-pass voting. Without prompt engineering, all chatbots performed poorly. The introduction of a structured exemplar prompt (prompt-1) increased the accuracy of overall ratings for all chatbots. Providing prompt-2 further improved Claude-2's performance, an enhancement not replicated by GPT-4. The interrun agreement was substantial for Claude-2 (κ=0.66 for overall rating and κ=0.69 for RADS categorization), fair for GPT-4 (κ=0.39 for both), and fair for GPT-3.5 (κ=0.21 for overall rating and κ=0.39 for RADS categorization). All chatbots showed significantly higher accuracy with LI-RADS version 2018 than with Lung-RADS version 2022 and O-RADS (P<.05); with prompt-2, Claude-2 achieved the highest overall rating accuracy of 75% (45/60) in LI-RADS version 2018.

Conclusions: When equipped with structured prompts and guideline PDFs, Claude-2 demonstrated potential in assigning RADS categories to radiology cases according to established criteria such as LI-RADS version 2018. However, the current generation of chatbots lags in accurately categorizing cases based on more recent RADS criteria.
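The agreement statistic named in the Methods, Fleiss κ, can be computed from repeated runs by treating each run as a "rater" and each report as a subject. The sketch below is a plain implementation of the standard formula; the helper `majority_vote` is a simple plurality rule that may differ from the study's k-pass voting, and the example labels are made up.

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss kappa for a list of subjects, each a list of labels of equal length."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    agreement_sum = 0.0
    for labels in ratings:
        counts = Counter(labels)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this subject.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement_sum / n_subjects
    # Chance agreement from the overall category proportions.
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())
    # Note: undefined (division by zero) if every rating falls in one category.
    return (p_bar - p_e) / (1 - p_e)

def majority_vote(labels):
    """Most frequent label across repeated runs (ties go to the first seen)."""
    return Counter(labels).most_common(1)[0][0]

# Made-up example: 3 reports, 6 repeated runs each.
runs = [
    ["LR-4", "LR-4", "LR-4", "LR-5", "LR-4", "LR-4"],
    ["LR-3", "LR-3", "LR-3", "LR-3", "LR-3", "LR-3"],
    ["LR-5", "LR-4", "LR-5", "LR-5", "LR-3", "LR-5"],
]
kappa = fleiss_kappa(runs)
voted = [majority_vote(r) for r in runs]
print(kappa, voted)
```

Values near 1 indicate that repeated runs almost always return the same category, while values near 0 indicate agreement no better than chance, which is how the "substantial" and "fair" labels in the Results should be read.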

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11292156/pdf/
Citations: 0
Diagnostic Accuracy of Artificial Intelligence in Endoscopy: Umbrella Review
IF 3.2 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-07-15 | DOI: 10.2196/56361
Bowen Zha, Angshu Cai, Guiqi Wang
Background: Several studies have reported the diagnostic value of artificial intelligence (AI) for different endoscopy outcomes. However, the evidence is inconsistent and of varying quality.

Objective: This study aimed to comprehensively evaluate the credibility of the evidence on the diagnostic accuracy of AI in endoscopy.

Methods: Before the study began, the protocol was registered in the International Prospective Register of Systematic Reviews (CRD42023483073). First, 2 researchers searched PubMed, Web of Science, Embase, and the Cochrane Library using comprehensive search terms, covering records up to November 2023. The researchers then screened the studies and extracted information. A Measurement Tool to Assess Systematic Reviews 2 (AMSTAR2) was used to evaluate the quality of each article. When several studies addressed the same outcome, the one with the higher quality rating was chosen for further analysis. To ensure the reliability of the conclusions, each outcome was recalculated. Finally, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach was used to evaluate the credibility of each outcome.

Results: A total of 21 studies were included for analysis. According to AMSTAR2, 8 studies were of moderate methodological quality, and the remaining studies were rated low or critically low. The sensitivity and specificity of 17 different outcomes were analyzed: 4 outcomes each for the esophagus, stomach, and colorectum; 2 each for capsule endoscopy and laryngoscopy; and 1 for endoscopic ultrasonography. Sensitivity was highest for gastroesophageal reflux disease (97%) and lowest for the invasion depth of colon neoplasia (71%). Specificity was highest for colorectal cancer (98%) and lowest for gastrointestinal stromal tumor (80%). The GRADE evaluation rated the credibility of most outcomes as low or very low.

Conclusions: AI shows diagnostic value in endoscopy, especially for esophageal and colorectal diseases. These findings provide a theoretical basis for developing and evaluating AI-assisted systems that help endoscopists carry out examinations and thereby improve human health. However, further high-quality research is needed.
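For readers less familiar with the two accuracy measures reported above, the sketch below shows how sensitivity and specificity are derived from a 2x2 confusion table. The counts are made up purely to reproduce percentages of the same size as those quoted; they are not data from the review.

```python
def sensitivity(true_pos, false_neg):
    """Share of diseased cases the test flags: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of non-diseased cases the test clears: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for illustration only.
sens = sensitivity(true_pos=97, false_neg=3)
spec = specificity(true_neg=80, false_pos=20)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A high sensitivity with a lower specificity, as in this made-up example, means few missed lesions at the cost of more false alarms, which is why the two figures are reported separately for each outcome.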
Citations: 0
Integrating Clinical Data and Medical Imaging in Lung Cancer: Feasibility Study Using the Observational Medical Outcomes Partnership Common Data Model Extension.
IF 3.1 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2024-07-12 DOI: 10.2196/59187
Hyerim Ji, Seok Kim, Leonard Sunwoo, Sowon Jang, Ho-Young Lee, Sooyoung Yoo

Background: Digital transformation, particularly the integration of medical imaging with clinical data, is vital in personalized medicine. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) standardizes health data. However, integrating medical imaging remains a challenge.

Objective: This study proposes a method for combining medical imaging data with the OMOP CDM to improve multimodal research.

Methods: Our approach included the analysis and selection of Digital Imaging and Communications in Medicine (DICOM) header tags, validation of data formats, and alignment according to the OMOP CDM framework. The Fast Healthcare Interoperability Resources (FHIR) ImagingStudy profile guided our consistency in column naming and definitions. The Imaging Common Data Model (I-CDM), constructed using the entity-attribute-value model, facilitates scalable and efficient medical imaging data management. For patients with lung cancer diagnosed between 2010 and 2017, we introduced 4 new tables (IMAGING_STUDY, IMAGING_SERIES, IMAGING_ANNOTATION, and FILEPATH) to standardize various imaging-related data and link them to clinical data.
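
To make the entity-attribute-value idea in the Methods concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the I-CDM schema: the `imaging_attribute` table, every column name, and the sample rows are hypothetical and serve only to show how per-study attributes can be stored as rows rather than columns and joined back to a person identifier.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical EAV layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per imaging study, keyed to a person (hypothetical columns).
        CREATE TABLE imaging_study (
            imaging_study_id INTEGER PRIMARY KEY,
            person_id        INTEGER NOT NULL,
            study_date       TEXT
        );
        -- Entity-attribute-value rows: one row per (study, attribute) pair.
        CREATE TABLE imaging_attribute (
            imaging_study_id INTEGER REFERENCES imaging_study(imaging_study_id),
            attribute        TEXT NOT NULL,
            value            TEXT
        );
        """
    )
    conn.execute("INSERT INTO imaging_study VALUES (1, 1001, '2015-03-02')")
    conn.executemany(
        "INSERT INTO imaging_attribute VALUES (?, ?, ?)",
        [(1, "Modality", "CT"), (1, "SliceThickness", "1.25")],
    )
    return conn


def attributes_for_person(conn, person_id):
    """Return all imaging attributes recorded for one person as a dict."""
    rows = conn.execute(
        """
        SELECT a.attribute, a.value
        FROM imaging_study AS s
        JOIN imaging_attribute AS a
          ON a.imaging_study_id = s.imaging_study_id
        WHERE s.person_id = ?
        ORDER BY a.attribute
        """,
        (person_id,),
    ).fetchall()
    return dict(rows)
```

In this layout, a new attribute is added by inserting rows rather than altering the table, which is the usual argument for the scalability of an entity-attribute-value design.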

Results: This framework underscores the effectiveness of I-CDM in enhancing our understanding of lung cancer diagnostics and treatment strategies. The implementation of the I-CDM tables enabled the structured organization of a comprehensive data set, including 282,098 IMAGING_STUDY, 5,674,425 IMAGING_SERIES, and 48,536 IMAGING_ANNOTATION records, illustrating the extensive scope and depth of the approach. A scenario-based analysis using actual data from patients with lung cancer underscored the feasibility of our approach. A data quality check applying 44 specific rules confirmed the high integrity of the constructed data set, with all checks successfully passed, underscoring the reliability of our findings.

Conclusions: These findings indicate that I-CDM can improve the integration and analysis of medical imaging and clinical data. By addressing the challenges in data standardization and management, our approach contributes toward enhancing diagnostics and treatment strategies. Future research should expand the application of I-CDM to diverse disease populations and explore its wide-ranging utility for medical conditions.

Citations: 0
Time Series AI Model for Acute Kidney Injury Detection Based on a Multicenter Distributed Research Network: Development and Verification Study
IF 3.2 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2024-07-05 DOI: 10.2196/47693
Suncheol Heo, Eun-Ae Kang, Jae Yong Yu, Hae Reong Kim, Suehyun Lee, Kwangsoo Kim, Yul Hwangbo, Rae Woong Park, Hyunah Shin, Kyeongmin Ryu, Chungsoo Kim, Hyojung Jung, Yebin Chegal, Jae-Hyun Lee, Yu Rang Park

Background: Acute kidney injury (AKI) is a marker of clinical deterioration and renal toxicity. While many studies offer prediction models for the early detection of AKI, those predicting AKI occurrence using distributed research network (DRN)-based time series data are rare.

Objective: In this study, we aimed to detect the early occurrence of AKI by applying an interpretable long short-term memory (LSTM)-based model to hospital electronic health record (EHR)-based time series from patients who took nephrotoxic drugs, using a DRN.

Methods: We conducted a multi-institutional retrospective cohort study of data from six hospitals using a DRN. For each institution, a patient-based dataset was constructed for five nephrotoxic drugs, and the interpretable multi-variable long short-term memory (IMV-LSTM) model was used for training. This study employed propensity score matching to mitigate differences in demographics and clinical characteristics. Additionally, the temporal attention values of the AKI prediction model's contribution variables were demonstrated for each institution and drug, with differences in highly important feature distributions between the case and control data confirmed using one-way analysis of variance.

Results: This study analyzed 8,643 and 31,012 patients with and without AKI, respectively, across six hospitals. When analyzing the distribution of AKI onset, vancomycin showed an earlier onset (median: 12 days), and acyclovir had the latest onset compared to the other drugs (median: 23 days). Our temporal deep learning model for AKI prediction performed well for most drugs. Acyclovir had the highest average area under the receiver operating characteristic curve score per drug (0.94), followed by acetaminophen (0.93), vancomycin (0.92), naproxen (0.90), and celecoxib (0.89). Based on the temporal attention values of the variables in the AKI prediction model, lymphocytes and calcium were verified to have the highest attention, whereas lymphocytes, albumin, and hemoglobin tended to decrease over time, and urine pH and prothrombin time tended to increase.

Conclusions: Early surveillance of AKI occurrence can be achieved by applying the IMV-LSTM to time series data through hospital EHR-based DRNs. This approach can help identify risk factors and enable early detection of adverse drug reactions when prescribing drugs that cause renal toxicity before AKI occurs.
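
The abstract does not say which matching algorithm was used for propensity score matching. As one common variant, the sketch below shows greedy 1:1 nearest-neighbor matching with a caliper on scores that are assumed to have been estimated already. The function name `greedy_match`, the caliper value, and the example identifiers are all hypothetical.

```python
def greedy_match(cases, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    cases, controls: dicts mapping a patient id to a propensity score.
    Returns a list of (case_id, control_id) pairs; a case with no unused
    control within the caliper is left unmatched.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    # Process cases in ascending score order so the result is deterministic.
    for case_id, score in sorted(cases.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        best = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[best] - score) <= caliper:
            pairs.append((case_id, best))
            del available[best]  # matching without replacement
    return pairs
```

Greedy matching is simple but order-dependent. Optimal matching or matching with replacement are alternatives with different bias and variance trade-offs.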
Citations: 0
Privacy-Preserving Prediction of Postoperative Mortality in Multi-Institutional Data: Development and Usability Study.
IF 3.1 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2024-07-05 DOI: 10.2196/56893
Jungyo Suh, Garam Lee, Jung Woo Kim, Junbum Shin, Yi-Jun Kim, Sang-Wook Lee, Sulgi Kim

Background: To circumvent regulatory barriers that limit medical data exchange due to personal information security concerns, we use homomorphic encryption (HE) technology, enabling computation on encrypted data and enhancing privacy.

Objective: This study explores whether using HE to integrate encrypted multi-institutional data enhances predictive power in research, focusing on the integration feasibility across institutions and determining the optimal size of hospital data sets for improved prediction models.

Methods: We used data from 341,007 individuals aged 18 years and older who underwent noncardiac surgeries across 3 medical institutions. The study focused on predicting in-hospital mortality within 30 days postoperatively, using secure logistic regression based on HE as the prediction model. We compared the predictive performance of this model using plaintext data from a single institution against a model using encrypted data from multiple institutions.

Results: The predictive model using encrypted data from all 3 institutions exhibited the best performance based on area under the receiver operating characteristic curve (0.941); the model combining Asan Medical Center (AMC) and Seoul National University Hospital (SNUH) data exhibited the best predictive performance based on area under the precision-recall curve (0.132). Both Ewha Womans University Medical Center and SNUH demonstrated improvement in predictive power for their own institutions upon their respective data's addition to the AMC data.
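
The two metrics reported above can differ widely when the outcome is rare, because the area under the precision-recall curve is sensitive to class prevalence while the area under the receiver operating characteristic curve is not. The abstract does not report the mortality rate, so the following is only a toy illustration on made-up data. It uses plain-Python implementations of the pairwise definition of AUROC and of average precision, one common estimator of the area under the precision-recall curve.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


# Toy data: 100 patients, a single positive ranked 5th from the top.
toy_scores = [i / 100 for i in range(100)]
toy_labels = [1 if i == 95 else 0 for i in range(100)]
```

On this toy data the single positive outranks 95 of 99 negatives, so the AUROC is 95/99 (about 0.96), yet the average precision is only 1/5 because four negatives score higher than the positive.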

Conclusions: Prediction models using multi-institutional data sets processed with HE outperformed those using single-institution data sets, especially when our model adaptation approach was applied, which was further validated on a smaller host hospital with a limited data set.

Citations: 0
Is Boundary Annotation Necessary? Evaluating Boundary-Free Approaches to Improve Clinical Named Entity Annotation Efficiency: Case Study.
IF 3.1 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2024-07-02 DOI: 10.2196/59680
Gabriel Herman Bernardim Andrade, Shuntaro Yada, Eiji Aramaki

Background: Named entity recognition (NER) is a fundamental task in natural language processing. However, it is typically preceded by named entity annotation, which poses several challenges, especially in the clinical domain. For instance, determining entity boundaries is one of the most common sources of disagreements between annotators due to questions such as whether modifiers or peripheral words should be annotated. If unresolved, these can induce inconsistency in the produced corpora, yet, on the other hand, strict guidelines or adjudication sessions can further prolong an already slow and convoluted process.

Objective: The aim of this study is to address these challenges by evaluating 2 novel annotation methodologies, lenient span and point annotation, aiming to mitigate the difficulty of precisely determining entity boundaries.

Methods: We evaluate their effects through an annotation case study on a Japanese medical case report data set. We compare annotation time, annotator agreement, and the quality of the produced labeling and assess the impact on the performance of an NER system trained on the annotated corpus.
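
The abstract does not define how lenient span or point annotations are scored. One plausible reading, stated here purely as an assumption, is that strict matching requires identical boundaries, lenient matching accepts any overlap, and a point annotation is a single character offset that must fall inside the entity. The sketch below compares a simple agreement rate under these criteria. All function names and example offsets are hypothetical.

```python
def strict_match(a, b):
    """Spans agree only if start and end offsets are identical."""
    return a == b


def lenient_match(a, b):
    """Half-open spans (start, end) agree if they share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def point_match(point, span):
    """A point annotation agrees with a span if it falls inside it."""
    return span[0] <= point < span[1]


def agreement(spans_a, spans_b, match):
    """Fraction of annotator A's spans matched by some span from annotator B."""
    if not spans_a:
        return 0.0
    matched = sum(any(match(a, b) for b in spans_b) for a in spans_a)
    return matched / len(spans_a)


# Annotator A marks "severe chest pain" (0, 17); B marks only "chest pain" (7, 17).
# Both agree exactly on a second entity at (30, 38).
annotator_a = [(0, 17), (30, 38)]
annotator_b = [(7, 17), (30, 38)]
```

Under the strict criterion the disagreement about the modifier "severe" halves the agreement rate, while the overlap criterion treats both entities as agreed. This is the kind of boundary dispute the Background describes.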

Results: We saw significant improvements in the labeling process efficiency, with up to a 25% reduction in overall annotation time and even a 10% improvement in annotator agreement compared to the traditional boundary-strict approach. However, even the best-achieved NER model presented some drop in performance compared to the traditional annotation methodology.

Conclusions: Our findings demonstrate a balance between annotation speed and model performance. Although disregarding boundary information affects model performance to some extent, this is counterbalanced by significant reductions in the annotator's workload and notable improvements in the speed of the annotation process. These benefits may prove valuable in various applications, offering an attractive compromise for developers and researchers.

Citations: 0
Considerations for Quality Control Monitoring of Machine Learning Models in Clinical Practice.
IF 3.1 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2024-06-28 DOI: 10.2196/50437
Louis Faust, Patrick Wilson, Shusaku Asai, Sunyang Fu, Hongfang Liu, Xiaoyang Ruan, Curt Storlie

Integrating machine learning (ML) models into clinical practice presents a challenge of maintaining their efficacy over time. While existing literature offers valuable strategies for detecting declining model performance, there is a need to document the broader challenges and solutions associated with the real-world development and integration of model monitoring solutions. This work details the development and use of a platform for monitoring the performance of a production-level ML model operating at Mayo Clinic. In this paper, we aimed to provide a series of considerations and guidelines necessary for integrating such a platform into a team's technical infrastructure and workflow. We have documented our experiences with this integration process, discussed the broader challenges encountered with real-world implementation and maintenance, and included the source code for the platform. Our monitoring platform was built as an R Shiny application, developed and implemented over the course of 6 months. The platform has been used and maintained for 2 years and is still in use as of July 2023. The considerations necessary for the implementation of the monitoring platform center around 4 pillars: feasibility (what resources can be used for platform development?); design (through what statistics or models will the model be monitored, and how will these results be efficiently displayed to the end user?); implementation (how will this platform be built, and where will it exist within the IT ecosystem?); and policy (based on monitoring feedback, when and what actions will be taken to fix problems, and how will these problems be translated to clinical staff?). While much of the literature surrounding ML performance monitoring emphasizes methodological approaches for capturing changes in performance, there remains a battery of other challenges and considerations that must be addressed for successful real-world implementation.
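
As a rough illustration of the design and policy questions above (which statistic to monitor and when to act), the sketch below tracks accuracy over a rolling window and raises a flag when it falls below a threshold. It is not the authors' platform, which the abstract describes as an R Shiny application. The class name, the window size, and the threshold are hypothetical choices.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent `window` predictions."""

    def __init__(self, window, threshold):
        self.outcomes = deque(maxlen=window)  # True where prediction == label
        self.threshold = threshold

    def update(self, prediction, label):
        """Record one labeled prediction and return the current alert state."""
        self.outcomes.append(prediction == label)
        return self.alert()

    def accuracy(self):
        """Accuracy over the current window, or None before any data arrives."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def alert(self):
        """Alert only once the window is full, to avoid noisy early flags."""
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold
```

A real deployment would also need to handle delayed labels and choose a statistic suited to the model, such as calibration or input drift. The rolling accuracy here is only the simplest stand-in.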

Citations: 0