
International Journal of Medical Informatics: Latest Articles

Cross-modal similar clinical case retrieval using a modular model based on contrastive learning and k-nearest neighbor search
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-31; DOI: 10.1016/j.ijmedinf.2024.105680
Shichao Fang, Shenda Hong, Qing Li, Pengfei Li, Tim Coats, Beiji Zou, Guilan Kong

Objective

Electronic health record systems have made it possible for clinicians to use previously encountered similar cases to support clinical decision-making. However, most studies for similar case retrieval were based on single-modal data. The existing studies on cross-modal clinical case retrieval were limited. We aimed to develop a CRoss-Modal Retrieval (CRMR) model to retrieve similar clinical cases recorded in different data modalities.

Materials and methods

The publicly available Medical Information Mart for Intensive Care-Chest X-ray (MIMIC-CXR) dataset was used for model development and testing. The CRMR model was designed as a modular model containing two feature extraction models, two feature transformation models, one feature transformation optimization model, and one case retrieval model. Contrastive deep learning and k-nearest neighbor search were used to enable the retrieval of similar clinical cases recorded in different data modalities.
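
The retrieval step can be illustrated with a minimal sketch. It assumes that the feature transformation models have already mapped both modalities into a shared embedding space and that similarity is measured by cosine similarity; the abstract does not state the distance metric, and the function names and toy vectors below are hypothetical, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def knn_retrieve(query: np.ndarray, gallery: np.ndarray, k: int) -> np.ndarray:
    """Return, for each query row, the indices of the k most similar gallery rows."""
    sims = l2_normalize(query) @ l2_normalize(gallery).T
    # Sorting the negated similarities ascending yields descending similarity.
    return np.argsort(-sims, axis=1)[:, :k]

# Toy shared-space embeddings (hypothetical): three report cases, one image query.
report_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
image_query = np.array([[0.9, 0.1]])

top2 = knn_retrieve(image_query, report_embeddings, k=2)
print(top2)
```

A brute-force search like this is adequate for small galleries; an approximate nearest-neighbor index would be the usual choice at larger scale.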

Results

The average retrieval precision of the developed CRMR model, denoted AP@k, was 76.9% at k = 5, 76.7% at k = 10, 76.5% at k = 20, 76.3% at k = 50, and 77.9% at k = 100, where k is the number of similar cases returned by retrieval. The average retrieval time varied from 0.013 ms to 0.016 ms as k varied from 5 to 100. Moreover, the model can retrieve similar cases with the same multiple radiographic manifestations as the query case.
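
As a rough illustration of the metric, the sketch below computes precision at k for each query and averages it over queries. It assumes that a retrieved case counts as relevant when its label equals the query label; the abstract does not define relevance, and the labels and helper names here are made up for the example.

```python
def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved cases whose label matches the query label."""
    top = retrieved_labels[:k]
    return sum(1 for label in top if label == query_label) / k

def average_precision_at_k(all_retrieved, query_labels, k):
    """Mean of precision@k over all queries (the AP@k used in this sketch)."""
    scores = [precision_at_k(r, q, k) for r, q in zip(all_retrieved, query_labels)]
    return sum(scores) / len(scores)

# Hypothetical ranked results for two queries.
retrieved = [
    ["edema", "edema", "pneumonia", "edema", "edema"],
    ["pneumonia", "edema", "pneumonia", "pneumonia", "edema"],
]
queries = ["edema", "pneumonia"]

ap5 = average_precision_at_k(retrieved, queries, k=5)
print(f"AP@5 = {ap5:.3f}")
```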

Discussion

The CRMR model showed promising cross-modal retrieval performance in clinical case analysis, with potential for future scalability and improvement in handling diverse disease types and data modalities. It may also help clinicians make optimal and explainable clinical decisions.
International Journal of Medical Informatics, Volume 193, Article 105680
Citations: 0
Expert opinion elicitation for assisting deep learning based Lyme disease classifier with patient data
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-31; DOI: 10.1016/j.ijmedinf.2024.105682
Sk Imran Hossain, Jocelyn de Goër de Herve, David Abrial, Richard Emilion, Isabelle Lebert, Yann Frendo, Delphine Martineau, Olivier Lesens, Engelbert Mephu Nguifo

Background

Using deep learning techniques to diagnose erythema migrans (EM) skin lesions, the most common early symptom of Lyme disease, can help prevent long-term complications. Existing work on deep learning based EM recognition utilizes only lesion images because no dataset of Lyme disease related images with associated patient data is available. Doctors, however, rely on patient information about the background of the skin lesion to confirm their diagnosis. To assist a deep learning model with a probability score calculated from patient data, this study elicited opinions from fifteen expert doctors. To the best of our knowledge, this is the first expert elicitation work to calculate Lyme disease probability from patient data.

Methods

For the elicitation process, a questionnaire with questions and possible answers related to EM was prepared. Doctors assigned relative weights to the different answers to each question. We converted the doctors' evaluations to probability scores using Gaussian mixture based density estimation. We used formal concept analysis and a decision tree to validate and explain the elicited model. We also proposed an algorithm for combining independent probability estimates from multiple modalities, such as merging the EM probability score from a deep learning image classifier with the elicited score from patient data.
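
The abstract does not describe the combination algorithm itself. As a hedged illustration of what combining two independent probability estimates can look like, the sketch below uses a standard odds-product rule under a conditional independence assumption with a shared prior. This is a generic technique swapped in for illustration, not the authors' algorithm, and the function name and input values are hypothetical.

```python
def fuse_independent(p_image: float, p_patient: float, prior: float = 0.5) -> float:
    """Fuse two probability estimates of the same event by multiplying their odds.

    Assumes the two sources are conditionally independent given the true class
    and that both were produced under the same prior probability.
    """
    eps = 1e-9  # keeps the odds finite when an input is exactly 0 or 1

    def odds(p: float) -> float:
        p = min(max(p, eps), 1.0 - eps)
        return p / (1.0 - p)

    fused_odds = odds(p_image) * odds(p_patient) / odds(prior)
    return fused_odds / (1.0 + fused_odds)

# Hypothetical scores: image classifier says 0.8, patient-data model says 0.7.
fused = fuse_independent(0.8, 0.7)
print(f"fused EM probability = {fused:.3f}")
```

With a prior of 0.5, two estimates that agree reinforce each other, while an uninformative estimate of 0.5 leaves the other unchanged.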

Results

We successfully elicited opinions from fifteen expert doctors to create a model for obtaining EM probability scores from patient data.

Conclusions

The elicited probability score and the proposed algorithm can be utilized to make image based deep learning Lyme disease pre-scanners robust. The proposed elicitation and validation process is easy for doctors to follow and can help address related medical diagnosis problems where it is challenging to collect patient data.
International Journal of Medical Informatics, Volume 193, Article 105682
Citations: 0
A literature-based approach to predict continuous hospital length of stay in adult acute care patients using admission variables: A single university center experience
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105678
Mieke Deschepper, Chloë De Smedt, Kirsten Colpaert

Purpose

To review the existing literature on predicting length of stay (LOS) and to apply the findings to a real-world data example from a single hospital.

Methods

We performed a literature review on PubMed and Embase, focusing on adults, acute conditions, and hospital-wide prediction of LOS, and summarized all the variables and statistical methods used to predict LOS. We then applied this set of variables to data from a single university hospital and ran an XGBoost model with survival Cox regression on the LOS, as well as a logistic regression on binary LOS (cut-off at 4 days). Model metrics were the concordance index (C-index) and area under the curve (AUC).
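
To make the first metric concrete, the following is a minimal sketch of Harrell's concordance index in plain Python. It assumes that a higher predicted score means an earlier expected discharge and that a pair is comparable only when the shorter stay ended in an observed event; the data values are invented for illustration and the study's own computation may differ.

```python
def concordance_index(times, risk_scores, events):
    """Harrell's C-index: share of comparable pairs ranked in the right order.

    A pair (i, j) is comparable when case i had an observed event and a strictly
    shorter time than case j. It is concordant when case i also has the higher
    risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical stays in days, predicted scores, and event indicators.
los_days = [2, 5, 3, 10]
scores = [0.9, 0.6, 0.5, 0.1]
discharged = [1, 1, 1, 1]

c_index = concordance_index(los_days, scores, discharged)
# Binary target matching the 4-day cut-off described in the abstract.
long_stay = [int(days > 4) for days in los_days]
print(f"C-index = {c_index:.3f}, long-stay labels = {long_stay}")
```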

Results

After applying the search strategy and exclusion criteria, 57 articles were included in the study. The list of variables is long, but the existing literature mostly uses non-clinical data. A wide range of statistical methods is used, with a recent trend toward machine learning models. The XGBoost model with Cox regression achieved a C-index of 0.87, and the logistic regression on binary LOS achieved an AUC of 0.94.

Conclusions

Many variables identified in the literature are not available at the time of admission, yet they are still used in models for predicting LOS. Machine learning has become the preferred statistical approach in recent studies, though mainly for binary LOS predictions. Based on the current literature, it remains challenging to derive a practical and high performing model for continuous LOS prediction.
International Journal of Medical Informatics, Volume 193, Article 105678
Citations: 0
Post-Cardiac arrest outcome prediction using machine learning: A systematic review and meta-analysis
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105659
Amirhosein Zobeiri, Alireza Rezaee, Farshid Hajati, Ahmadreza Argha, Hamid Alinejad-Rokny

Background

Early and reliable prognostication in post-cardiac arrest patients remains challenging, with various factors linked to return of spontaneous circulation (ROSC), survival, and neurological results. Machine learning and deep learning models show promise in improving these predictions. This systematic review and meta-analysis evaluates how effective these approaches are in predicting clinical outcomes at different time points using structured data.

Methods

This study followed PRISMA guidelines and involved a comprehensive search of the PubMed, Scopus, and Web of Science databases until March 2024. Studies that applied machine learning or deep learning techniques to structured data to predict ROSC, survival (or mortality), or neurological outcomes after cardiac arrest were included. Data extraction followed the CHARMS checklist, and the risk of bias was evaluated using the PROBAST tool. Models reporting the AUC metric with 95 % confidence intervals were incorporated into the quantitative synthesis and meta-analysis.

Results

After extracting 2,753 initial records, 41 studies met the inclusion criteria, yielding 97 machine learning and 16 deep learning models. The pooled AUC for predicting favorable neurological outcomes (CPC 1 or 2) at hospital discharge was 0.871 (95 % CI: 0.813–0.928) for machine learning models and 0.877 (95 % CI: 0.831–0.924) for deep learning algorithms. For survival prediction, the pooled AUC was 0.837 (95 % CI: 0.757–0.916). Considerable heterogeneity and high risk of bias were observed, mainly attributable to inadequate management of missing data and the absence of calibration plots. Most studies focused on pre-hospital factors, with age, sex, and initial arrest rhythm being the most frequent features.
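
The pooling of AUCs reported with 95 % confidence intervals can be sketched as follows. The abstract does not state the pooling model, so this example uses simple fixed-effect inverse-variance weighting as an assumption; given the heterogeneity noted above, a random-effects model may well have been what the review used. The three input models are hypothetical values, not data from the included studies.

```python
import math

Z95 = 1.959964  # two-sided 95 % normal quantile

def se_from_ci(lower: float, upper: float) -> float:
    """Back-calculate a standard error from a symmetric 95 % confidence interval."""
    return (upper - lower) / (2 * Z95)

def pool_fixed_effect(estimates):
    """Inverse-variance pooled estimate from (auc, ci_lower, ci_upper) tuples."""
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    total = sum(weights)
    pooled = sum(w * auc for w, (auc, _, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled - Z95 * pooled_se, pooled + Z95 * pooled_se

# Hypothetical per-model AUCs with 95 % CIs (illustrative only).
hypothetical_models = [(0.85, 0.80, 0.90), (0.90, 0.86, 0.94), (0.80, 0.72, 0.88)]

pooled_auc, ci_low, ci_high = pool_fixed_effect(hypothetical_models)
print(f"pooled AUC = {pooled_auc:.3f} (95 % CI: {ci_low:.3f} to {ci_high:.3f})")
```

Models with narrower intervals receive larger weights, which is why the pooled value sits closest to the most precisely estimated AUC.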

Conclusion

Predictive models utilizing AI-based approaches, including machine learning and deep learning models, exhibit enhanced effectiveness compared to previous regression algorithms, but significant heterogeneity and high risk of bias limit their dependability. Evaluating state-of-the-art deep learning models tailored for tabular data and their clinical generalizability can enhance outcome prediction after cardiac arrest.
International Journal of Medical Informatics, Volume 193, Article 105659
Citations: 0
Evaluating the Effectiveness of advanced large language models in medical Knowledge: A Comparative study using Japanese national medical examination
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105673
Mingxin Liu, Tsuyoshi Okuhara, Zhehao Dai, Wenbo Huang, Lin Gu, Hiroko Okada, Emi Furukawa, Takahiro Kiuchi

Study aims and objectives

This study aims to evaluate the accuracy of medical knowledge in the most advanced LLMs (GPT-4o, GPT-4, Gemini 1.5 Pro, and Claude 3 Opus) as of 2024. It is the first to evaluate these LLMs using a non-English medical licensing exam. The insights from this study will guide educators, policymakers, and technical experts in the effective use of AI in medical education and clinical diagnosis.

Method

The authors entered 790 questions from the Japanese National Medical Examination into the chat windows of the LLMs to obtain responses. Two authors independently assessed the correctness of each response. The authors analyzed the overall accuracy rates of the LLMs and compared their performance on image and non-image questions, questions of varying difficulty levels, general and clinical questions, and questions from different medical specialties. Additionally, the authors examined the correlation between the number of publications and the LLMs' performance in different medical specialties.
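
The per-category comparison described above amounts to grouping graded responses and computing an accuracy rate for each group. The sketch below shows one way to do that; the record layout, category names, and graded values are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict

def accuracy_by_category(records):
    """Compute the accuracy rate per category from (category, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in records:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {category: correct[category] / totals[category] for category in totals}

# Hypothetical graded responses for a single model.
graded = [
    ("image", True),
    ("image", False),
    ("non-image", True),
    ("non-image", True),
]

rates = accuracy_by_category(graded)
print(rates)
```

The same function can be reused for difficulty levels or specialties by changing what is stored in the category field.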

Results

GPT-4o achieved the highest accuracy rate, 89.2%, and outperformed the other LLMs both overall and in each specific category. All four LLMs performed better on non-image questions than on image questions, with a 10% accuracy gap. They also performed better on easy questions than on normal and difficult ones. GPT-4o achieved a 95.0% accuracy rate on easy questions, marking it as an effective knowledge source for medical education. All four LLMs performed worst in the “Gastroenterology and Hepatology” specialty. There was a positive correlation between the number of publications and LLM performance in different specialties.

Conclusions

GPT-4o achieved an overall accuracy rate close to 90%, with 95.0% on easy questions, significantly outperforming the other LLMs. This indicates GPT-4o’s potential as a knowledge source for easy questions. Image-based questions and question difficulty significantly impact LLM accuracy. “Gastroenterology and Hepatology” is the specialty with the lowest performance. The LLMs’ performance across medical specialties correlates positively with the number of related publications.
International Journal of Medical Informatics, Volume 193, Article 105673
Citations: 0
Enhancing real world data interoperability in healthcare: A methodological approach to laboratory unit harmonization
IF 3.7; Zone 2, Medicine; Q2 Computer Science, Information Systems; Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105665
Aída Muñoz Monjas, David Rubio Ruiz, David Pérez del Rey, Matvey B. Palchuk

Objective

The primary aim of this study is to address the critical issue of non-standardized units in clinical laboratory data, which poses significant challenges to data interoperability and secondary usage. Despite UCUM (Unified Code for Units of Measure) offering a unique representation for laboratory test units, nearly 60% of laboratory codes in healthcare organizations use non-standard units. We sought to design, implement and test a methodology for the harmonization of units to the UCUM standards across a large research network.

Methods

Using dimensional analysis and a curated equivalence table, the proposed methodology harmonizes disparate units to UCUM standards. The process focused on identifying and converting non-UCUM conforming units, with the goal of enhancing data comparability and interoperability across different systems.
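
A minimal sketch of the two ingredients named above, an equivalence table and a dimension-aware conversion, might look like the following. The table entries, the conversion factors chosen, and the function names are all assumptions made for illustration; the curated table and conversion logic used across the research network are not described in the abstract.

```python
from typing import Optional

# Hypothetical curated equivalence table: raw unit string -> UCUM code.
EQUIVALENCE = {
    "mg/dl": "mg/dL",
    "MG/DL": "mg/dL",
    "gm/dl": "g/dL",
    "K/uL": "10*3/uL",
    "mmol/l": "mmol/L",
}

# Mass-concentration units expressed as a factor relative to g/L, so any two
# units in this table share a dimension and can be converted into each other.
TO_GRAMS_PER_LITER = {"g/L": 1.0, "g/dL": 10.0, "mg/dL": 0.01, "mg/L": 0.001}

KNOWN_UCUM = set(EQUIVALENCE.values()) | set(TO_GRAMS_PER_LITER)

def harmonize_unit(raw_unit: str) -> Optional[str]:
    """Map a raw unit string to a UCUM code, or None if it cannot be resolved."""
    unit = raw_unit.strip()
    if unit in KNOWN_UCUM:
        return unit
    return EQUIVALENCE.get(unit)

def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between two mass-concentration units via the g/L base unit."""
    if from_unit not in TO_GRAMS_PER_LITER or to_unit not in TO_GRAMS_PER_LITER:
        raise ValueError("units do not share a known dimension")
    return value * TO_GRAMS_PER_LITER[from_unit] / TO_GRAMS_PER_LITER[to_unit]

ucum = harmonize_unit("gm/dl")
in_g_per_l = convert(150.0, "mg/dL", "g/L")
print(ucum, in_g_per_l)
```

Returning None for unresolved units keeps unmapped strings visible for later curation instead of silently guessing a unit.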

Results

The methodology successfully achieved over 90% coverage of laboratory data with units in UCUM standards across the TriNetX research network, a significant improvement from baseline measurements. This enhancement in unit standardization directly contributed to increased interoperability of laboratory data, facilitating more reliable and comparable data analysis across various healthcare organizations.

Conclusion

The successful harmonization of laboratory data units to UCUM standards represents a significant advancement in the field of biomedical informatics. By demonstrating a practical and effective approach to overcoming the challenges of non-standardized units, our study contributes to the broader efforts to improve data interoperability and usability for secondary purposes such as research and observational studies. Future work will focus on addressing the remaining gaps in unit standardization and exploring the implications of this methodology on clinical outcomes and research capabilities.
Aída Muñoz Monjas, David Rubio Ruiz, David Pérez del Rey, Matvey B. Palchuk. Enhancing real world data interoperability in healthcare: A methodological approach to laboratory unit harmonization. International Journal of Medical Informatics, vol. 193, Article 105665. Published 2024-10-28. DOI: 10.1016/j.ijmedinf.2024.105665
Citations: 0
Modeling the impact of socioeconomic disparity, biological markers and environmental exposures on phenotypic age using mediation analysis and structural equation models
IF 3.7; Zone 2 (Medicine); Q2 (Computer Science, Information Systems); Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105661
Daniele Pala, Yuezhi Xie, Jia Xu, Li Shen

Introduction

Average age is increasing worldwide, raising the public health burden of age-related diseases, as more resources will be required to manage treatments. Phenotypic Age is a score that can be used to estimate the probability of developing aging-related conditions, and such conditions could be prevented more efficiently by studying the mechanisms that lead to an increased phenotypic age. The objective of this study is to characterize the mechanisms that lead to aging acceleration from the interactions among socio-demographic factors, health predispositions and biological phenotypes.

Methods

We present an approach that combines mediation analysis and structural equation models (SEM) to better characterize these mechanisms, quantifying the interactions between biological and external factors and the effects of preexisting health conditions and socioeconomic disparities. We use two independent cohorts from the NHANES dataset. With the larger cohort (n = 13,186), we select the variables that enlarge the gap between phenotypic and chronological ages, then build an SEM based on nested linear regressions to quantify the influence of all sociodemographic variables, expressed as three latent variables indicating ethnicity, socioeconomic status and preexisting health status. We then replicate the model and apply it to the second cohort (n = 4,425) to compare the results.
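As a rough illustration of mediation analysis with nested linear regressions, the sketch below estimates a single-mediator model on synthetic data. It is a deliberate simplification and not the authors' SEM: there are no latent variables, the data are simulated, and the variable names (`ses`, `biomarker`, `age_gap`) and effect sizes are invented for the example.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares coefficients [intercept, b1, b2, ...] for y ~ predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def mediation(x, m, y):
    """Decompose the effect of x on y into a direct part and a part through m."""
    a = ols(m, x)[1]                # path a: exposure -> mediator
    _, direct, b = ols(y, x, m)     # direct effect and path b: mediator -> outcome
    total = ols(y, x)[1]            # total effect from the un-nested model
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Synthetic data with made-up coefficients, for illustration only.
rng = np.random.default_rng(0)
n = 2000
ses = rng.normal(size=n)                                     # exposure
biomarker = -0.5 * ses + rng.normal(size=n)                  # mediator
age_gap = 0.8 * biomarker - 0.3 * ses + rng.normal(size=n)   # outcome

effects = mediation(ses, biomarker, age_gap)
```

For ordinary least squares fitted on the same sample, the total effect equals the direct effect plus the indirect effect `a * b`, which is what makes this decomposition convenient to read.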

Results

Results show that phenotypic age increases with poor glucose control or obesity-related biomarkers, especially when combined with low socioeconomic status or the presence of chronic or vascular diseases, and provide a framework to quantify these relationships. Black ethnicity, low income/education and a history of chronic diseases are also associated with a higher phenotypic age. Although these findings are already known in the literature, the proposed SEM-based framework provides a useful tool to assess the combinations of these heterogeneous factors from a quantitative point of view.

Conclusion

In an aging society, phenotypic age is an important metric for estimating individual health risk; however, its value is influenced by a myriad of external factors, both biological and sociodemographic. The framework proposed in this paper can help quantify the combined effects of these factors and serve as a starting point for creating personalized prevention and intervention strategies.
Daniele Pala, Yuezhi Xie, Jia Xu, Li Shen. Modeling the impact of socioeconomic disparity, biological markers and environmental exposures on phenotypic age using mediation analysis and structural equation models. International Journal of Medical Informatics, vol. 193, Article 105661. Published 2024-10-28. DOI: 10.1016/j.ijmedinf.2024.105661
Citations: 0
CrossViT with ECAP: Enhanced deep learning for jaw lesion classification
IF 3.7; Zone 2 (Medicine); Q2 (Computer Science, Information Systems); Pub Date: 2024-10-28; DOI: 10.1016/j.ijmedinf.2024.105666
Wannakamon Panyarak, Wattanapong Suttapak, Phattaranant Mahasantipiya, Arnon Charuakkra, Nattanit Boonsong, Kittichai Wantanajittikul, Anak Iamaroon

Background

Radiolucent jaw lesions like ameloblastoma (AM), dentigerous cyst (DC), odontogenic keratocyst (OKC), and radicular cyst (RC) often share similar characteristics, making diagnosis challenging. In 2021, CrossViT, a novel deep learning approach using multi-scale vision transformers (ViT) with cross-attention, emerged for accurate image classification. Additionally, we introduced Extended Cropping and Padding (ECAP), a method to expand training data by iteratively cropping smaller images while preserving context. However, its application in dental radiographic classification remains unexplored. This study investigates the effectiveness of CrossViTs and ECAP against ResNets for classifying common radiolucent jaw lesions.

Methods

We conducted a retrospective study involving 208 prevalent radiolucent jaw lesions (49 AMs, 59 DCs, 48 OKCs, and 54 RCs) observed in panoramic radiographs or orthopantomograms (OPGs) with confirmed histological diagnoses. Three experienced oral radiologists provided annotations by consensus. We applied horizontal flipping and the ECAP technique with CrossViT-15 and CrossViT-18 and with ResNet-50, ResNet-101, and ResNet-152. A four-fold cross-validation approach was employed. The models' performance was assessed through accuracy, specificity, precision, recall (sensitivity), F1-score, and area under the receiver operating characteristic curve (AUC).
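The threshold-based metrics listed above can all be derived from a multi-class confusion matrix using a one-vs-rest view of each class. The sketch below shows one way to do this; the function name and the confusion-matrix counts are invented for illustration and are not the study's results. AUC is left out because it requires predicted scores rather than hard labels.

```python
import numpy as np


def per_class_metrics(confusion):
    """One-vs-rest metrics from a confusion matrix (rows = true, columns = predicted).

    Assumes every class is predicted at least once, so no denominator is zero.
    """
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # true class missed
    fp = cm.sum(axis=0) - tp          # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)           # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Invented counts for four classes in the order AM, DC, OKC, RC.
example = [
    [8, 1, 1, 0],
    [1, 9, 0, 0],
    [2, 0, 6, 2],
    [0, 1, 1, 8],
]
metrics = per_class_metrics(example)
```

In a four-fold cross-validation setup, such a function would typically be applied to each held-out fold and the results averaged across folds.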

Results

Models using the ECAP technique generally achieved better results, with ResNet-152 showing a statistically significant increase in F1-score. CrossViT models consistently achieved higher accuracy, precision, recall, and F1-score compared to ResNet models, regardless of ECAP usage. CrossViT-18 achieved the best overall performance. While all models showed positive ability to differentiate lesions, DC had the highest AUCs (0.89–0.90) and OKC the lowest (0.72–0.81). Only CrossViT-15 achieved AUCs above 0.80 for all four lesion types.

Conclusion

ECAP, a targeted data padding technique, improves deep learning model performance for radiolucent jaw lesion classification. This context-preserving approach is beneficial for tasks requiring an understanding of the lesion's surroundings. Combined with CrossViT models, ECAP shows promise for accurate classification, particularly for rare lesions with limited data.
Wannakamon Panyarak, Wattanapong Suttapak, Phattaranant Mahasantipiya, Arnon Charuakkra, Nattanit Boonsong, Kittichai Wantanajittikul, Anak Iamaroon. CrossViT with ECAP: Enhanced deep learning for jaw lesion classification. International Journal of Medical Informatics, vol. 193, Article 105666. Published 2024-10-28. DOI: 10.1016/j.ijmedinf.2024.105666
Citations: 0
Emerging technologies for supporting patients during Hemodialysis: A scoping review
IF 3.7; Zone 2 (Medicine); Q2 (Computer Science, Information Systems); Pub Date: 2024-10-24; DOI: 10.1016/j.ijmedinf.2024.105664
Ana Rita Martins, Marta Campos Ferreira, Carla Silvia Fernandes

Purpose

To synthesize the available evidence about the use of Health Information Technology (HIT) to support patients during hemodialysis.

Methods

The Joanna Briggs Institute’s methodological guidelines for scoping reviews and the PRISMA-ScR checklist were employed. Bibliographic searches across MEDLINE®, CINAHL®, Psychology and Behavioral Sciences Collection, Scopus, MedicLatina, and Cochrane yielded 932 records.

Results

Eighteen studies published between 2003 and 2023 were included. They explored a range of HITs, including virtual reality, exergames, websites, and mobile applications, all specifically developed for use during the intradialytic period.

Conclusion

This study highlights the HITs developed for use during hemodialysis treatment, supporting physical exercise, disease management, and enhancement of self-efficacy and self-care.
Ana Rita Martins, Marta Campos Ferreira, Carla Silvia Fernandes. Emerging technologies for supporting patients during Hemodialysis: A scoping review. International Journal of Medical Informatics, vol. 193, Article 105664. Published 2024-10-24. DOI: 10.1016/j.ijmedinf.2024.105664
Citations: 0
Speech recognition technology in prehospital documentation: A scoping review
IF 3.7; Zone 2 (Medicine); Q2 (Computer Science, Information Systems); Pub Date: 2024-10-23; DOI: 10.1016/j.ijmedinf.2024.105662
Desmond Hedderson, Karen L. Courtney, Helen Monkman, Ian E. Blanchard

Objectives

The nature of paramedic workflows, where paramedics are responsible for providing care and charting concurrently, can result in incomplete or non-existent patient care reports on patient handover to the emergency department (ED). Charting delays and retrospective recollection of care may lead to patient information gaps, which can increase ED workload, cause care delays, and increase the risk of adverse events. Speech recognition documentation technology has the potential to produce complete patient care reports more quickly and improve paramedic-to-ED handover. We performed a scoping review to determine paramedic perceptions and user requirements for speech recognition documentation technology.

Methods

MEDLINE, Google Scholar, IEEE Xplore, ProQuest, and CINAHL were searched from 2014 to March 2024. Criteria included studies focused on paramedics' use or perceptions of speech recognition documentation technology. This review included studies conducted in the prehospital environment and adjacent agencies (i.e., ED, fire, police, military).

Results

The review identified eight articles that met inclusion criteria. All eight articles were small focus group-based studies in laboratory settings published in or after 2020. Five studies were conducted in the United States, two in Switzerland, and one in Japan. Of the eight studies, five recommended further live environment testing of the technology examined, and three underscored the importance of a user-centred design. The top user requirements for speech recognition adoption were hands-free use, noise reduction technology, battery life, and word accuracy. All eight studies recommended further research and development of speech recognition documentation technology in the prehospital workflow.

Conclusion

This scoping review has highlighted that while there is a growing interest in speech recognition documentation technology in the paramedicine workflow, more research is needed, especially with larger samples in a live environment. The user requirements and perceptions of speech recognition documentation technology in paramedicine must be better understood to design systems with high adoption rates.
Desmond Hedderson, Karen L. Courtney, Helen Monkman, Ian E. Blanchard. Speech recognition technology in prehospital documentation: A scoping review. International Journal of Medical Informatics, vol. 193, Article 105662. Published 2024-10-23. DOI: 10.1016/j.ijmedinf.2024.105662
Citations: 0