Pub Date: 2024-10-31 | DOI: 10.1016/j.ijmedinf.2024.105680
Shichao Fang, Shenda Hong, Qing Li, Pengfei Li, Tim Coats, Beiji Zou, Guilan Kong
Objective
Electronic health record systems have made it possible for clinicians to use previously encountered similar cases to support clinical decision-making. However, most studies of similar case retrieval have been based on single-modal data, and existing work on cross-modal clinical case retrieval is limited. We aimed to develop a CRoss-Modal Retrieval (CRMR) model to retrieve similar clinical cases recorded in different data modalities.
Materials and methods
The publicly available Medical Information Mart for Intensive Care-Chest X-ray (MIMIC-CXR) dataset was used for model development and testing. The CRMR model was designed as a modular model containing two feature extraction models, two feature transformation models, one feature transformation optimization model, and one case retrieval model. The ability to retrieve similar clinical cases recorded in different data modalities was facilitated by the use of contrastive deep learning and k-nearest neighbor search.
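The retrieval step pairs learned embeddings with a k-nearest neighbor search. As a minimal sketch (not the authors' implementation), assuming contrastive training has already projected image and report embeddings into a shared space, retrieval reduces to ranking stored cases by similarity to the query embedding. The 3-dimensional vectors below are hypothetical toy values:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_retrieve(query_vec, case_db, k):
    """Return the IDs of the k stored cases most similar to the query."""
    ranked = sorted(case_db, key=lambda cid: cosine(query_vec, case_db[cid]), reverse=True)
    return ranked[:k]

# Toy shared-embedding space: after contrastive training, image and
# report embeddings can be compared directly.
case_db = {
    "case_A_report": [0.9, 0.1, 0.0],
    "case_B_report": [0.1, 0.9, 0.1],
    "case_C_report": [0.8, 0.2, 0.1],
}
query_image = [0.85, 0.15, 0.05]  # embedding of a query chest X-ray
print(knn_retrieve(query_image, case_db, k=2))  # most similar report-modality cases first
```

Because the sort runs over the whole database, an approximate nearest-neighbor index would replace it at scale; the sub-millisecond retrieval times reported below suggest an indexed search rather than a linear scan.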
Results
The average retrieval precision of the developed CRMR model, denoted AP@k, was 76.9 %@5, 76.7 %@10, 76.5 %@20, 76.3 %@50, and 77.9 %@100, where k is the number of similar cases returned after retrieval. The average retrieval time varied from 0.013 ms to 0.016 ms as k varied from 5 to 100. Moreover, the model can retrieve similar cases that share the same multiple radiographic manifestations as the query case.
Discussion
The CRMR model has shown promising cross-modal retrieval performance in clinical case analysis, with the potential for future scalability and improvement in handling diverse disease types and data modalities. It also has the potential to aid clinicians in making optimal and explainable clinical decisions.
Title: "Cross-modal similar clinical case retrieval using a modular model based on contrastive learning and k-nearest neighbor search" (International Journal of Medical Informatics 193, Article 105680)
Pub Date: 2024-10-31 | DOI: 10.1016/j.ijmedinf.2024.105682
Sk Imran Hossain, Jocelyn de Goër de Herve, David Abrial, Richard Emilion, Isabelle Lebert, Yann Frendo, Delphine Martineau, Olivier Lesens, Engelbert Mephu Nguifo
Background
Diagnosing the erythema migrans (EM) skin lesion, the most common early symptom of Lyme disease, with deep learning techniques can help prevent long-term complications. Existing work on deep-learning-based EM recognition uses only lesion images, because no dataset of Lyme-disease-related images with associated patient data exists. Doctors rely on patient information about the background of the skin lesion to confirm their diagnosis. To assist a deep learning model with a probability score calculated from patient data, this study elicited opinions from fifteen expert doctors. To the best of our knowledge, this is the first expert elicitation work to calculate Lyme disease probability from patient data.
Methods
For the elicitation process, a questionnaire with questions and possible answers related to EM was prepared, and doctors assigned relative weights to the different answers. We converted the doctors' evaluations to probability scores using Gaussian-mixture-based density estimation. We exploited formal concept analysis and decision trees to validate and explain the elicited model. We also proposed an algorithm for combining independent probability estimates from multiple modalities, such as merging the EM probability score from a deep learning image classifier with the elicited score from patient data.
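The abstract does not spell out the combination algorithm. A common way to merge two independent probability estimates of the same event, sketched below on hypothetical scores, is naive-Bayes-style odds fusion; the authors' actual algorithm may differ:

```python
def fuse_probabilities(p_image, p_patient):
    """Merge two independent probability estimates of the same event.

    Naive-Bayes (independent-evidence) fusion: the odds implied by each
    source are multiplied. A score of 0.5 is neutral and leaves the other
    estimate unchanged. Inputs must lie strictly between 0 and 1.
    """
    evidence_for = p_image * p_patient
    evidence_against = (1 - p_image) * (1 - p_patient)
    return evidence_for / (evidence_for + evidence_against)

# A moderately positive image score and patient-data score reinforce each other:
combined = fuse_probabilities(0.7, 0.6)  # about 0.78, higher than either input
```

The independence assumption is the key caveat: if the image classifier and the patient-data score partially encode the same evidence, this rule overstates the combined probability.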
Results
We successfully elicited opinions from fifteen expert doctors to create a model for obtaining EM probability scores from patient data.
Conclusions
The elicited probability score and the proposed algorithm can be utilized to make image-based deep learning Lyme disease pre-scanners robust. The proposed elicitation and validation process is easy for doctors to follow and can help address related medical diagnosis problems where collecting patient data is challenging.
Title: "Expert opinion elicitation for assisting deep learning based Lyme disease classifier with patient data" (International Journal of Medical Informatics 193, Article 105682)
Pub Date: 2024-10-28 | DOI: 10.1016/j.ijmedinf.2024.105678
Mieke Deschepper, Chloë De Smedt, Kirsten Colpaert
Purpose
To review the existing literature on predicting length of stay (LOS) and to apply the findings to a real-world data example from a single hospital.
Methods
We performed a literature review in PubMed and Embase, focusing on adults, acute conditions, and hospital-wide prediction of LOS, and summarized all variables and statistical methods used to predict LOS. We then applied this set of variables to data from a single university hospital, fitting an XGBoost model with a Cox survival regression on continuous LOS, as well as a logistic regression on binary LOS (cut-off at 4 days). Model metrics are the concordance index (c-index) and the area under the curve (AUC).
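For intuition, the c-index is the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed outcomes; for a binary outcome without censoring it coincides with the AUC. A minimal pure-Python sketch, illustrative rather than the study's evaluation code:

```python
def concordance_index(scores, outcomes):
    """Fraction of comparable pairs ranked consistently with outcomes.

    For binary outcomes without censoring this equals the AUC.
    Tied scores count as half-concordant.
    """
    concordant = tied = total = 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if outcomes[i] > outcomes[j]:  # comparable pair: i had the longer stay
                total += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / total

# 3 of the 4 comparable pairs are ordered correctly:
print(concordance_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.75
```

Survival-analysis implementations extend this pairwise count to censored data, restricting it to pairs where the ordering of event times is actually observable.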
Results
After applying the search strategy and exclusion criteria, 57 articles were included in the study. The list of variables is long, but the existing literature mostly uses non-clinical data. A wide range of statistical methods is used, with a recent trend toward machine learning models. The XGBoost Cox regression results in a c-index of 0.87, and the logistic regression on binary LOS achieves an AUC of 0.94.
Conclusions
Many variables identified in the literature are not available at the time of admission, yet they are still used in models for predicting LOS. Machine learning has become the preferred statistical approach in recent studies, though mainly for binary LOS predictions. Based on the current literature, it remains challenging to derive a practical, high-performing model for continuous LOS prediction.
Title: "A literature-based approach to predict continuous hospital length of stay in adult acute care patients using admission variables: A single university center experience" (International Journal of Medical Informatics 193, Article 105678)
Pub Date: 2024-10-28 | DOI: 10.1016/j.ijmedinf.2024.105659
Amirhosein Zobeiri, Alireza Rezaee, Farshid Hajati, Ahmadreza Argha, Hamid Alinejad-Rokny
Background
Early and reliable prognostication in post-cardiac arrest patients remains challenging, with various factors linked to return of spontaneous circulation (ROSC), survival, and neurological outcomes. Machine learning and deep learning models show promise in improving these predictions. This systematic review and meta-analysis evaluates how effective these approaches are at predicting clinical outcomes at different time points using structured data.
Methods
This study followed PRISMA guidelines, involving a comprehensive search across the PubMed, Scopus, and Web of Science databases up to March 2024. Studies that applied machine learning or deep learning techniques to structured data to predict ROSC, survival (or mortality), or neurological outcomes after cardiac arrest were included. Data extraction followed the CHARMS checklist, and risk of bias was evaluated using the PROBAST tool. Models reporting the AUC metric with 95 % confidence intervals were incorporated into the quantitative synthesis and meta-analysis.
Results
After screening 2,753 initial records, 41 studies met the inclusion criteria, yielding 97 machine learning and 16 deep learning models. The pooled AUC for predicting favorable neurological outcomes (CPC 1 or 2) at hospital discharge was 0.871 (95 % CI: 0.813–0.928) for machine learning models and 0.877 (95 % CI: 0.831–0.924) for deep learning algorithms. For survival prediction, the pooled AUC was 0.837 (95 % CI: 0.757–0.916). Considerable heterogeneity and a high risk of bias were observed, mainly attributable to inadequate handling of missing data and the absence of calibration plots. Most studies focused on pre-hospital factors, with age, sex, and initial arrest rhythm being the most frequent features.
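A pooled AUC of this kind can be computed by inverse-variance weighting, recovering each study's standard error from the width of its reported 95 % CI. The sketch below shows the fixed-effect version on hypothetical inputs; the review's pooling likely also models between-study variance (random effects), which the heterogeneity it reports would require:

```python
def pool_aucs(studies):
    """Fixed-effect inverse-variance pooling of AUCs given as (auc, ci_low, ci_high).

    Each study's standard error is recovered from its 95 % CI width.
    Returns the pooled AUC and its 95 % CI.
    """
    total_w = 0.0
    total_wx = 0.0
    for auc, lo, hi in studies:
        se = (hi - lo) / (2 * 1.96)  # a 95 % CI spans +/- 1.96 SE
        w = 1.0 / (se * se)          # precise studies get more weight
        total_w += w
        total_wx += w * auc
    pooled = total_wx / total_w
    se_pooled = (1.0 / total_w) ** 0.5
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

# Two hypothetical studies of equal precision: the pooled AUC is their mean,
# with a tighter CI than either study alone.
pooled, ci = pool_aucs([(0.80, 0.75, 0.85), (0.90, 0.85, 0.95)])
```
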
Conclusion
Predictive models utilizing AI-based approaches, including machine and deep learning models, exhibit enhanced effectiveness compared with earlier regression algorithms, but significant heterogeneity and a high risk of bias limit their dependability. Evaluating state-of-the-art deep learning models tailored for tabular data, and their clinical generalizability, can enhance outcome prediction after cardiac arrest.
Title: "Post-Cardiac arrest outcome prediction using machine learning: A systematic review and meta-analysis" (International Journal of Medical Informatics 193, Article 105659)
Pub Date: 2024-10-28 | DOI: 10.1016/j.ijmedinf.2024.105673
Mingxin Liu, Tsuyoshi Okuhara, Zhehao Dai, Wenbo Huang, Lin Gu, Hiroko Okada, Emi Furukawa, Takahiro Kiuchi
Study aims and objectives
This study aims to evaluate the accuracy of medical knowledge in the most advanced LLMs (GPT-4o, GPT-4, Gemini 1.5 Pro, and Claude 3 Opus) as of 2024. It is the first to evaluate these LLMs using a non-English medical licensing exam. The insights from this study will guide educators, policymakers, and technical experts in the effective use of AI in medical education and clinical diagnosis.
Method
The authors entered 790 questions from the Japanese National Medical Examination into the chat windows of the LLMs to obtain responses, and two authors independently assessed the correctness of each response. The authors analyzed the overall accuracy rates of the LLMs and compared their performance on image and non-image questions, questions of varying difficulty levels, general and clinical questions, and questions from different medical specialties. Additionally, the authors examined the correlation between the number of publications and the LLMs' performance in different medical specialties.
Results
GPT-4o achieved the highest accuracy rate, 89.2%, outperforming the other LLMs overall and in each specific category. All four LLMs performed better on non-image questions than on image questions, with a 10% accuracy gap. They also performed better on easy questions than on normal and difficult ones. GPT-4o achieved a 95.0% accuracy rate on easy questions, marking it as an effective knowledge source for medical education. All four LLMs performed worst on the “Gastroenterology and Hepatology” specialty. There was a positive correlation between the number of publications and LLM performance in different specialties.
Conclusions
GPT-4o achieved an overall accuracy rate close to 90%, with 95.0% on easy questions, significantly outperforming the other LLMs. This indicates GPT-4o’s potential as a knowledge source for easy questions. Image-based questions and question difficulty significantly impact LLM accuracy. “Gastroenterology and Hepatology” is the specialty with the lowest performance. The LLMs’ performance across medical specialties correlates positively with the number of related publications.
Title: "Evaluating the Effectiveness of advanced large language models in medical Knowledge: A Comparative study using Japanese national medical examination" (International Journal of Medical Informatics 193, Article 105673)
Pub Date: 2024-10-28 | DOI: 10.1016/j.ijmedinf.2024.105665
Aída Muñoz Monjas, David Rubio Ruiz, David Pérez del Rey, Matvey B. Palchuk
Objective
The primary aim of this study is to address the critical issue of non-standardized units in clinical laboratory data, which poses significant challenges to data interoperability and secondary usage. Despite UCUM (Unified Code for Units of Measure) offering a unique representation for laboratory test units, nearly 60% of laboratory codes in healthcare organizations use non-standard units. We sought to design, implement and test a methodology for the harmonization of units to the UCUM standards across a large research network.
Methods
Using dimensional analysis and a curated equivalence table, the proposed methodology harmonizes disparate units to UCUM standards. The process focused on identifying and converting non-UCUM conforming units, with the goal of enhancing data comparability and interoperability across different systems.
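A curated equivalence table of this kind can be as simple as a mapping from each local unit string to a UCUM unit plus a multiplicative conversion factor. The entries below are hypothetical examples, not the study's actual table:

```python
# Hypothetical equivalence table (NOT the study's curated table):
# local unit string -> (UCUM unit, multiplicative conversion factor)
EQUIVALENCE = {
    "mg/dl": ("mg/dL", 1.0),    # case normalization only
    "g/l":   ("g/dL", 0.1),     # 1 g/L = 0.1 g/dL
    "K/uL":  ("10*3/uL", 1.0),  # "thousand per microliter" in UCUM notation
}

def harmonize(value, unit):
    """Map a (value, local unit) pair onto its UCUM representation."""
    ucum_unit, factor = EQUIVALENCE[unit]
    return value * factor, ucum_unit

# A result reported as 12 g/L becomes 1.2 g/dL in the harmonized data.
harmonized_value, harmonized_unit = harmonize(12.0, "g/l")
```

In UCUM, `10*3/uL` is the standard code for "thousand per microliter"; a production table would be far larger, derived by dimensional analysis, and validated against a UCUM parser.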
Results
The methodology successfully achieved over 90% coverage of laboratory data with units in UCUM standards across the TriNetX research network, a significant improvement from baseline measurements. This enhancement in unit standardization directly contributed to increased interoperability of laboratory data, facilitating more reliable and comparable data analysis across various healthcare organizations.
Conclusion
The successful harmonization of laboratory data units to UCUM standards represents a significant advancement in the field of biomedical informatics. By demonstrating a practical and effective approach to overcoming the challenges of non-standardized units, our study contributes to the broader efforts to improve data interoperability and usability for secondary purposes such as research and observational studies. Future work will focus on addressing the remaining gaps in unit standardization and exploring the implications of this methodology on clinical outcomes and research capabilities.
Title: "Enhancing real world data interoperability in healthcare: A methodological approach to laboratory unit harmonization" (International Journal of Medical Informatics 193, Article 105665)
Pub Date: 2024-10-28 | DOI: 10.1016/j.ijmedinf.2024.105661
Daniele Pala, Yuezhi Xie, Jia Xu, Li Shen
Introduction
Average age is increasing worldwide, raising the public health burden of age-related diseases, as more resources will be required to manage treatments. Phenotypic Age is a score that estimates the probability of developing aging-related conditions, and such conditions could be prevented more efficiently by studying the mechanisms that lead to an increased phenotypic age. The objective of this study is to characterize the mechanisms that lead to aging acceleration from the interactions among socio-demographic factors, health predispositions, and biological phenotypes.
Methods
We present an approach based on the combination of mediation analysis and structural equation models (SEM) to better characterize these mechanisms, quantifying the interactions between biological and external factors and the effects of preexisting health conditions and socioeconomic disparities. We use two independent cohorts of the NHANES dataset. We use the larger cohort (n = 13,186) to select the variables that enlarge the gap between phenotypic and chronological ages, and then create an SEM based on nested linear regressions to quantify the influence of all sociodemographic variables, expressed through three latent variables indicating ethnicity, socioeconomic status, and preexisting health status. We then replicate the model and apply it to the second cohort (n = 4,425) to compare the results.
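A classic way to quantify a mediated pathway is the product-of-coefficients estimate: fit the mediator M on the exposure X to get path a, fit the outcome Y on M and X jointly to get paths b and c', and report a·b as the indirect effect. A self-contained ordinary-least-squares sketch (illustrative only; the study embeds this idea in a full SEM with latent variables):

```python
def mediation_effects(x, m, y):
    """Product-of-coefficients mediation estimate from two OLS fits.

    Path a: slope of M regressed on X.
    Paths b, c': slopes of Y regressed jointly on M and X (closed-form
    normal equations for two centered predictors).
    Returns (a, b, c_prime, indirect) with indirect = a * b.
    """
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx                            # X -> M
    denom = smm * sxx - sxm * sxm            # nonzero unless M, X are collinear
    b = (smy * sxx - sxy * sxm) / denom      # M -> Y, controlling for X
    c_prime = (sxy * smm - smy * sxm) / denom  # direct X -> Y
    return a, b, c_prime, a * b
```

For example, with X a socioeconomic indicator, M a biomarker, and Y phenotypic age, an indirect effect a·b that is large relative to c' suggests the biomarker mediates much of the socioeconomic effect.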
Results
Results show that phenotypic age increases with poor glucose control or obesity-related biomarkers, especially when combined with low socioeconomic status or the presence of chronic or vascular diseases, and the models provide a framework to quantify these relationships. Black ethnicity, low income/education, and a history of chronic diseases are also associated with a higher phenotypic age. Although these findings are already known in the literature, the proposed SEM-based framework provides a useful tool for assessing the combinations of these heterogeneous factors quantitatively.
Conclusion
In an aging society, phenotypic age is an important metric for estimating individual health risk; however, its value is influenced by a myriad of external factors, both biological and sociodemographic. The framework proposed in this paper can help quantify the combined effects of these factors and serve as a starting point for creating personalized prevention and intervention strategies.
{"title":"Modeling the impact of socioeconomic disparity, biological markers and environmental exposures on phenotypic age using mediation analysis and structural equation models","authors":"Daniele Pala , Yuezhi Xie , Jia Xu , Li Shen","doi":"10.1016/j.ijmedinf.2024.105661","DOIUrl":"10.1016/j.ijmedinf.2024.105661","url":null,"abstract":"<div><h3>Introduction</h3><div>Average age is increasing worldwide, raising the public health burden of age-related diseases, as more resources will be required to manage treatments. Phenotypic Age is a score that can be useful to provide an estimate of the probability of developing aging-related conditions, and prevention of such conditions could be performed efficiently studying the mechanisms leading to an increased phenotypic age. The objective of this study is to characterize the mechanisms that lead to aging acceleration from the interactions among socio-demographic factors, health predispositions and biological phenotypes.</div></div><div><h3>Methods</h3><div>We present an approach based on the combination of mediation analysis and structural equation models (SEM) to better characterize these mechanisms, quantifying the interactions between biological and external factors and the effects of preexisting health conditions and socioeconomic disparities. We use two independent cohorts of the NHANES dataset: we use the largest (n = 13,186) to select the variables that enlarge the gap between phenotypic and chronological ages, we then create a SEM based on nested linear regressions to quantify the influence of all sociodemographic variables expressed in three latent variables indicating ethnicity, socioeconomic status and preexisting health status. 
We then replicate the model and apply it to the second cohort (n = 4,425) to compare the results.</div></div><div><h3>Results</h3><div>Results show that phenotypic age increases with poor glucose control or obesity-related biomarkers, especially if combined with a low socioeconomic status or the presence of chronic or vascular diseases, and provide a framework to quantify these relationships. Black ethnicity, low income/education and a history of chronic diseases are also associated with a higher phenotypic age. Although these findings are already known in literature, the proposed SEM-based framework provides an useful tool to assess the combinations of these heterogeneous factors from a quantitative point of view.</div></div><div><h3>Conclusion</h3><div>In an aging society, phenotypic age is an important metric that can be used to estimate the individual health risk, however its value is influenced by a myriad of external factors, both biological and sociodemographic. The framework proposed in this paper can help quantifying the combined effects of these factors and be a starting point to the creation of personalized prevention and intervention strategies.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"193 ","pages":"Article 105661"},"PeriodicalIF":3.7,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Radiolucent jaw lesions such as ameloblastoma (AM), dentigerous cyst (DC), odontogenic keratocyst (OKC), and radicular cyst (RC) often share similar characteristics, making diagnosis challenging. In 2021, CrossViT, a novel deep learning approach using multi-scale vision transformers (ViT) with cross-attention, emerged for accurate image classification. Additionally, we introduced Extended Cropping and Padding (ECAP), a method that expands training data by iteratively cropping smaller images while preserving context. However, the application of these techniques in dental radiographic classification remains unexplored. This study investigates the effectiveness of CrossViT and ECAP against ResNets for classifying common radiolucent jaw lesions.
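CrossViT's core idea, letting the classification (CLS) token of one branch attend over the patch tokens of the other branch, can be sketched in a minimal form. Single-head attention, random projection weights, and the chosen dimensions below are simplifications for illustration, not the actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(cls_small, tokens_large, d=64):
    """One simplified cross-attention step: the small-branch CLS token acts as
    the query and attends over the large-branch patch tokens (keys/values)."""
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.normal(size=(cls_small.shape[-1], d)) for _ in range(3))
    q = cls_small @ Wq                            # (1, d) query
    k, v = tokens_large @ Wk, tokens_large @ Wv   # (n, d) keys and values
    attn = softmax(q @ k.T / np.sqrt(d))          # (1, n) attention weights
    return attn @ v, attn                         # (1, d) fused CLS representation

fused, attn = cross_attention(np.ones((1, 32)), np.ones((16, 32)))
```

The fused CLS token then carries information from the other branch's scale back into its own branch, which is what lets the two patch sizes exchange information.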
Methods
We conducted a retrospective study involving 208 prevalent radiolucent jaw lesions (49 AMs, 59 DCs, 48 OKCs, and 54 RCs) observed in panoramic radiographs, or orthopantomograms (OPGs), with confirmed histological diagnoses. Three experienced oral radiologists provided annotations by consensus. We applied horizontal flipping and the ECAP technique with CrossViT-15, CrossViT-18, ResNet-50, ResNet-101, and ResNet-152, using a four-fold cross-validation approach. The models’ performance was assessed using accuracy, specificity, precision, recall (sensitivity), F1-score, and area under the receiver operating characteristic curve (AUC).
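The abstract does not give ECAP's exact procedure, but one plausible reading, iteratively cropping a smaller window and padding it back onto the original canvas so the lesion keeps its spatial context, can be sketched as follows. The centered window, shrink factor, zero padding, and grayscale input are all assumptions for illustration:

```python
import numpy as np

def ecap_crops(image, steps=3, shrink=0.8):
    """Generate augmented views of a 2-D grayscale image by iteratively
    cropping a smaller centered window and zero-padding it back to the
    original size, so surrounding context stays at its original location
    (an interpretation of ECAP; details are assumed)."""
    h, w = image.shape
    views = [image.copy()]
    ch, cw = h, w
    for _ in range(steps):
        ch, cw = int(ch * shrink), int(cw * shrink)
        top, left = (h - ch) // 2, (w - cw) // 2
        crop = image[top:top + ch, left:left + cw]
        padded = np.zeros_like(image)
        padded[top:top + ch, left:left + cw] = crop  # paste back in place
        views.append(padded)
    return views

views = ecap_crops(np.ones((100, 120)))  # original plus 3 progressively tighter views
```

Each pass yields a training image that zooms toward the lesion while the padding preserves where the cropped region sat in the radiograph, which is the context-preserving property the paper credits for the performance gain.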
Results
Models using the ECAP technique generally achieved better results, with ResNet-152 showing a statistically significant increase in F1-score. CrossViT models consistently achieved higher accuracy, precision, recall, and F1-score than ResNet models, regardless of ECAP usage. CrossViT-18 achieved the best overall performance. While all models showed discriminative ability across lesions, DC had the highest AUCs (0.89–0.90) and OKC the lowest (0.72–0.81). Only CrossViT-15 achieved AUCs above 0.80 for all four lesion types.
Conclusion
ECAP, a targeted data-padding technique, improves deep learning model performance for radiolucent jaw lesion classification. This context-preserving approach is beneficial for tasks that require an understanding of the lesion’s surroundings. Combined with CrossViT models, ECAP shows promise for accurate classification, particularly for rare lesions with limited data.
{"title":"CrossViT with ECAP: Enhanced deep learning for jaw lesion classification","authors":"Wannakamon Panyarak , Wattanapong Suttapak , Phattaranant Mahasantipiya , Arnon Charuakkra , Nattanit Boonsong , Kittichai Wantanajittikul , Anak Iamaroon","doi":"10.1016/j.ijmedinf.2024.105666","DOIUrl":"10.1016/j.ijmedinf.2024.105666","url":null,"abstract":"<div><h3>Background</h3><div>Radiolucent jaw lesions like ameloblastoma (AM), dentigerous cyst (DC), odontogenic keratocyst (OKC), and radicular cyst (RC) often share similar characteristics, making diagnosis challenging. In 2021, CrossViT, a novel deep learning approach using multi-scale vision transformers (ViT) with cross-attention, emerged for accurate image classification. Additionally, we introduced Extended Cropping and Padding (ECAP), a method to expand training data by iteratively cropping smaller images while preserving context. However, its application in dental radiographic classification remains unexplored. This study investigates the effectiveness of CrossViTs and ECAP against ResNets for classifying common radiolucent jaw lesions.</div></div><div><h3>Methods</h3><div>We conducted a retrospective study involving 208 prevalent radiolucent jaw lesions (49 AMs, 59 DCs, 48 OKCs, and 54 RCs) observed in panoramic radiographs or orthopantomograms (OPGs) with confirmed histological diagnoses. Three experienced oral radiologists provided annotations with consensus. We implemented horizontal flip and ECAP technique with CrossViT-15, −18, ResNet-50, −101, and −152. A four-fold cross-validation approach was employed. The models’ performance assessed through accuracy, specificity, precision, recall (sensitivity), F1-score, and area under the receiver operating characteristics (AUCs) metrics.</div></div><div><h3>Results</h3><div>Models using the ECAP technique generally achieved better results, with ResNet-152 showing a statistically significant increase in F1-score. 
CrossViT models consistently achieved higher accuracy, precision, recall, and F1-score compared to ResNet models, regardless of ECAP usage. CrossViT-18 achieved the best overall performance. While all models showed positive ability to differentiate lesions, DC had the highest AUCs (0.89–0.90) and OKC the lowest (0.72–0.81). Only CrossViT-15 achieved AUCs above 0.80 for all four lesion types.</div></div><div><h3>Conclusion</h3><div>ECAP, a targeted padding data technique, improves deep learning model performance for radiolucent jaw lesion classification. This context-preserving approach is beneficial for tasks requiring an understanding of the lesion’s surroundings. Combined with CrossViT models, ECAP shows promise for accurate classification, particularly for rare lesions with limited data.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"193 ","pages":"Article 105666"},"PeriodicalIF":3.7,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142570487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-24
DOI: 10.1016/j.ijmedinf.2024.105664
Ana Rita Martins , Marta Campos Ferreira , Carla Silvia Fernandes
Purpose
To synthesize the available evidence about the use of Health Information Technology (HIT) to support patients during hemodialysis.
Methods
The Joanna Briggs Institute’s methodological guidelines for scoping reviews and the PRISMA-ScR checklist were employed. Bibliographic searches across MEDLINE®, CINAHL®, Psychology and Behavioral Sciences Collection, Scopus, MedicLatina, and Cochrane yielded 932 records.
Results
Eighteen studies published between 2003 and 2023 were included. They explored a range of HITs, including virtual reality, exergames, websites, and mobile applications, all specifically developed for use during the intradialytic period.
Conclusion
This study highlights the HITs developed for use during hemodialysis treatment, supporting physical exercise, disease management, and enhancement of self-efficacy and self-care.
{"title":"Emerging technologies for supporting patients during Hemodialysis: A scoping review","authors":"Ana Rita Martins , Marta Campos Ferreira , Carla Silvia Fernandes","doi":"10.1016/j.ijmedinf.2024.105664","DOIUrl":"10.1016/j.ijmedinf.2024.105664","url":null,"abstract":"<div><h3>Purpose</h3><div>To synthesize the available evidence about the use of Health Information Technology (HIT) to support patients during hemodialysis.</div></div><div><h3>Methods</h3><div>The Joanna Briggs Institute’s methodological guidelines for scoping reviews and the PRISMA-ScR checklist were employed. Bibliographic searches across MEDLINE®, CINAHL®, Psychology and Behavioral Sciences Collection, Scopus, MedicLatina, and Cochrane yielded 932 records.</div></div><div><h3>Results</h3><div>Eighteen studies published between 2003 and 2023 were included. They explored a range of HITs, including virtual reality, exergames, websites, and mobile applications, all specifically developed for use during the intradialytic period.</div></div><div><h3>Conclusion</h3><div>This study highlights the HITs developed for use during hemodialysis treatment, supporting physical exercise, disease management, and enhancement of self-efficacy and self-care.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"193 ","pages":"Article 105664"},"PeriodicalIF":3.7,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-23
DOI: 10.1016/j.ijmedinf.2024.105662
Desmond Hedderson , Karen L. Courtney , Helen Monkman , Ian E. Blanchard
Objectives
The nature of paramedic workflows, in which paramedics are responsible for providing care and charting concurrently, can result in incomplete or non-existent patient care reports at patient handover to the emergency department (ED). Charting delays and retrospective recollection of care may lead to patient information gaps, which can increase ED workload, cause care delays, and increase the risk of adverse events. Speech recognition documentation technology has the potential to produce complete patient care reports more quickly and to improve paramedic-to-ED handover. We performed a scoping review to determine paramedic perceptions of, and user requirements for, speech recognition documentation technology.
Methods
MEDLINE, Google Scholar, IEEE Xplore, ProQuest, and CINAHL were searched from 2014 to March 2024. Criteria included studies focused on paramedics’ use or perceptions of speech recognition documentation technology. This review included studies conducted in the prehospital environment and adjacent agencies (i.e., ED, fire, police, military).
Results
The review identified eight articles that met the inclusion criteria. All eight were small focus-group-based studies in laboratory settings published in or after 2020. Five studies were conducted in the United States, two in Switzerland, and one in Japan. Of the eight studies, five recommended further live-environment testing of the technology examined, and three underscored the importance of user-centred design. The top user requirements for speech recognition adoption were hands-free use, noise-reduction technology, battery life, and word accuracy. All eight studies recommended further research and development of speech recognition documentation technology in the prehospital workflow.
Conclusion
This scoping review has highlighted that while there is a growing interest in speech recognition documentation technology in the paramedicine workflow, more research is needed, especially with larger samples in a live environment. The user requirements and perceptions of speech recognition documentation technology in paramedicine must be better understood to design systems with high adoption rates.
{"title":"Speech recognition technology in prehospital documentation: A scoping review","authors":"Desmond Hedderson , Karen L. Courtney , Helen Monkman , Ian E. Blanchard","doi":"10.1016/j.ijmedinf.2024.105662","DOIUrl":"10.1016/j.ijmedinf.2024.105662","url":null,"abstract":"<div><h3>Objectives</h3><div>The nature of paramedic workflows, where paramedics are responsible to provide care and chart concurrently, can result in incomplete or non-existent patient care reports on patient handover to the emergency department (ED). Charting delays and retrospective recollection of care may lead to patient information gaps, which can increase ED workload, cause care delays, and increase the risk of adverse events. Speech recognition documentation technology has the potential to produce complete patient care reports quicker and improve paramedic-to-ED handover. We performed a scoping review to determine paramedic perceptions and user requirements for speech recognition documentation technology.</div></div><div><h3>Methods</h3><div>MEDLINE, Google Scholar, IEEE Explore, ProQuest, and CINAHL were searched from 2014 to March 2024. Criteria included studies focused on paramedics’ use or perceptions of speech recognition documentation technology. This review included studies conducted in the prehospital environment and adjacent agencies (i.e., ED, fire, police, military).</div></div><div><h3>Results</h3><div>The review identified eight articles that met inclusion criteria. All eight articles were small focus group-based studies in laboratory settings published on or after 2020. Five studies were conducted in the United States, two in Switzerland, and one in Japan. Of the eight studies, five recommended further live environment testing of the technology examined, and three underscored the importance of a user-centred design. The top user requirements for speech recognition adoption was hands-free use, noise reduction technology, battery life, and word accuracy. 
All eight studies recommended further research and development of speech recognition documentation technology in the prehospital workflow<strong>.</strong></div></div><div><h3>Conclusion</h3><div>This scoping review has highlighted that while there is a growing interest in speech recognition documentation technology in the paramedicine workflow, more research is needed, especially with larger samples in a live environment. The user requirements and perceptions of speech recognition documentation technology in paramedicine must be better understood to design systems with high adoption rates.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"193 ","pages":"Article 105662"},"PeriodicalIF":3.7,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142513260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}