EMR-LIP: A lightweight framework for standardizing the preprocessing of longitudinal irregular data in electronic medical records

IF 4.8 | CAS Zone 2 (Medicine) | JCR Q1, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Computer Methods and Programs in Biomedicine | Pub Date: 2025-02-01 | Epub Date: 2024-11-24 | DOI: 10.1016/j.cmpb.2024.108521
Jiawei Luo, Shixin Huang, Lan Lan, Shu Yang, Tingqian Cao, Jin Yin, Jiajun Qiu, Xiaoyan Yang, Yingqiang Guo, Xiaobo Zhou
Computer Methods and Programs in Biomedicine, Volume 259, Article 108521. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0169260724005145
Citations: 0

Abstract

Objective

Longitudinal data from Electronic Medical Records (EMRs) are increasingly utilized to construct predictive models for various clinical tasks, offering enhanced insights into patient health. However, significant discrepancies exist in preprocessing the irregular and intricate EMR data across studies due to the absence of universally accepted tools and standardization methods. This study introduces the Electronic Medical Record Longitudinal Irregular Data Preprocessing (EMR-LIP) framework, a lightweight approach for optimizing the preprocessing of longitudinal, irregular EMR data, aiming to enhance research efficiency, consistency, reproducibility, and comparability.

Materials and Methods

EMR-LIP modularizes the preprocessing of longitudinal irregular EMR data, offering tools with a low level of encapsulation. Compared to other pipelines, EMR-LIP categorizes variables in a more granular manner, designing specific preprocessing techniques for each type. To demonstrate its versatility, EMR-LIP was applied in an empirical study to two public EMR databases, MIMIC-IV and eICU-CRD. Data processed with EMR-LIP was then used to test several renowned deep learning models on a range of commonly used benchmark tasks.
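The idea of type-specific preprocessing on a regular time grid can be illustrated with a minimal sketch. This is not EMR-LIP's API: the function name `resample_stay`, the variable names, the category-to-rule mapping, and the toy values are all assumptions made for illustration, and only standard pandas calls are used. It resamples irregular observations into fixed intervals, aggregating and filling each variable according to an assumed category (a vital sign is averaged and carried forward; an output volume is summed and treated as zero when absent).

```python
import pandas as pd

# Hypothetical irregular observations for a single stay (toy values, not from MIMIC-IV or eICU-CRD).
events = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-01-01 00:10",
                "2024-01-01 00:40",
                "2024-01-01 02:15",
                "2024-01-01 02:20",
                "2024-01-01 03:05",
            ]
        ),
        "variable": ["heart_rate", "heart_rate", "urine_output", "heart_rate", "urine_output"],
        "value": [80.0, 90.0, 50.0, 100.0, 30.0],
    }
)

# Assumed per-variable rules: how to aggregate within an interval and how to fill empty intervals.
RULES = {
    "heart_rate": {"agg": "mean", "fill": "ffill"},   # vital sign: average, then carry forward
    "urine_output": {"agg": "sum", "fill": "zero"},   # output volume: total, absent means zero
}


def resample_stay(events, interval=pd.Timedelta(hours=1)):
    """Put irregular observations on a regular grid, one column per variable."""
    columns = {}
    for name, rule in RULES.items():
        series = events.loc[events["variable"] == name].set_index("time")["value"]
        columns[name] = series.resample(interval).agg(rule["agg"])
    # Align all variables on the union of their interval start times.
    grid = pd.concat(columns, axis=1).sort_index()
    for name, rule in RULES.items():
        if rule["fill"] == "ffill":
            grid[name] = grid[name].ffill()
        else:
            grid[name] = grid[name].fillna(0.0)
    return grid


grid = resample_stay(events)
print(grid)
```

Changing `interval` in this sketch changes only the grid spacing, which is one simple way to compare model behavior across resampling intervals.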

Results

In both the MIMIC-IV and eICU-CRD databases, models based on EMR-LIP showed superior baseline performance compared to previous studies. Interestingly, using data preprocessed by EMR-LIP, traditional models such as LSTM and GRU outperformed more complex models, achieving an AUROC of up to 0.94 for in-hospital death prediction. Additionally, models based on EMR-LIP showed stable performance across various resampling intervals and exhibited better fairness in performance across different ethnic groups.
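For readers unfamiliar with the metric, the sketch below shows one way AUROC can be computed and compared across subgroups. The abstract does not state how fairness was measured, so the per-group comparison and the gap statistic are assumptions, as are the helper names `auroc` and `auroc_by_group` and the toy labels and scores. AUROC is computed here directly from its pairwise definition: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def auroc(labels, scores):
    """AUROC from its pairwise definition (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately for each subgroup label."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auroc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Toy data: 1 = in-hospital death, 0 = survival; groups "A" and "B" are hypothetical.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.6, 0.8, 0.3, 0.5, 0.4]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = auroc_by_group(labels, scores, groups)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

A smaller gap between the best and worst subgroup indicates more uniform discrimination across groups under this particular, assumed notion of fairness.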

Conclusion

EMR-LIP streamlines the preprocessing of irregular longitudinal EMR data, offering an end-to-end solution for model-ready data creation, and has been open-sourced for collaborative refinement by the research community.

Source journal
Computer Methods and Programs in Biomedicine
Category: Engineering and Technology, Engineering: Biomedical
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.