lab2clean: a novel algorithm for automated cleaning of retrospective clinical laboratory results data for secondary uses.

Impact factor: 3.3 | Zone 3, Medicine | JCR Q2, Medical Informatics | BMC Medical Informatics and Decision Making | Pub Date: 2024-09-03 | DOI: 10.1186/s12911-024-02652-7
Ahmed Medhat Zayed, Arne Janssens, Pavlos Mamouris, Nicolas Delvaux
Citations: 0

Abstract


Background: The integrity of clinical research and machine learning models in healthcare heavily relies on the quality of underlying clinical laboratory data. However, the preprocessing of this data to ensure its reliability and accuracy remains a significant challenge due to variations in data recording and reporting standards.

Methods: We developed lab2clean, a novel algorithm aimed at automating and standardizing the cleaning of retrospective clinical laboratory results data. lab2clean was implemented as two R functions specifically designed to enhance data conformance and plausibility by standardizing result formats and validating result values. The functionality and performance of the algorithm were evaluated using two extensive electronic medical record (EMR) databases, encompassing various clinical settings.

Results: lab2clean effectively reduced the variability of laboratory results and identified potentially erroneous records. Upon deployment, it demonstrated effective and fast standardization and validation of substantial laboratory data records. The evaluation highlighted significant improvements in the conformance and plausibility of lab results, confirming the algorithm's efficacy in handling large-scale data sets.

Conclusions: lab2clean addresses the challenge of preprocessing and cleaning clinical laboratory data, a critical step in ensuring high-quality data for research outcomes. It offers researchers a straightforward, efficient tool for improving the quality of clinical laboratory data, a major portion of healthcare data, thereby enhancing the reliability and reproducibility of clinical research outcomes and clinical machine learning models. Future developments aim to broaden its functionality and accessibility, solidifying its vital role in healthcare data management.
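
The two steps named in the Methods (standardizing result formats, then validating result values) can be illustrated with a minimal sketch. This is not the lab2clean implementation, which the abstract describes as two R functions; the function names, the regular expression, and the plausibility limits below are all hypothetical and chosen only to show the general idea in Python.

```python
import re
from typing import Optional, Tuple

# Hypothetical plausibility limits keyed by test name. The values are
# illustrative assumptions, not taken from lab2clean or any reference table.
PLAUSIBLE_RANGE = {"potassium_mmol_l": (0.5, 15.0)}

# Optional comparator (<, >, <=, >=) followed by a number that may use
# either a decimal point or a decimal comma.
_NUMERIC = re.compile(r"^\s*([<>]=?)?\s*([+-]?\d+(?:[.,]\d+)?)\s*$")


def standardize_result(raw: str) -> Tuple[Optional[str], Optional[float]]:
    """Split a raw result string into (comparator, numeric value).

    Returns (None, None) when the string is not a recognizable numeric result.
    """
    match = _NUMERIC.match(raw)
    if match is None:
        return None, None
    comparator, number = match.groups()
    # Treat a decimal comma as a decimal point before converting.
    return comparator, float(number.replace(",", "."))


def is_plausible(test: str, value: float) -> bool:
    """Check a standardized value against the assumed limits for its test."""
    low, high = PLAUSIBLE_RANGE[test]
    return low <= value <= high
```

Under these assumptions, the raw strings "4,2" and " 4.2 " both standardize to the same numeric value, and a value far outside the assumed limits is flagged as implausible instead of being silently kept.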

Source journal
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.