A case is not a case is not a case - challenges and solutions in determining urolithiasis caseloads using the digital infrastructure of a clinical data warehouse

medRxiv - Health Informatics Pub Date : 2024-09-18 DOI:10.1101/2024.09.13.24313333

Martin Schoenthaler, Noah Hempen, Maria Weymann, Maximilian Ferry von Bargen, Maximilian Glienke, Antonia Elsaesser, Max Behrens, Harald Binder, Nadine Binder


Abstract

Background: To provide more evidence in urolithiasis research, we have established the German Nationwide Register for RECurrent URolithiasis (RECUR) using local clinical data warehouses (CDWH). For RECUR and other registers relying on digitalized clinical data, it is crucial to ensure the data's reliability for answering scientific questions. In this work, we aim to compare the results of different CDWH-based queries on urolithiasis cases with manual case extraction from the primary source.

Methods: Sources for data extraction included the Medical Center University of Freiburg (MCUF) hospital information system (HIS), MCUF performance data (a clinical data set with merged data from patients, including data from various time points throughout their treatment), and MCUF reimbursement data. We extracted data on urolithiasis caseloads algorithmically (from performance and reimbursement data) and compared them to a reference group compiled from manually extracted data from the local HIS and algorithmically extracted data.

Results: Algorithmic extraction based on performance data resulted in correct and complete case identification as compared to the reference group. The case numbers from manual extraction from HIS data and algorithmic extraction from reimbursement data differed by 14% and 12%, respectively. The reasons for deviations in HIS data included human errors and a lack of data availability from different wards. Deviations in reimbursement data arose primarily from the merging of cases in the context of reimbursement mechanisms. As the CDWH at MCUF is part of the German Medical Informatics Initiative (MII), the results can be transferred to other medical centers with a similar CDWH structure.

Conclusions: The current study provides firm evidence of the importance of clearly defining a study's target variable, e.g., urolithiasis cases, and of a thorough understanding of the data sources and modes used to extract the target data. Our work clearly shows that, depending on the data source, a case is not a case is not a case.
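The effect of case merging on a caseload count can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy stays are hypothetical and not study data, the function names are invented, and the 30-day readmission window is a placeholder rather than the actual reimbursement rule, which the abstract does not specify.

```python
from datetime import date

# Hypothetical toy stays: (patient_id, admission, discharge). Not study data.
STAYS = [
    ("p1", date(2023, 1, 2), date(2023, 1, 4)),
    ("p1", date(2023, 1, 10), date(2023, 1, 12)),  # readmitted 6 days later
    ("p2", date(2023, 2, 1), date(2023, 2, 3)),
    ("p3", date(2023, 3, 5), date(2023, 3, 6)),
    ("p3", date(2023, 6, 1), date(2023, 6, 2)),    # readmitted months later
]


def count_stays(stays):
    """Count every hospital stay as its own case."""
    return len(stays)


def count_merged_cases(stays, window_days=30):
    """Count cases after merging readmissions into the preceding case.

    A stay is merged when the same patient is admitted within
    `window_days` of the previous discharge. The window is an assumed
    placeholder, not a real reimbursement rule.
    """
    cases = 0
    last_discharge = {}
    # Process each patient's stays in chronological order.
    for pid, admission, discharge in sorted(stays, key=lambda s: (s[0], s[1])):
        previous = last_discharge.get(pid)
        if previous is None or (admission - previous).days > window_days:
            cases += 1  # a new case starts
            last_discharge[pid] = discharge
        else:
            # Merged stay: extend the running case to the latest discharge.
            last_discharge[pid] = max(previous, discharge)
    return cases


def relative_deviation(count, reference):
    """Signed deviation of `count` from `reference`, as a fraction."""
    return (count - reference) / reference


if __name__ == "__main__":
    reference = count_stays(STAYS)
    merged = count_merged_cases(STAYS)
    print(reference, merged, f"{relative_deviation(merged, reference):.0%}")
```

With these toy stays, the same underlying records give different caseloads depending on whether a case means a single stay or a merged billing episode, which is why the target variable has to be defined before any query is written.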
