Stefan Förstel, Markus Förstel, Markus Gallistl, Dario Zanca, Bjoern M. Eskofier, Eva M. Rothgang
Data quality in hospital information systems: Lessons learned from analyzing 30 years of patient data in a regional German hospital
International Journal of Medical Informatics, volume 192, Article 105636. Published 2024-09-24. Impact factor: 3.7 (JCR Q2, Computer Science, Information Systems).
DOI: 10.1016/j.ijmedinf.2024.105636
https://www.sciencedirect.com/science/article/pii/S1386505624002995
Citation count: 0
Abstract
Background: The integration of Hospital Information Systems (HIS) into healthcare delivery has significantly enhanced patient care and operational efficiency. Nonetheless, the rapid acceleration of digital transformation has led to a substantial increase in the volume of data managed by these systems. This emphasizes the need for robust mechanisms for data management and quality assurance.
Objective: This study addresses data quality issues related to patient identifiers within the Hospital Information System (HIS) of a regional German hospital, focusing on improving the accuracy and consistency of these administrative data entries.
Methods: Employing a combination of data analysis and expert interviews, this study reviews and programmatically cleanses a dataset with over 2,000,000 patient data entries extracted from the HIS. The areas of investigation are patient admissions, discharges, and geographical data.
Results: The analysis revealed that roughly 25% of the dataset was rendered unusable by errors and inconsistencies. By implementing a thorough data cleansing process, we significantly enhanced the utility of the dataset. In doing so, we identified the primary issues affecting data quality, including ambiguities among similar variables and a gap between the intended and actual use of the system.
Conclusion: The findings highlight the critical importance of enhancing data quality in healthcare information systems. This study shows the necessity of a careful review of data extracted from the HIS before it can be reliably utilized for machine learning tasks, thereby rendering the data more usable for both clinical and analytical purposes.
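The kind of programmatic check described under Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not describe the HIS schema or the cleansing rules, so the field names (admission, discharge, postal_code), the two rules, and the sample records below are all assumptions made for illustration. The only external fact relied on is that German postal codes have five digits.

```python
from datetime import datetime
import re

# Assumption: German postal codes are exactly five digits.
GERMAN_POSTAL_CODE = re.compile(r"\d{5}")


def parse_timestamp(value):
    """Return a datetime for an ISO 8601 string, or None if missing or invalid."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def find_issues(record):
    """List the data quality problems found in one hypothetical patient entry."""
    issues = []
    admitted = parse_timestamp(record.get("admission"))
    discharged = parse_timestamp(record.get("discharge"))

    if admitted is None:
        issues.append("missing or unparseable admission time")
    elif discharged is not None and discharged < admitted:
        issues.append("discharge precedes admission")

    postal_code = (record.get("postal_code") or "").strip()
    if not GERMAN_POSTAL_CODE.fullmatch(postal_code):
        issues.append("postal code is not five digits")

    return issues


# Made-up sample entries, not data from the study.
records = [
    {"admission": "2020-01-05T10:00", "discharge": "2020-01-07T09:30", "postal_code": "91052"},
    {"admission": "2020-01-05T10:00", "discharge": "2020-01-03T09:30", "postal_code": "91052"},
    {"admission": "", "discharge": None, "postal_code": "9105"},
]

flagged = [r for r in records if find_issues(r)]
share_flagged = len(flagged) / len(records)
```

Returning a list of named issues per entry, rather than a single valid/invalid flag, makes it possible to count which problem types dominate, which is the kind of breakdown the Results paragraph summarizes.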
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.