Macey L Murray, Laura Sato, Jaspal Panesar, Sharon B Love, Rebecca Lee, James R Carpenter, Marion Mafham, Mahesh KB Parmar, Heather Pinches, Matthew R Sydes
Demonstrating the data integrity of routinely collected healthcare systems data for clinical trials (DEDICaTe): A proof-of-concept study
Health Informatics Journal. Published 2024-09-19. DOI: 10.1177/14604582241276969 (https://doi.org/10.1177/14604582241276969)
Abstract
Introduction/aims: Healthcare systems data (also known as real-world or routinely collected health data) could transform the conduct of clinical trials. Demonstrating integrity and provenance of these data is critical for clinical trials, to enable their use where appropriate and avoid duplication using scarce trial resources. Building on previous work, this proof-of-concept study used a data intelligence tool, the “Central Metastore,” to provide metadata and lineage information of nationally held data.

Methods: The feasibility of NHS England’s Central Metastore to capture detailed records of the origins, processes, and methods that produce four datasets was assessed. These were England’s Hospital Episode Statistics (Admitted Patient Care, Outpatients, Critical Care) and the Civil Registration of Deaths (England and Wales). The process comprised: information gathering; information ingestion using the tool; and auto-generation of lineage diagrams/content to show data integrity. A guidance document to standardise this process was developed.

Results/Discussion: The tool can ingest, store and display data provenance in sufficient detail to support trust and transparency in using these datasets for trials. The slowest step was information gathering from multiple sources, so consistency in record-keeping is essential.
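The three-stage process described in the Methods (gather information, ingest it, auto-generate a lineage diagram) can be illustrated with a minimal sketch. This is not the Central Metastore or its interface, which the abstract does not describe. The `LineageStep` class, the `lineage_to_dot` function, and the stage names are all hypothetical, chosen only to show how recorded provenance steps might be turned into a diagram description in Graphviz DOT format.

```python
from dataclasses import dataclass, field


@dataclass
class LineageStep:
    """One recorded stage in a dataset's history (hypothetical structure)."""
    name: str
    method: str                                  # how the step was carried out
    inputs: list = field(default_factory=list)   # names of upstream steps


def lineage_to_dot(dataset, steps):
    """Return a Graphviz DOT string: one node per step, one edge per input."""
    known = {s.name for s in steps}
    lines = ['digraph "%s" {' % dataset]
    for step in steps:
        lines.append('  "%s" [label="%s: %s"];' % (step.name, step.name, step.method))
        for src in step.inputs:
            if src not in known:
                # An edge to an unrecorded step would be a gap in provenance.
                raise ValueError("unknown upstream step: %s" % src)
            lines.append('  "%s" -> "%s";' % (src, step.name))
    lines.append("}")
    return "\n".join(lines)


# Placeholder stages, NOT the actual processing pipeline of any NHS dataset.
steps = [
    LineageStep("source_records", "collected at point of care"),
    LineageStep("national_submission", "submitted to the national body",
                ["source_records"]),
    LineageStep("curated_dataset", "cleaned and derived fields added",
                ["national_submission"]),
]
dot = lineage_to_dot("example_dataset", steps)
print(dot)
```

Rejecting edges that point to unrecorded steps reflects the abstract's point that consistent record-keeping matters: a lineage diagram is only as complete as the information gathered for each stage.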
About the journal:
Health Informatics Journal is an international peer-reviewed journal. All papers submitted to Health Informatics Journal are subject to peer review by members of a carefully appointed editorial board. The journal operates a conventional single-blind reviewing policy in which the reviewer’s name is always concealed from the submitting author.