Differences in recording of cancer diagnosis between datasets in England: A population-based study of linked cancer registration, hospital, and primary care data
Emma Whitfield , Becky White , Matthew E. Barclay , Meena Rafiq , Cristina Renzi , Brian Rous , Spiros Denaxas , Georgios Lyratzopoulos
Cancer Epidemiology, volume 94, Article 102703. Published 2024-11-29. DOI: 10.1016/j.canep.2024.102703
https://www.sciencedirect.com/science/article/pii/S1877782124001826
Citation count: 0
Abstract
Background
Differences in the recording of cancer case status and diagnosis date have been observed between cancer registry (CR) – the reference standard – and electronic health records (EHRs); such differences may affect estimates of cancer risk or misclassify diagnostic pathways. This study aims to quantify differences in recording of case status and date of cancer diagnosis between cancer registry and EHRs.
Methods
Linked primary care (Clinical Practice Research Datalink (CPRD)), secondary care (Hospital Episode Statistics (HES)) and national Cancer Registry (CR) data were used to identify 14,301 patients with a recorded diagnosis of brain, colon, lung, ovarian, or pancreatic cancer between 1999 and 2018. Agreement in case status between datasets, differences in recorded diagnosis dates, and change in agreement over time were investigated for each cancer site.
Results
Between 84 % (ovary) and 92 % (colon) of diagnoses in the cancer registry were also recorded in combined CPRD-HES data. Agreement with the cancer registry was slightly lower in HES (78 % (ovary) to 86 % (colon)) and CPRD (61 % (ovary, pancreas) to 72 % (brain)). The proportion of CPRD-HES diagnoses confirmed in CR varied by cancer site (50 % (brain) to 86 % (lung)). Agreement between CR and HES was relatively stable within cancer sites over time. Concordance between CR and CPRD was more heterogeneous between cancer sites and over time. The best agreement in diagnosis date was observed between CR and HES (median difference of 0 or 1 day across all cancer sites).
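The two directions of agreement reported above (CR diagnoses also found in EHR data, and EHR diagnoses confirmed in CR) and the median difference in diagnosis date can be illustrated with a minimal sketch. The patient identifiers, dates, and function names below are hypothetical toy values chosen for illustration; they are not the study's data or its actual analysis code.

```python
from datetime import date
from statistics import median

# Hypothetical toy records (assumption): patient id -> recorded diagnosis date.
cr = {1: date(2010, 3, 1), 2: date(2011, 6, 15),
      3: date(2012, 1, 10), 4: date(2013, 9, 5)}
ehr = {1: date(2010, 3, 1), 2: date(2011, 6, 16),
       3: date(2012, 1, 8), 5: date(2014, 2, 20), 6: date(2015, 7, 1)}


def agreement(reference, comparison):
    """Share of patients in `reference` who also appear in `comparison`."""
    if not reference:
        raise ValueError("reference dataset is empty")
    return len(reference.keys() & comparison.keys()) / len(reference)


def median_date_difference(a, b):
    """Median of (date in b minus date in a), in days, for patients in both."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no patients recorded in both datasets")
    return median((b[p] - a[p]).days for p in shared)


cr_in_ehr = agreement(cr, ehr)   # CR diagnoses also recorded in EHR data
ehr_in_cr = agreement(ehr, cr)   # EHR diagnoses confirmed in CR
lag_days = median_date_difference(cr, ehr)

print(f"CR diagnoses also in EHR: {cr_in_ehr:.0%}")
print(f"EHR diagnoses confirmed in CR: {ehr_in_cr:.0%}")
print(f"Median date difference (EHR minus CR): {lag_days} days")
```

The two proportions use different denominators, which is why a dataset pair can show high agreement in one direction and lower agreement in the other, as the brain cancer figures above suggest.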
Conclusion
Agreement between CR and EHR data is heterogeneous across cancer sites. Concordance does not appear to have improved over time. Combined data from primary and secondary care may be sufficient to approximate case status in CR in some circumstances, but the date we consider to represent the diagnosis may impact study outcomes.
About the journal:
Cancer Epidemiology is dedicated to increasing understanding about cancer causes, prevention and control. The scope of the journal embraces all aspects of cancer epidemiology including:
• Descriptive epidemiology
• Studies of risk factors for disease initiation, development and prognosis
• Screening and early detection
• Prevention and control
• Methodological issues
The journal publishes original research articles (full length and short reports), systematic reviews and meta-analyses, editorials, commentaries and letters to the editor commenting on previously published research.