Evaluating the quality of prostate cancer diagnosis recording in CPRD GOLD and CPRD Aurum primary care databases for observational research: A study using linked English electronic health records
Gayasha Somathilake, Elizabeth Ford, Jo Armes, Sotiris Moschoyiannis, Michelle Collins, Patrick Francsics, Agnieszka Lemanska
Cancer Epidemiology, volume 94, Article 102715. Published 2024-11-30. DOI: 10.1016/j.canep.2024.102715. https://www.sciencedirect.com/science/article/pii/S1877782124001942
Abstract
Background
Primary care data in the UK are widely used for cancer research, but the reliability of recording key events like diagnoses remains uncertain. Although data linkage can improve reliability, its costs, time requirements, and sample size constraints may discourage its use. We evaluated accuracy, completeness, and date concordance of prostate cancer (PCa) diagnosis recording in Clinical Practice Research Datalink (CPRD) GOLD and Aurum compared to linked Cancer Registry (CR) and Hospital Episode Statistics (HES) Admitted Patient Care (APC) in England.
Methods
The analysis covered incident PCa diagnoses (2000–2016) in males aged ≥46 at diagnosis who remained registered with their General Practitioner (GP) by age 65 and were recorded in at least one data source. Accuracy was the proportion of diagnoses recorded in GOLD or Aurum with a corresponding record in CR or HES. Completeness was the proportion of CR or HES diagnoses with a corresponding record in GOLD or Aurum.
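The two measures defined above can be sketched in a few lines of Python. This is a minimal illustration, assuming each data source has already been reduced to a set of patient identifiers with an incident PCa diagnosis. The function names and the toy identifiers are hypothetical and are not taken from the study.

```python
def accuracy(primary_care: set, reference: set) -> float:
    """Proportion of primary care diagnoses that also appear in the reference source."""
    if not primary_care:
        raise ValueError("no primary care diagnoses to evaluate")
    return len(primary_care & reference) / len(primary_care)


def completeness(primary_care: set, reference: set) -> float:
    """Proportion of reference diagnoses that also appear in primary care."""
    if not reference:
        raise ValueError("no reference diagnoses to evaluate")
    return len(primary_care & reference) / len(reference)


# Toy identifiers (hypothetical, not study data).
gold = {"p1", "p2", "p3", "p4"}          # diagnoses recorded in primary care
cr = {"p2", "p3", "p4", "p5", "p6"}      # diagnoses recorded in the reference source

print(accuracy(gold, cr))      # 3 of the 4 primary care diagnoses are in the reference
print(completeness(gold, cr))  # 3 of the 5 reference diagnoses are in primary care
```

Both measures share the same numerator (diagnoses found in both sources) and differ only in the denominator, which is why a source can be fairly accurate yet much less complete.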
Results
The final cohorts for comparisons included 29,500 records for GOLD and 26,475 for Aurum. Compared to CR, GOLD was 86% accurate and 65% complete, while Aurum was 87% accurate and 77% complete. Compared to HES, GOLD was 76% accurate and 60% complete, and Aurum was 79% accurate and 70% complete. Concordance in diagnosis dates improved over time in both GOLD and Aurum: 93% of diagnoses were recorded within a year of the CR date, and 66% (GOLD) and 71% (Aurum) within a year of the HES date. Primary care recording lagged CR by 2–3 weeks, whereas most diagnoses appeared at least 3 months earlier in primary care than in HES.
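The date comparison can be illustrated with a short Python sketch. It assumes each matched diagnosis is a pair of dates (primary care, reference), treats "within a year" as an absolute difference of at most 365 days, and uses made-up dates. These choices and the function names are assumptions for illustration, not the study's stated method.

```python
from datetime import date


def lag_days(primary_care_date: date, reference_date: date) -> int:
    """Days by which the primary care record trails the reference record.

    Positive means primary care recorded the diagnosis later; negative means earlier.
    """
    return (primary_care_date - reference_date).days


def share_within(lags: list, window_days: int = 365) -> float:
    """Proportion of matched diagnoses whose dates differ by at most window_days."""
    if not lags:
        raise ValueError("no matched diagnoses to compare")
    return sum(abs(d) <= window_days for d in lags) / len(lags)


# Toy matched pairs (hypothetical, not study data): (primary care date, reference date).
pairs = [
    (date(2010, 3, 20), date(2010, 3, 1)),  # recorded later in primary care
    (date(2012, 1, 5), date(2012, 6, 1)),   # recorded earlier in primary care
    (date(2015, 9, 1), date(2013, 9, 1)),   # dates more than a year apart
]

lags = [lag_days(pc, ref) for pc, ref in pairs]
print(lags)
print(share_within(lags))
```

Keeping the sign of the lag separates the two patterns reported here: a positive lag corresponds to delayed recording in primary care, and a negative lag to a diagnosis appearing in primary care first.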
Conclusions
Aurum demonstrated better accuracy and completeness for PCa diagnosis recording than GOLD. However, linkage to HES or CR is recommended for improved case capture. Researchers should address the limitations of each data source to ensure research validity.
About the journal:
Cancer Epidemiology is dedicated to increasing understanding about cancer causes, prevention and control. The scope of the journal embraces all aspects of cancer epidemiology including:
• Descriptive epidemiology
• Studies of risk factors for disease initiation, development and prognosis
• Screening and early detection
• Prevention and control
• Methodological issues
The journal publishes original research articles (full length and short reports), systematic reviews and meta-analyses, editorials, commentaries and letters to the editor commenting on previously published research.